Python is an excellent language for building simple proof-of-concept (PoC) projects. We will not list all of Python’s advantages here, but one of the most valuable is that it is cross-platform. This is especially useful for embedded system applications: there is no need to wait while a compiler builds binaries and no need to deploy applications to the board, and the same code runs on a desktop PC as well as on Linux-based boards such as the Raspberry Pi.

However, this approach has its limits. It does not work for some hardware-related projects: a desktop PC, for example, doesn’t have an SPI bus. ARM-based boards are also slower than a PC, so an algorithm that runs perfectly well on a desktop may be too slow on an embedded system.

Performance was critical in one of our latest projects (watch the Amazon Alexa Virtual Device demo here). We used a DragonBoard 410c as one of the target platforms and were pleasantly surprised by its performance. Even without formal benchmarking, OS package installation, application launch, and general responsiveness felt significantly faster than on a Raspberry Pi 3. Let’s have a look at the board specs:

| Hardware | Raspberry Pi 3 | DragonBoard 410c |
|----------|----------------|------------------|
| CPU | BCM2837, Quad-core ARM® A53 (v8), 1.2 GHz | Snapdragon 410E, Quad-core ARM® A53 (v8), 1.2 GHz |
| Memory | 1 GB | 1 GB |

The specs look almost identical, so they do not explain why the DragonBoard feels faster. We therefore ran a benchmark to check whether it really is.
We measured Python 3 performance, since that is the version we care about most. We used the Python Performance Benchmark Suite (http://pyperformance.readthedocs.io/), which states that it focuses on real-world benchmarks rather than synthetic ones, using whole applications where possible.
We used our favourite OS for embedded platforms, Ubuntu Core 16 (https://www.ubuntu.com/core). It is consistent across devices, provides transactional updates and, most importantly for this test, supports both boards. We installed Ubuntu Core on an SD card and ran the test in the ‘classic’ environment.

The following command installs pyperformance:

pip3 install pyperformance

We ran it with:
echo pyperformance run --python=python3 -o result.json | nohup sudo classic &

We used this slightly tricky command so that the test runs in the background and does not depend on an active SSH connection: the run takes a long time, and a dropped connection would otherwise stop it.
The results are visualized in the graph below.
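
A comparison like this comes down to per-benchmark ratios between the two boards. The sketch below shows one way to compute them, assuming the mean timings have already been extracted from each board’s result.json into plain dictionaries. The function names `speedups` and `geometric_mean`, the benchmark names, and the numbers are hypothetical placeholders, not our measurements.

```python
import math


def speedups(baseline, candidate):
    """Return {benchmark: baseline_time / candidate_time} for shared benchmarks."""
    common = sorted(set(baseline) & set(candidate))
    return {name: baseline[name] / candidate[name] for name in common}


def geometric_mean(values):
    """Geometric mean, the usual way to average ratios."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Placeholder timings in seconds (illustrative only, not measured results).
rpi3 = {"bench_a": 2.0, "bench_b": 9.0}
dragonboard = {"bench_a": 1.0, "bench_b": 3.0}

ratios = speedups(rpi3, dragonboard)
overall = geometric_mean(ratios.values())
print(ratios, round(overall, 2))
```

A ratio above 1 means the second board finished that benchmark faster. The geometric mean is used for the overall figure because an arithmetic mean of ratios is skewed by a single outlier.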

The results are really surprising: the DragonBoard is almost twice as fast as the Raspberry Pi 3! Now we understand why we felt that performance boost. The likely explanation is that the official Raspberry Pi 3 kernel uses the 32-bit armhf (ARMv7) architecture, while the DragonBoard 410c runs a 64-bit arm64 (ARMv8) kernel. Since Python uses 64-bit arithmetic a lot, this probably accounts for the difference.
To confirm the pyperformance result with native code, we also ran ‘sysbench’. We used four of its tests: cpu, memory, threads, and mutex. These tests are more synthetic, but they still let us compare the raw computing power of the boards. The results are:
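
Because the explanation rests on one board running a 32-bit system and the other a 64-bit one, it is worth checking what the Python interpreter itself reports on each board. The following is a minimal sketch using only the standard library; the helper names `pointer_bits` and `describe_build` are ours and purely illustrative.

```python
import platform
import struct
import sys


def pointer_bits():
    """Width of a native pointer in this Python build, in bits."""
    return struct.calcsize("P") * 8


def describe_build():
    """Collect the facts that distinguish a 32-bit from a 64-bit setup."""
    return {
        # Kernel-reported machine type, e.g. 'armv7l' or 'aarch64' on ARM Linux.
        "machine": platform.machine(),
        "pointer_bits": pointer_bits(),
        # sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit ones.
        "is_64bit": sys.maxsize > 2**32,
    }


print(describe_build())
```

Note that the machine type comes from the kernel, while the pointer width describes the interpreter build, so the two can disagree when a 32-bit userland runs on a 64-bit kernel.
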

| Test | Command | Raspberry Pi 3 | DragonBoard 410c |
|------|---------|----------------|------------------|
| cpu | sysbench --test=cpu run | 318.1229s | 12.6500s |
| memory | sysbench --test=memory --memory-total-size=2G run | 7.5322s | 3.0507s |
| threads | sysbench --test=threads run | 23.1469s | 9.1600s |
| mutex | sysbench --test=mutex run | 0.0283s | 0.0141s |

The CPU test runs about 25 times faster on the DragonBoard. That is because the CPU test uses explicit 64-bit integers, a scenario where a 64-bit OS has a really big advantage. The other tests are roughly twice as fast (2.0 to 2.5 times), which is close to the pyperformance result, even though the threads test should not be affected by the 64-bit core.
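
The ratios quoted here follow directly from the sysbench times in the table above. As a quick check, the arithmetic can be reproduced with a few lines of Python; the variable names are illustrative and the numbers are copied from the table.

```python
# Total times in seconds: (Raspberry Pi 3, DragonBoard 410c), from the sysbench table.
times = {
    "cpu": (318.1229, 12.6500),
    "memory": (7.5322, 3.0507),
    "threads": (23.1469, 9.1600),
    "mutex": (0.0283, 0.0141),
}

# Speedup = Raspberry Pi 3 time divided by DragonBoard time.
speedup = {test: rpi3 / dragonboard for test, (rpi3, dragonboard) in times.items()}

for test, factor in speedup.items():
    print(f"{test:8s} {factor:5.1f}x")
```

This gives roughly 25.1x for cpu and between 2.0x and 2.5x for memory, threads, and mutex.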

Of course, you can always develop your project in C or even pure assembly and get better performance. But development time is usually more expensive than the board. It is simply cheaper and more productive to use tools like Python, which help you build a better-quality product and deliver it to market faster. The performance cost of Python can often be offset by a more powerful board. We really hope that the official Raspberry Pi image will get a 64-bit OS. For now, though, the DragonBoard 410c is an excellent choice for Python application development.