gcc-6 linpack.c cpuidc.c -Wa,-march=armv8-a -lm -lrt -O3 -march=armv8-a -o linpackPi64
My first impression was that measured speeds were exceptionally inconsistent. I tried SUSE Linux Enterprise Server (same installation procedure), and that produced the same variable performance. The examples below are for the Linpack benchmark. The source code is essentially the same as Linpack-pc.c, which I submitted to Netlib 20 years ago. It was specifically designed to produce consistent speeds, unaffected by timer resolution. The program executes the same code ten times, using the same pass count, with the second five runs using slightly different memory addressing. The benchmark has previously run without this inconsistency, at 32 bits and 64 bits where appropriate, on PCs from DOS to Windows 10, on various Linux distros including OpenSuse 11.3, on Android devices, and on RPis at 32 bits.
Below is the first set of five Raspberry Pi 3 measurements, for 32 bit operation under the Raspbian operating system and for 64 bit operation under SUSE. The Raspbian results demonstrate constant MFLOPS performance, as expected, but the SUSE speeds appear to be far too variable. For comparison, I have run my Android version on a tablet with a slightly faster Cortex-A53 CPU, using both 32 bit and 64 bit compilations. The resultant average MFLOPS were 178 and 348 respectively.
RPi 3 Raspbian (32 bit)
  dgefa    dgesl    total   Mflops     unit    ratio
0.00374  0.00012  0.00386   177.81   0.0112    0.069
0.00374  0.00012  0.00386   177.83   0.0112    0.069
0.00374  0.00012  0.00386   177.81   0.0112    0.069
0.00374  0.00012  0.00386   177.81   0.0112    0.069
0.00374  0.00012  0.00386   177.81   0.0112    0.069
Average                     177.81

RPi 3 SUSE (64 bit)
  dgefa    dgesl    total   Mflops     unit    ratio
0.00340  0.00006  0.00346   198.33   0.0101    0.062
0.00363  0.00013  0.00376   182.80   0.0109    0.067
0.00191  0.00006  0.00197   348.54   0.0057    0.035
0.00186  0.00006  0.00192   357.37   0.0056    0.034
0.00407  0.00013  0.00420   163.36   0.0122    0.075
Average                     250.08
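The spread in the two result sets above can be put into numbers. The following Python sketch is not part of the benchmark; it simply takes the Mflops column from each set and reports the mean and the range. The function name summarise is made up for this illustration.

```python
from statistics import mean

# Mflops values copied from the two result sets above
raspbian_32bit = [177.81, 177.83, 177.81, 177.81, 177.81]
suse_64bit = [198.33, 182.80, 348.54, 357.37, 163.36]

def summarise(mflops):
    """Return (mean, minimum, maximum, max/min ratio) for a list of Mflops results."""
    lo, hi = min(mflops), max(mflops)
    return mean(mflops), lo, hi, hi / lo

for label, values in (("Raspbian 32 bit", raspbian_32bit),
                      ("SUSE 64 bit", suse_64bit)):
    avg, lo, hi, ratio = summarise(values)
    print(f"{label}: mean {avg:.2f}, min {lo:.2f}, max {hi:.2f}, max/min {ratio:.2f}")
```

The max/min ratio is close to 1 for Raspbian and a little over 2 for SUSE, which is roughly what would be expected if some runs executed at half the clock speed of others.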
CPU Utilisation - 25 one-second samples were taken with mpstat whilst the benchmark was running. All were essentially the same as the sample shown below, with a constant 100% indicated for the CPU core running the benchmark.
Temperature - This was monitored by running sensors under watch at 1 second intervals, so that changes appear at a fixed window position. It rose by no more than 4°C above the temperature shown below.
Power Supply - I have a meter that measures power supply voltage and current, indicating 5.14 to 5.15 volts and less than 0.5 amps.
CPU frequency - This varied, switching between 1200 MHz and 600 MHz whilst the program was running, no doubt the cause of the varying performance. The same switching occurred when the CPU was not running benchmarks.
Raspbian - On running the 32 bit benchmark under Raspbian, CPU frequency was a constant 1200 MHz, dropping to 600 MHz with nothing running. When the CPU overheats, its clock speed is reduced; the frequency in use can be identified with the command vcgencmd measure_clock arm. This command is not available under SUSE.
mpstat -P ALL 1 25
CPU     %usr    %nice     %sys  %iowait    %idle
all    26.38     0.00     0.25     0.00    73.37
0     100.00     0.00     0.00     0.00     0.00
1       0.00     0.00     1.00     0.00    99.00
2       0.00     0.00     0.00     0.00   100.00
3       4.95     0.00     0.00     0.00    95.05
--------------------------------------------------------
sensors - watch -n 1 sensors
bcm2835_thermal-virtual-0
Adapter: Virtual device
temp1: +47.2°C
--------------------------------------------------------
cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq
1200000    (each core sometimes reads 600000)
1200000
1200000
1200000
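For anyone wanting to log the frequency switching alongside a benchmark run, here is a minimal Python sketch. It is an illustration only, not something I used for the measurements above. It assumes the standard Linux cpufreq sysfs layout and reads scaling_cur_freq rather than cpuinfo_cur_freq, since the latter is usually readable by root only. It also reads scaling_governor, as the governor in use may help explain the switching. The function name read_cpufreq and its base parameter are made up for this example.

```python
import glob
import os

def read_cpufreq(base="/sys/devices/system/cpu"):
    """Return {cpu_name: (governor, frequency_in_kHz)} for each core with cpufreq.

    A value is None when its sysfs file is missing, unreadable or not numeric.
    """
    def read(path):
        try:
            with open(path) as f:
                return f.read().strip()
        except OSError:
            return None

    result = {}
    pattern = os.path.join(base, "cpu[0-9]*", "cpufreq")
    for cpufreq_dir in sorted(glob.glob(pattern)):
        cpu = os.path.basename(os.path.dirname(cpufreq_dir))
        governor = read(os.path.join(cpufreq_dir, "scaling_governor"))
        freq = read(os.path.join(cpufreq_dir, "scaling_cur_freq"))
        result[cpu] = (governor, int(freq) if freq and freq.isdigit() else None)
    return result

# One sample; prints an empty dict on systems without cpufreq support
print(read_cpufreq())
```

Calling read_cpufreq() in a loop with a short sleep, in a second terminal while the benchmark runs, would show whether slow passes coincide with 600000 readings.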
On experimenting with the watch sampling interval, I changed the command to, for example, watch -n 0.1 sensors. Repeated runs then showed consistent performance, with only occasional minor variation. The results of the ten sets of calculations are shown below.
I will upload the 64 bit benchmarks in due course, obtainable (free with no ads) via my web site in the Raspberry Pi pages:
http://www.roylongbottom.org.uk/
Is there anything that I have missed?
RPi 3 SUSE (64 bit) with watch -n 0.1 sensors

Times for array with leading dimension of 201
  dgefa    dgesl    total   Mflops     unit    ratio
0.00193  0.00006  0.00199   345.48   0.0058   0.0355
0.00192  0.00006  0.00198   346.67   0.0058   0.0354
0.00192  0.00006  0.00198   346.40   0.0058   0.0354
0.00192  0.00006  0.00198   346.57   0.0058   0.0354
0.00192  0.00006  0.00198   346.51   0.0058   0.0354
Average                     346.30

Times for array with leading dimension of 200
  dgefa    dgesl    total   Mflops     unit    ratio
0.00180  0.00006  0.00186   369.72   0.0054   0.0332
0.00179  0.00006  0.00186   370.15   0.0054   0.0331
0.00179  0.00006  0.00186   370.09   0.0054   0.0331
0.00179  0.00006  0.00185   370.78   0.0054   0.0331
0.00179  0.00006  0.00186   370.07   0.0054   0.0331
Average                     370.16
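The Mflops figures in listings like these can be cross-checked from the total time. The sketch below assumes the standard 100 x 100 Linpack problem (leading dimensions of 200 and 201 are what the classic linpack.c uses for n = 100) and the conventional operation count of 2n^3/3 + 2n^2. Both are assumptions about the build, and the function names are made up for this illustration.

```python
def linpack_ops(n=100):
    """Floating point operations conventionally credited to an n x n Linpack solve."""
    return 2.0 * n**3 / 3.0 + 2.0 * n**2

def mflops(total_seconds, n=100):
    """Convert a total (dgefa + dgesl) time in seconds into MFLOPS."""
    return linpack_ops(n) / total_seconds / 1.0e6

# Total times taken from the listings in this post
for total in (0.00386, 0.00198, 0.00186):
    print(f"total {total:.5f} s -> {mflops(total):.1f} Mflops")
```

The values computed this way land close to the Mflops column. The small differences come from the times being printed to only five decimal places.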