scruss
Posts: 5286
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON

Re: A Pi Pie Chart

Fri Oct 29, 2021 12:46 am

the things I do for you, Eric ...
  • 32 bit: 580 mA
  • 64 bit: 730 mA
The latter is much higher than I expected.
‘Remember the Golden Rule of Selling: “Do not resort to violence.”’ — McGlashan.
Pronouns: he/him

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Fri Oct 29, 2021 3:55 pm

scruss wrote:
Fri Oct 29, 2021 12:46 am
the things I do for you, Eric ...
  • 32 bit: 580 mA
  • 64 bit: 730 mA
The latter is much higher than I expected.
I find it amusing that 580 mA is the same value Tom's Hardware reports in

https://www.tomshardware.com/uk/reviews ... 2-w-review

for the Pi Zero 2 under load. I wonder if they ran the same Lorenz 96 stress test to get that number.

It seems Tom missed the bigger news that 64-bit mode draws

730/580=1.26

times the current (and so, at the same supply voltage, 1.26 times the power) while delivering

1207.32/705.102=1.71

times the performance of 32-bit mode.

Unless the additional power budget blows a fuse or causes throttling, it would seem 64-bit is greener. That's not surprising as it is the native mode of operation for the cores used in the Raspberry Pi Zero 2.
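For anyone who wants to check the arithmetic, here is a small Python sketch. It assumes the supply voltage is the same in both modes, so that the ratio of currents equals the ratio of power, and it takes the two performance figures quoted above at face value. The variable names are only for illustration.

```python
# Figures quoted above: supply current in mA and the benchmark rate in each mode.
current_ma = {"32-bit": 580, "64-bit": 730}
rate = {"32-bit": 705.102, "64-bit": 1207.32}

# Assumption: same supply voltage in both modes, so current ratio = power ratio.
power_ratio = current_ma["64-bit"] / current_ma["32-bit"]
speed_ratio = rate["64-bit"] / rate["32-bit"]

# Work done per unit of energy while the cores are busy, 64-bit relative to 32-bit.
efficiency_gain = speed_ratio / power_ratio

print(f"power ratio      {power_ratio:.2f}")
print(f"speed ratio      {speed_ratio:.2f}")
print(f"efficiency gain  {efficiency_gain:.2f}")
```

This prints 1.26, 1.71 and 1.36, so under sustained load the 64-bit run gets roughly a third more work out of each unit of energy. It says nothing about idle consumption, which is a separate measurement.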

scruss
Posts: 5286
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON

Re: A Pi Pie Chart

Fri Oct 29, 2021 5:07 pm

It's greener if you keep the cores busy all the time, but most of a computer's time is spent idling, so 32-bit's (very slightly) lower idle current might win out. Also, it's relatively easy to cause thermal throttling in 64-bit mode, which will throw performance right off.

BGA packages don't have the longest life under varying thermal load, and I'm wondering if the really complex micro-wiring/BGA package that the Zero 2 W has might be even more fragile. I'd recommend experimenting, but if the other official resellers are anything like our partner company in the US, they'll have sold out of Zero 2 Ws already.

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Thu Jan 06, 2022 1:26 am

scruss wrote:
Thu May 27, 2021 12:38 am
Oracle is, in fact, providing everyone with the equivalent of three Raspberry Pi computers for free
Oracle has never provided anyone anything for free. There's a catch you haven't found yet.
Weirdly, the four-core Ampere Altra instances on the Oracle cloud are still free. The advantage is all those Linux and ARM programming skills learned with a real Raspberry Pi computer are immediately transferable. I'm still looking for the catch. I wonder what it is.

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Thu Jan 06, 2022 3:50 am

ejolson wrote:
Wed May 26, 2021 6:51 pm
The free tier consists of
  • 4 Ampere Altra ARM cores with 24 GB RAM.
I ran the Pi pichart program and discovered this virtual machine is the equivalent of 107 original Raspberry Pi model B computers.

For reference the output is

Code:

$ ./pichart-openmp -t "4-core Altra" # Free Oracle A1 instance
pichart -- Raspberry Pi Performance OPENMP version 36

Prime Sieve          P=14630843 Workers=4 Sec=0.269277 Mops=3469.76
Merge Sort           N=16777216 Workers=8 Sec=0.432369 Mops=931.272
Fourier Transform    N=4194304 Workers=8 Sec=0.194219 Mflops=2375.53
Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.11563 Mflops=27858.1

The 4-core Altra has Raspberry Pi ratio=107.378
Making pie charts...done.
I installed gcc version 11.2 on the 4-core ARM Neoverse-N1 instance in the free-tier Oracle cloud and obtained

Code:

$ ./pichart-openmp -t "4-core Altra (gcc-11.2)"
pichart -- Raspberry Pi Performance OPENMP version 36

Prime Sieve          P=14630843 Workers=4 Sec=0.27152 Mops=3441.1
Merge Sort           N=16777216 Workers=8 Sec=0.435675 Mops=924.205
Fourier Transform    N=4194304 Workers=8 Sec=0.158582 Mflops=2909.37
Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.107189 Mflops=30051.8

The 4-core Altra (gcc-11.2) has Raspberry Pi ratio=114.664
Making pie charts...done.
The updated compiler led to slightly slower integer benchmarks but faster floating point, with the net result being about 7 percent faster (114.664/107.378=1.068).
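The comparison between the two runs can be made test by test. The sketch below simply divides the rates printed above; the dictionary names are mine. It also prints the geometric mean of the four speedups next to the change in the reported Pi ratio. Nothing in this thread says how the rating is computed, so treat any agreement between the two numbers as an observation rather than a description of the program.

```python
from math import prod

# Rates copied from the two runs above (Mops for the integer tests, Mflops for the rest).
before = {"Prime Sieve": 3469.76, "Merge Sort": 931.272,
          "Fourier Transform": 2375.53, "Lorenz 96": 27858.1}
after = {"Prime Sieve": 3441.1, "Merge Sort": 924.205,
         "Fourier Transform": 2909.37, "Lorenz 96": 30051.8}

# Speed of the gcc 11.2 build relative to the earlier build, test by test.
speedup = {name: after[name] / before[name] for name in before}
for name, ratio in speedup.items():
    print(f"{name:18s} {ratio:.3f}")

# Geometric mean of the four speedups, next to the change in the reported Pi ratio.
geo_mean = prod(speedup.values()) ** (1 / len(speedup))
print(f"geometric mean     {geo_mean:.3f}")
print(f"Pi ratio change    {114.664 / 107.378:.3f}")
```

The two integer tests come out just under 1, the Fourier transform at about 1.22 and Lorenz 96 at about 1.08. The last two lines both print 1.068, which is at least consistent with the rating behaving like a geometric mean of the four tests.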

I wonder whether the new ARM Neoverse-V1 based Graviton 3 processors coming to the Amazon cloud will be much faster.

https://www.nextplatform.com/2021/12/02 ... rver-chip/

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Mon Jan 17, 2022 9:00 pm

ejolson wrote:
Thu May 16, 2019 5:53 pm
For reference, the runs for the other notebook computers are

Code:

$ ./pichart-openmp -t "A6-9225"
pichart -- Raspberry Pi Performance OPENMP version 30

Prime Sieve          P=14630843 Workers=2 Sec=0.653126 Mops=1430.55
Merge Sort           N=16777216 Workers=4 Sec=1.47106 Mops=273.717
Fourier Transform    N=4194304 Workers=4 Sec=1.05959 Mflops=435.427
Lorenz 96            N=32768 K=16384 Workers=2 Sec=0.274606 Mflops=11730.4

The A6-9225 has Raspberry Pi ratio=33.3926
Making pie charts...done.
There is a link to the current source code from the first post of this thread if you would like to make your own Pi pie charts.
I just performed a memory upgrade on that A6-9225 notebook computer from 4GB to 8GB. The new memory is rated the same speed as before, but the new Pi ratio is only 29.7. Unfortunately, the compiler and Linux distribution changed as well, so it's not a fair comparison.

Code:

$ ./pichart-openmp -t "A6-9225 w/8GB"
pichart -- Raspberry Pi Performance OPENMP version 30

Prime Sieve          P=14630843 Workers=2 Sec=0.652379 Mops=1432.19
Merge Sort           N=16777216 Workers=4 Sec=1.47173 Mops=273.592
Fourier Transform    N=4194304 Workers=2 Sec=1.84656 Mflops=249.856
Lorenz 96            N=32768 K=16384 Workers=2 Sec=0.252105 Mflops=12777.3

The A6-9225 w/8GB has Raspberry Pi ratio=29.6962
Making pie charts...done.
I wish I had run the benchmark once just before changing the memory as a point of reference. As I've noticed a strange slowdown with the Fourier transform on some other machines around here, it's possible something has changed with the default security mitigations or optimization.

Anyway, in addition to comparing different hardware at the same point in time, checking performance of a particular machine over time is another important use of benchmarks such as the Pi pie chart program. Could it be the fan in the laptop is now clogged with dust?
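Tracking one machine over time is easy to automate if the program output is saved after each run. The following Python sketch parses the result lines and lists any test whose rate fell by more than a chosen fraction. The function names and the 5 percent tolerance are my own choices, not part of the Pi pie chart program, and the regular expression only assumes the output layout shown in this thread.

```python
import re

# One result line looks like:
#   Fourier Transform    N=4194304 Workers=4 Sec=1.05959 Mflops=435.427
LINE = re.compile(r"^(?P<name>\S.*?)\s{2,}.*\bM(?:fl)?ops=(?P<rate>[0-9.]+)\s*$")

def parse_pichart(text):
    """Return {test name: rate} for every result line in pichart output."""
    rates = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            rates[m.group("name")] = float(m.group("rate"))
    return rates

def regressions(old_text, new_text, tolerance=0.05):
    """Tests whose rate dropped by more than `tolerance` (a fraction) between runs."""
    old, new = parse_pichart(old_text), parse_pichart(new_text)
    return {name: new[name] / old[name]
            for name in old
            if name in new and new[name] < old[name] * (1 - tolerance)}

# Result lines from the two A6-9225 runs above.
before = """\
Prime Sieve          P=14630843 Workers=2 Sec=0.653126 Mops=1430.55
Merge Sort           N=16777216 Workers=4 Sec=1.47106 Mops=273.717
Fourier Transform    N=4194304 Workers=4 Sec=1.05959 Mflops=435.427
Lorenz 96            N=32768 K=16384 Workers=2 Sec=0.274606 Mflops=11730.4
"""
after = """\
Prime Sieve          P=14630843 Workers=2 Sec=0.652379 Mops=1432.19
Merge Sort           N=16777216 Workers=4 Sec=1.47173 Mops=273.592
Fourier Transform    N=4194304 Workers=2 Sec=1.84656 Mflops=249.856
Lorenz 96            N=32768 K=16384 Workers=2 Sec=0.252105 Mflops=12777.3
"""

for name, ratio in regressions(before, after).items():
    print(f"{name}: now runs at {ratio:.0%} of its earlier rate")
```

With the two A6-9225 runs this flags only the Fourier transform, at 57% of its earlier rate. The later run also reports Workers=2 rather than Workers=4 for that test, which may be worth checking before blaming the memory or the fan.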

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Mon Feb 21, 2022 7:05 am

ejolson wrote:
Mon Jan 17, 2022 9:00 pm
Anyway, in addition to comparing different hardware at the same point in time, checking performance of a particular machine over time is another important use of benchmarks such as the Pi pie chart program. Could it be the fan in the laptop is now clogged with dust?
Here is a revisit to the Pi 3B+ now running the 64-bit version of Void Linux. The CPU governor was set to performance, active cooling employed and no throttling observed.

Code:

$ ./pichart-openmp -t "Pi 3B+ (64-bit)"
pichart -- Raspberry Pi Performance OPENMP version 36

Prime Sieve          P=14630843 Workers=4 Sec=1.26867 Mops=736.462
Merge Sort           N=16777216 Workers=8 Sec=1.26322 Mops=318.752
Fourier Transform    N=4194304 Workers=4 Sec=2.41423 Mflops=191.106
Lorenz 96            N=32768 K=16384 Workers=4 Sec=2.21405 Mflops=1454.9

The Pi 3B+ (64-bit) has Raspberry Pi ratio=14.1929
Making pie charts...done.
$ gcc --version
gcc (GCC) 10.2.1 20201203
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
For comparison with older results, here is the resulting Pi pie chart:

Image

Everything is faster except the merge sort, which coincidentally sorts 32-bit integers. I wonder whether sorting 64-bit integers would tell a different story.

Note also the original 32-bit runs were performed using the MIT/Intel Cilk parallel processing extensions in gcc version 6.4 which were removed in later versions of the compiler. The present runs have been performed with gcc version 10.2.1 using equivalent OpenMP calls, which unfortunately exhibit slightly different performance characteristics.

lurk101
Posts: 2165
Joined: Mon Jan 27, 2020 2:35 pm
Location: Cumming, GA (US)

Re: A Pi Pie Chart

Tue Mar 15, 2022 9:16 pm

Pi Zero 2W - 32 bit

Code:

pi@zero2w:~/piechart/pichart-36 $ ./pichart-openmp -t 'Pi Zero 2'
pichart -- Raspberry Pi Performance OPENMP version 36

Prime Sieve          P=14630843 Workers=4 Sec=2.17265 Mops=430.04
Merge Sort           N=16777216 Workers=8 Sec=1.28051 Mops=314.448
Fourier Transform    N=4194304 Workers=8 Sec=2.6063 Mflops=177.022
Lorenz 96            N=32768 K=16384 Workers=4 Sec=4.35261 Mflops=740.068

The Pi Zero 2 has Raspberry Pi ratio=10.2443
Making pie charts...done.
piechart.png
64 bit result

Code:

pi@zero2w64:~/pichart-36 $ ./pichart-openmp -t "Pi Zero 2"
pichart -- Raspberry Pi Performance OPENMP version 36

Prime Sieve          P=14630843 Workers=4 Sec=1.80282 Mops=518.26
Merge Sort           N=16777216 Workers=8 Sec=1.61981 Mops=248.58
Fourier Transform    N=4194304 Workers=8 Sec=2.64514 Mflops=174.423
Lorenz 96            N=32768 K=16384 Workers=4 Sec=2.49811 Mflops=1289.46

The Pi Zero 2 has Raspberry Pi ratio=11.5851
Making pie charts...done.
History doesn’t repeat itself, it rarely even rhymes.

scruss
Posts: 5286
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON

Re: A Pi Pie Chart

Wed Mar 16, 2022 4:55 pm

lurk101 wrote:
Tue Mar 15, 2022 9:16 pm
Pi Zero 2W - 32 bit
Slightly different from what I got at launch, upthread. What compiler/optimization did you use?

lurk101
Posts: 2165
Joined: Mon Jan 27, 2020 2:35 pm
Location: Cumming, GA (US)

Re: A Pi Pie Chart

Wed Mar 16, 2022 10:38 pm

scruss wrote:
Wed Mar 16, 2022 4:55 pm
lurk101 wrote:
Tue Mar 15, 2022 9:16 pm
Pi Zero 2W - 32 bit
Slightly different from what I got at launch, upthread. What compiler/optimization did you use?
GCC 10.2. I hadn't seen your post!
Clang 11.0 does slightly better

Code:

pi@zero2w:~/pichart-36 $ ./pichart-openmp -t 'Pi Zero 2'
pichart -- Raspberry Pi Performance OPENMP version 36

Prime Sieve          P=14630843 Workers=4 Sec=2.10842 Mops=443.142
Merge Sort           N=16777216 Workers=4 Sec=1.38864 Mops=289.963
Fourier Transform    N=4194304 Workers=4 Sec=2.27309 Mflops=202.971
Lorenz 96            N=32768 K=16384 Workers=4 Sec=4.27075 Mflops=754.252

The Pi Zero 2 has Raspberry Pi ratio=10.516
Making pie charts...done.

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Thu Mar 17, 2022 5:17 pm

lurk101 wrote:
Wed Mar 16, 2022 10:38 pm
scruss wrote:
Wed Mar 16, 2022 4:55 pm
lurk101 wrote:
Tue Mar 15, 2022 9:16 pm
Pi Zero 2W - 32 bit
Slightly different from what I got at launch, upthread. What compiler/optimization did you use?
GCC 10.2. I hadn't seen your post!
Clang 11.0 does slightly better

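From what I can tell, compared with the 32-bit GCC 10.2 run, clang is faster for three of the benchmark problems (prime sieve, Fourier transform and Lorenz 96) and slower for the merge sort. The rating represented by the Pi ratio suggests clang is about 3 percent faster on average, which is small enough that repeated runs would be needed to verify it.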

lurk101
Posts: 2165
Joined: Mon Jan 27, 2020 2:35 pm
Location: Cumming, GA (US)

Re: A Pi Pie Chart

Thu Mar 17, 2022 5:47 pm

Very close. I was just comparing the final ratios. Do you have any idea why Lorenz is significantly faster in 64-bit mode, when the others are about the same?

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Thu Mar 17, 2022 6:49 pm

lurk101 wrote:
Thu Mar 17, 2022 5:47 pm
Very close. I was just comparing the final ratios. Do you have any idea why Lorenz is significantly faster in 64-bit mode, when the others are about the same?
Lorenz is the only code with vectorisable floating point. While I haven't looked at the assembler, maybe in 64-bit mode the additional registers available and short vector operations play a significant role.

It's also possible that more effort was spent tuning the optimiser as people actually use 64-bit ARM for technical computing whereas 32-bit not so much.

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Fri Apr 08, 2022 7:05 am

ejolson wrote:
Mon Feb 21, 2022 7:05 am
Here is a revisit to the Pi 3B+ now running the 64-bit version of Void Linux. The CPU governor was set to performance, active cooling employed and no throttling observed.
I discovered it's possible to change the CPU frequency on a Pi by selecting performance for the scaling_governor and then setting scaling_max_freq to different values. Here is a script that needs to be run as root and does this for the frequency range from 600 to 1400MHz.

Code:

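#!/bin/bash
# Run as root: sweep the maximum ARM clock from 600 to 1400 MHz in 100 MHz steps.
policy=/sys/devices/system/cpu/cpufreq/policy0
echo performance > "$policy/scaling_governor"
for i in {6..14}
do
    # scaling_max_freq is in kHz, so ${i}00000 means ${i}00 MHz
    echo "${i}00000" > "$policy/scaling_max_freq"
    sleep 1
    vcgencmd measure_clock arm    # show the clock actually in effect
    ./pichart-openmp -t "${i}00MHz" > "${i}00MHz.txt"
done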
The normalized performance versus frequency curves for a Pi 3B+ running in 64-bit mode are

Image

Note that each computational task scales differently depending on how much it relies on CPU speed versus how much it relies on memory bandwidth.

The curve for the Fourier transform is surprisingly lumpy and scales the worst. At the same time, the prime sieve enjoys near-linear scaling with CPU frequency, likely because it employs a huge number of bitwise operations to conserve memory.
Last edited by ejolson on Sat Apr 09, 2022 12:22 am, edited 2 times in total.

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Fri Apr 08, 2022 10:02 pm

ejolson wrote:
Fri Apr 08, 2022 7:05 am
Note that each computational task scales differently depending on how much it relies on CPU speed versus how much it relies on memory bandwidth.
Comparing similar graphs for different computers shows how the relative balance between CPU speed and memory bandwidth affects the four computational tasks performed by the Pi pie chart program. To this end, I ran the same set of tests on the Pi 4B in 32-bit mode and compared the results.

Image

The biggest difference between the two graphs is for the Lorenz 96 dynamical simulation. Since the Lorenz 96 runs faster in 64-bit mode, memory bandwidth may form a more noticeable bottleneck in that case. This is supported by the graph, which shows the Lorenz 96 curve for the Pi 3B+ running in 64-bit mode well below the corresponding curve for the Pi 4B.

If anyone could extend the Pi 4B to higher-frequency overclock settings, that would be interesting. Just post the output of the Pi chart program at each frequency (maybe best to redo the whole curve) and I'd be happy to graph the data.
Last edited by ejolson on Sat Apr 09, 2022 4:18 pm, edited 1 time in total.

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Sat Apr 09, 2022 8:07 am

ejolson wrote:
Sun Feb 07, 2021 5:15 am

Code:

$ ./pichart-openmp ; # Ryzen 7 Pro 1700 (8 cores)
pichart -- Raspberry Pi Performance OPENMP version 36

Prime Sieve          P=14630843 Workers=16 Sec=0.119876 Mops=7794.13
Merge Sort           N=16777216 Workers=32 Sec=0.171299 Mops=2350.59
Fourier Transform    N=4194304 Workers=16 Sec=0.153614 Mflops=3003.46
Lorenz 96            N=32768 K=16384 Workers=32 Sec=0.0595906 Mflops=54055.9

My Computer has Raspberry Pi ratio=207.369
Making pie charts...done.
Here is another Ryzen result, this time for a 6-core Pro 4650G APU.

Code:

$ ./pichart-openmp -t "Ryzen 5 Pro 4650G"
pichart -- Raspberry Pi Performance OPENMP version 36

Prime Sieve          P=14630843 Workers=12 Sec=0.109507 Mops=8532.16
Merge Sort           N=16777216 Workers=24 Sec=0.168878 Mops=2384.28
Fourier Transform    N=4194304 Workers=12 Sec=0.171241 Mflops=2694.29
Lorenz 96            N=32768 K=16384 Workers=24 Sec=0.0373754 Mflops=86185.7

The Ryzen 5 Pro 4650G has Raspberry Pi ratio=232.791
Making pie charts...done.
The 6-core second-generation processor is faster than the first-generation 8-core in all tasks except the Fourier transform. I wonder whether differences in memory speed are responsible for that.

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Wed Apr 13, 2022 8:44 pm

ejolson wrote:
Fri Apr 08, 2022 10:02 pm
If anyone could extend the Pi 4B to higher-frequency overclock settings, that would be interesting. Just post the output of the Pi chart program at each frequency (maybe best to redo the whole curve) and I'd be happy to graph the data.
Performance versus frequency results for the Pi 4B running in 64-bit mode were posted in

viewtopic.php?p=1993500#p1993500

The resulting graph looks like

Image

What's immediately noticeable is how erratic the Lorenz 96 curve appears. Although no throttling was reported by vcgencmd, it's possible there were still heat-related performance effects.

Could certain CPU frequencies mesh with the 500 MHz AXI bus better?

I wonder if the way the Lorenz 96 curve jumps up and down is repeatable. Maybe the variations are simply due to lucky or unlucky memory allocations.

Why does it do that?

jahboater
Posts: 8608
Joined: Wed Feb 04, 2015 6:38 pm
Location: Wonderful West Dorset

Re: A Pi Pie Chart

Thu Apr 14, 2022 9:23 am

ejolson wrote:
Wed Apr 13, 2022 8:44 pm
What's immediately noticeable is how erratic the Lorenz 96 curve appears. Although no throttling was reported by vcgencmd, it's possible there were still heat-related performance effects.
Here is a log of the temperatures during a run.
You can see the clock speed rising, but the temperature only rises a little, peaking at 56C at the 2100MHz stage.
Since throttling starts at 80C and the SoC apparently is OK up to 120C, these seem low.
The idle speed is 400MHz to save power.

Code:

Time         CPU    Core    Vcore    Temp   Health
08:44:30     400     200    0.840    33.6     OK
08:45:30     600     258    0.840    33.6     OK
08:46:30     600     258    0.840    35.0     OK
08:47:30     600     258    0.840    34.0     OK
08:48:30     600     258    0.840    35.0     OK
08:49:30     600     258    0.840    35.5     OK
08:50:30     700     288    0.840    36.5     OK
08:51:30     700     288    0.840    36.5     OK
08:52:30     700     288    0.840    35.0     OK
08:53:30     700     288    0.840    36.5     OK
08:54:30     700     288    0.840    36.5     OK
08:55:30     800     317    0.840    36.0     OK
08:56:30     800     317    0.840    37.0     OK
08:57:30     800     317    0.840    36.5     OK
08:58:30     800     317    0.840    37.9     OK
08:59:30     900     346    0.840    37.9     OK
Time         CPU    Core    Vcore    Temp   Health
09:00:30     900     346    0.840    37.0     OK
09:01:31     900     346    0.840    37.9     OK
09:02:31     900     346    0.840    40.4     OK
09:03:31    1000     376    0.972    39.4     OK
09:04:31    1000     376    0.972    38.9     OK
09:05:31    1000     376    0.972    40.4     OK
09:06:31    1000     376    0.972    41.3     OK
09:07:31    1100     405    0.972    42.3     OK
09:08:31    1100     405    0.972    40.4     OK
09:09:31    1100     405    0.972    45.7     OK
09:10:31    1200     435    0.972    41.3     OK
09:11:31    1200     435    0.972    40.9     OK
09:12:31    1200     435    0.972    40.9     OK
09:13:31    1300     464    0.972    43.3     OK
09:14:31    1300     464    0.972    42.3     OK
09:15:31    1300     464    0.972    42.3     OK
Time         CPU    Core    Vcore    Temp   Health
09:16:31    1400     493    0.972    43.8     OK
09:17:31    1400     493    0.972    42.3     OK
09:18:31    1400     493    0.972    46.7     OK
09:19:31    1500     522    0.972    44.8     OK
09:20:31    1500     522    0.972    43.3     OK
09:21:31    1500     522    0.972    48.7     OK
09:22:31    1600     551    0.972    47.7     OK
09:23:31    1600     551    0.972    46.2     OK
09:24:31    1700     581    0.972    44.8     OK
09:25:31    1700     581    0.972    47.2     OK
09:26:31    1700     581    0.972    45.7     OK
09:27:31    1800     611    0.972    48.7     OK
09:28:31    1800     611    0.972    49.6     OK
09:29:31    1900     640    0.972    46.7     OK
09:30:32    1900     640    0.972    49.1     OK
09:31:32    1900     640    0.972    46.2     OK
Time         CPU    Core    Vcore    Temp   Health
09:32:32    2000     668    0.972    48.2     OK
09:33:32    2000     668    0.972    49.1     OK
09:34:32    2100     700    0.972    47.7     OK
09:35:32    2100     700    0.972    50.1     OK
09:36:32    2100     700    0.972    56.0     OK
09:37:32    2100     700    0.972    47.2     OK
09:38:32    2100     700    0.972    47.2     OK
09:39:32    2100     700    0.972    44.3     OK
09:40:32    2100     700    0.972    44.3     OK
09:41:32    2100     700    0.972    43.8     OK
09:42:32    2100     700    0.972    43.8     OK
09:43:32    2100     700    0.972    43.8     OK
09:44:32    2100     700    0.972    43.3     OK
09:45:32    2100     700    0.972    43.8     OK
09:46:32    2100     700    0.972    42.3     OK
09:47:32    2100     700    0.972    42.3     OK
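A log in this format is easy to summarise. The Python sketch below finds the highest temperature seen at each clock and the headroom below the 80C throttle point mentioned in the post. To keep it short it embeds only a few rows copied from the log; the function name and the six-column check are my assumptions about the layout, based on the rows shown here.

```python
THROTTLE_C = 80.0  # throttling threshold quoted in the post above

# A few rows copied from the log above; paste the full log here to analyse all of it.
LOG = """\
Time         CPU    Core    Vcore    Temp   Health
08:44:30     400     200    0.840    33.6     OK
08:49:30     600     258    0.840    35.5     OK
09:06:31    1000     376    0.972    41.3     OK
09:21:31    1500     522    0.972    48.7     OK
09:28:31    1800     611    0.972    49.6     OK
09:36:32    2100     700    0.972    56.0     OK
09:47:32    2100     700    0.972    42.3     OK
"""

def peak_by_clock(log):
    """Return {CPU MHz: highest temperature seen at that clock}."""
    peaks = {}
    for line in log.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[0] == "Time":
            continue  # skip blank lines and the repeated header
        mhz, temp = int(fields[1]), float(fields[4])
        peaks[mhz] = max(temp, peaks.get(mhz, temp))
    return peaks

peaks = peak_by_clock(LOG)
hottest_mhz = max(peaks, key=peaks.get)
print(f"peak {peaks[hottest_mhz]} C at {hottest_mhz} MHz, "
      f"{THROTTLE_C - peaks[hottest_mhz]:.1f} C below the throttle point")
```

For the rows included here it reports a peak of 56.0 C at 2100 MHz, 24.0 C below the throttle point, which matches the reading of the log given in the post.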

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Thu Apr 14, 2022 4:00 pm

jahboater wrote:
Thu Apr 14, 2022 9:23 am
ejolson wrote:
Wed Apr 13, 2022 8:44 pm
What's immediately noticeable is how erratic the Lorenz 96 curve appears. Although no throttling was reported by vcgencmd, it's possible there were still heat-related performance effects.
Here is a log of the temperatures during a run.
You can see the clock speed rising, but the temperature only rises a little, peaking at 56C at the 2100MHz stage.
Since throttling starts at 80C and the SoC apparently is OK up to 120C, these seem low.
The idle speed is 400MHz to save power.

The temperature does seem fine. Is that a script which calls vcgencmd or your own C program for printing the stats?

If you reboot the Pi and run the tests again with a clean page allocator does the Lorenz still behave the same way at higher clock speeds? You can select to run only the Lorenz test with the -r8 option (at the expense of the Pi ratio being wrong after that).

jahboater
Posts: 8608
Joined: Wed Feb 04, 2015 6:38 pm
Location: Wonderful West Dorset

Re: A Pi Pie Chart

Thu Apr 14, 2022 5:06 pm

ejolson wrote:
Thu Apr 14, 2022 4:00 pm
Is that a script which calls vcgencmd or your own C program for printing the stats?
It is a home-written C program that calls vcgencmd and one or two other things.
Despite spawning vcgencmd it is much faster than a shell script and lightweight enough not to artificially trigger the CPU scaling governor. The next refinement is to use the mailboxes directly, but this works. Same usage as vmstat.

Code:

/*
 *  Raspberry Pi System Monitor
 *
 *  This code is lightweight to avoid triggering the frequency scaling governor.
 *
 *  It would be preferable to use the mailbox directly. However compared to the shell
 *  script this C version does one less call to vcgencmd (collects CPU and CORE in one go),
 *  removes the script overheads, and improves the presentation.
 *  Worst case execution time per sample (700MHz Pi1) 87ms.
 *
 *  USAGE:  pistat [delay [count]]
 *  Delay and count behave as vmstat see "man vmstat" for info.
 *
 *  LIMITS (without simple code change)
 *  MEMORY >= 256MB && MEMORY <= 128GB
 *  CPU >= 100MHz && CPU < 10GHz
 *  CORE >= 100MHz && CORE < 1GHz
 *  TEMP >= 10C && TEMP < 100C
 *  VOLTS < 10.0
 *
 *  stress-ng --cpu 0 --cpu-method fft
 *
 *  Last delta: 22/04/2022  (record min and max temperatures)
 */

#define _GNU_SOURCE 1

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <errno.h>

#define chr(c) (char)((c) | 48)
#define scan(p,r,c) ({ do --r; while( *p++ != c && r ); p[-1] == c; })
#define result(st) fgets(in, IOSIZE, fp); s = in + st
#define collect(com,st) fp = popen("vcgencmd " #com, "r"); result(st)
#define inline __inline__ __attribute__((always_inline))
#define error(s) { write(2, s "\n", sizeof(s)); _exit(1); }

typedef uint16_t u16;
typedef uint32_t u32;
typedef double real;

#define IOSIZE 1024

static inline char *
utoa( char *p, u32 n )
{
  do
    *--p = chr(n % 10);
  while( n /= 10 );
  return p;
}

/*
 *  Get the Pi description including the memory size
 *  For example: Raspberry Pi 4 Model B Rev 1.4 8GB
 */
static char *
model( char * const in, char * const buf )
{
  const int fd = open("/proc/device-tree/model", O_RDONLY);
  if( fd < 0 )
    return in;
  char *tail, *p = in + read(fd, in, IOSIZE);
  close(fd);

  /*
   *  Get the Pi's memory size.
   *  Also properly validate vcgencmd (no checks are done for the periodic update).
   */
  errno = 0;
  FILE *fp = popen("vcgencmd get_config total_mem", "r");
  if( fp == NULL || errno )
    error("cannot run vcgencmd");
  fgets(buf, IOSIZE, fp);
  pclose(fp);
  if( memcmp(buf, "total_mem=", 10) )
    error("total_mem field missing");
  u32 mem = (u32)strtoul(buf + 10, &tail, 10);
  if( *tail != '\n' || mem < 256 || mem > 128*1024 || errno )
    error("total_mem field invalid");
  char *e = buf + 8;
  if( mem < 1024 )
    *e = 'M';
  else
  {
    mem /= 1024;
    *e = 'G';
  }
  e[1] = 'B';
  e = utoa(e, mem);
  *p++ = ' ';
  p = mempcpy(p, e, (size_t)((buf + 10) - e));
  fp = popen("vcgencmd get_mem gpu", "r");
  fgets(buf, IOSIZE, fp);
  pclose(fp);
  e = memchr(buf + 4, '\n', IOSIZE - 4);
  p = mempcpy(mempcpy(p, " (gpu ", 7), buf + 4, (size_t)(e - (buf + 4)));
  return mempcpy(p, "B)\n", 3);
}

  // Stats so far (time spent at each CPU frequency)
static char *
time_in_state( char *p, char *buf )
{
  const int fd = open("/sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state", O_RDONLY);
  if( fd < 0 )
    return p;
  const ssize_t file_len = read(fd, buf, IOSIZE);
  close(fd);
  bool align = false;
  ssize_t rem = file_len;
  for( char *t = buf, *ptr = buf; rem > 0 && scan(ptr, rem, '\n'); t = ptr )
  {
    if( t[6] != ' ' )
    {
      align = true;
      break;
    }
  }
  rem = file_len;
  for( char *ptr = buf; rem > 0 && scan(ptr, rem, '\n'); buf = ptr )
  {
    size_t len = 7;
    if( buf[6] == ' ' )
    {
      if( align )
	*p++ = ' ';
      len = 6;
    }
      // discard last three zero's giving MHz to match values below
    p = mempcpy(mempcpy(p, buf, len - 3), buf + len, (size_t)(ptr - buf) - len);
  }
  return p;
}

  // bump 199.99 to 200
static inline void
inc( char *p, size_t len )
{
  u32 n = 0;
  do
    n = n * 10 + (*p++ & 15);
  while( --len );
  utoa(p, n + 1);
}

int
main( int argc, const char *argv[] )
{
  char buf[IOSIZE], in[IOSIZE];
  char *s, *p;
  u32 lines = 15, delay = 0, count = 1;
  size_t len;
  FILE *fp;
  time_t t;
  u16 tmp;

  if( argc > 1 )
  {
    char *tail;
    delay = (u32)strtoul(argv[1], &tail, 10);
    if( tail == argv[1] )
      error("invalid delay");
    if( argc > 2 )
    {
      count = (u32)strtoul(argv[2], &tail, 10);
      if( tail == argv[2] )
	error("invalid count");
    }
    else
      count = UINT32_MAX;
  }

    // headers
  write(1, in, (size_t)(time_in_state(model(in, buf), buf) - in));

  buf[5] = buf[2] = ':';

  real cur_temp, min_temp = 1000.0, max_temp = 0.0;

  for( u32 i = 1; i <= count; ++i )
  {
    memset(p = buf + 8, ' ', 64);

      // CPU MHz "frequency(48)=2100000000"
    collect(measure_clock arm core, 14);
    len = 3 + (s[9] != '\n');
    if( s[len] == '9' )
      inc(s, len);
    memcpy((p += 8) - len, s, len);

      // CORE MHz "frequency(1)=200000000"
    result(13);
    if( s[3] == '9' )
      inc(s, 3);
    memcpy((p += 8) - 3, s, 3);
    pclose(fp);

      // VOLTAGE "volt=1.2000V"
    collect(measure_volts, 5);
    len = 5 + (s[5] != '0');
    if( s[len-2] == '0' && s[len-1] == '0' )
      len -= 2;
    memcpy((p += 9) - len, s, len);
    pclose(fp);

      // TEMPERATURE "temp=39.0'C"
    collect(measure_temp, 5);
    cur_temp = strtod(s, NULL);
    if( cur_temp < min_temp )
      min_temp = cur_temp;
    if( cur_temp > max_temp )
      max_temp = cur_temp;
    memcpy((p += 8) - 4, s, 4);
    pclose(fp);
    p += sprintf(p, "   %.1f   %.1f", min_temp, max_temp);

      // HEALTH "throttled=0x0\n"
    collect(get_throttled, 10);
    memcpy(&tmp, s + 2, 2);
    if( tmp == 0x0A30 )  // "0\n"
      memcpy((p += 8) - 3, "OK\n", 4);
    else
      p = stpcpy(p + 2, s);
    pclose(fp);

      // TIME
    time(&t);
    struct tm const * const now = localtime(&t);
    buf[0] = chr(now->tm_hour / 10);
    buf[1] = chr(now->tm_hour % 10);
    buf[3] = chr(now->tm_min / 10);
    buf[4] = chr(now->tm_min % 10);
    buf[6] = chr(now->tm_sec / 10);
    buf[7] = chr(now->tm_sec % 10);

      // Print record
    if( ++lines == 16 )
    {
      write(1, "Time         CPU    Core    Vcore    Temp....Min....Max  Health\n", 64);
      lines = 0;
    }
    write(1, buf, (size_t)(p - buf));

    if( i < count )
      sleep(delay);
  }
}
ejolson wrote:
Thu Apr 14, 2022 4:00 pm
If you reboot the Pi and run the tests again with a clean page allocator does the Lorenz still behave the same way at higher clock speeds? You can select to run only the Lorenz test with the -r8 option (at the expense of the Pi ratio being wrong after that).
Here is the Lorenz only run, immediately after a reboot.
Looks just as random. Interesting. The other benchmarks were reasonably linear, so why is this one different?

Code: Select all

frequency(48)=600169920
frequency(1)=258583008
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=1.33693 Mflops=2409.42

The 600MHz has Raspberry Pi ratio=2.56334
Making pie charts...done.
frequency(48)=700154304
frequency(1)=288101088
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=1.09299 Mflops=2947.17

The 700MHz has Raspberry Pi ratio=2.69575
Making pie charts...done.
frequency(48)=800191424
frequency(1)=317157728
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.994161 Mflops=3240.14

The 800MHz has Raspberry Pi ratio=2.76038
Making pie charts...done.
frequency(48)=900175808
frequency(1)=346609856
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.931206 Mflops=3459.2

The 900MHz has Raspberry Pi ratio=2.8059
Making pie charts...done.
frequency(48)=1000212864
frequency(1)=376457504
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.847189 Mflops=3802.25

The 1000MHz has Raspberry Pi ratio=2.87301
Making pie charts...done.
frequency(48)=1100249984
frequency(1)=405435072
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.776729 Mflops=4147.17

The 1100MHz has Raspberry Pi ratio=2.93606
Making pie charts...done.
frequency(48)=1200287104
frequency(1)=434953120
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.71486 Mflops=4506.09

The 1200MHz has Raspberry Pi ratio=2.99763
Making pie charts...done.
frequency(48)=1300324224
frequency(1)=464247072
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.714179 Mflops=4510.39

The 1300MHz has Raspberry Pi ratio=2.99834
Making pie charts...done.
frequency(48)=1400361344
frequency(1)=493659680
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.646167 Mflops=4985.13

The 1400MHz has Raspberry Pi ratio=3.0743
Making pie charts...done.
frequency(48)=1500398464
frequency(1)=522452640
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.569199 Mflops=5659.23

The 1500MHz has Raspberry Pi ratio=3.17334
Making pie charts...done.
frequency(48)=1600382848
frequency(1)=551377472
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.553911 Mflops=5815.42

The 1600MHz has Raspberry Pi ratio=3.19502
Making pie charts...done.
frequency(48)=1700419968
frequency(1)=581831552
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.564015 Mflops=5711.24

The 1700MHz has Raspberry Pi ratio=3.18061
Making pie charts...done.
frequency(48)=1800457088
frequency(1)=611600128
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.66178 Mflops=4867.52

The 1800MHz has Raspberry Pi ratio=3.05601
Making pie charts...done.
frequency(48)=1900494080
frequency(1)=639997568
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.729599 Mflops=4415.06

The 1900MHz has Raspberry Pi ratio=2.98237
Making pie charts...done.
frequency(48)=2000478464
frequency(1)=668658688
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.638214 Mflops=5047.25

The 2000MHz has Raspberry Pi ratio=3.08384
Making pie charts...done.
frequency(48)=2100515584
frequency(1)=699996096
pichart -- Raspberry Pi Performance OPENMP version 36

Lorenz 96            N=32768 K=16384 Workers=2 Sec=0.712677 Mflops=4519.89

The 2100MHz has Raspberry Pi ratio=2.99992
Making pie charts...done.
Last edited by jahboater on Fri Apr 22, 2022 10:50 am, edited 2 times in total.

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Thu Apr 14, 2022 9:27 pm

jahboater wrote:
Thu Apr 14, 2022 5:06 pm
Here is the Lorenz only run, immediately after a reboot.
Looks just as random. Interesting. The other benchmarks were reasonably linear, so why is this one different?
Here is the comparison of the two runs:

Image

In a way the second run looks worse. Even with the normalization at the beginning giving the new run a 5 percent advantage, it manages to fall behind the original run at 1300 MHz and doesn't ever catch up in a significant way.

I find it amusing that 2000 MHz was a local maximum of performance for both runs and that the speed of the AXI bus divides evenly into 2000.

From what I can tell, the Lorenz 96 simulation is the only computational task in the group which might be vectorizable. Maybe NEON instructions are stalled for an extra cycle compared to non-vector operations when the AXI bus speed doesn't mesh with the CPU speed. If so, then this could explain the weirdness in the frequency scaling results.

I asked the dog developer for a second opinion, but the only reply was some barking about computer literacy and electromigration. For that I migrated Fido back to the dog house.

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Tue May 24, 2022 8:23 pm

ejolson wrote:
Thu Jan 06, 2022 3:50 am
I installed gcc version 11.2 on the 4-core ARM Neoverse-N1 instance in the free-tier Oracle cloud and obtained

Code: Select all

$ ./pichart-openmp -t "4-core Altra (gcc-11.2)"
pichart -- Raspberry Pi Performance OPENMP version 36

Prime Sieve          P=14630843 Workers=4 Sec=0.27152 Mops=3441.1
Merge Sort           N=16777216 Workers=8 Sec=0.435675 Mops=924.205
Fourier Transform    N=4194304 Workers=8 Sec=0.158582 Mflops=2909.37
Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.107189 Mflops=30051.8

The 4-core Altra (gcc-11.2) has Raspberry Pi ratio=114.664
Making pie charts...done.
The updated compiler led to slightly slower integer benchmarks but slightly faster floating point with the net result being about 6 percent faster.

I wonder whether the new ARM Neoverse-N2 based Graviton 3 processors coming to the Amazon cloud will be much faster.

https://www.nextplatform.com/2021/12/02 ... rver-chip/
According to

https://aws.amazon.com/blogs/aws/new-am ... rocessors/

the Graviton 3 is now generally available. A 4-core instance with 8GB RAM costs 0.145 US$ per hour. Therefore, prior to the supply-chain problems an 8GB Pi 4B would pay for itself in

75/0.145 = 517 hours = 21 days.

On the other hand the Graviton 3 processors are reportedly faster, so it might be reasonable to compare the performance-normalized cost.

To this end I spun up a c7g.xlarge instance, first with Amazon Linux, which seems to be based on RedHat 7, and then with Ubuntu Server 22.04 for comparison.

The single and multi-core results were

Code: Select all

Pi Ratios for the c7g.xlarge EC2 Instance
  
          Amazon Linux     Ubuntu 22.04
         serial  4-core   serial  4-core
gcc      39.182  137.97   47.162  165.52
clang    41.597           46.773  166.58
Note that libomp was not available in Amazon Linux, so no parallel runs using clang were performed and no Pi ratios have been reported in that case.

The first thing of interest is that Ubuntu was uniformly about 20 percent faster. It is not clear to me whether this was on account of newer compilers in Ubuntu or a lucky placement of the VM instance in the cloud. In order to not spend the monthly dog-treat budget, no further investigation was performed.

I also noticed the Graviton 3 seemed about 40 percent faster than the ARM instances in the Oracle cloud. However, additional performance tuning and benchmarks would be required to draw meaningful conclusions.

At any rate, upon recalling the 4B yields a 31.42 Pi ratio, the numbers reported here suggest it would take

21.5 (165.52/31.42) = 113 days

for a Pi 4B to pay for itself after taking performance into account.

When comparing the cost trade-offs between on-premise and cloud, there are other factors such as networking and electricity that need to be taken into account. While the hybrid cloud solutions promoted by IBM

https://www.ibm.com/cloud/hybrid

appear to combine the best of both options, it should be pointed out that Power10 is not generally available in the cloud while capable ARM instances may be found in many places.

Could the Raspberry Pi be part of an effective hybrid cloud strategy? As he that passeth by, and meddleth with strife belonging not to him, is like one that taketh a dog by the ears, I decided to let the sleeping dog developer lie.

Maybe I'll ask the question later on our walk to the park.

The transcript from the run is

Code: Select all

$ ./pichart-openmp -t c7g.xlarge
pichart -- Raspberry Pi Performance OPENMP version 36

Prime Sieve          P=14630843 Workers=4 Sec=0.181739 Mops=5141.03
Merge Sort           N=16777216 Workers=8 Sec=0.403133 Mops=998.81
Fourier Transform    N=4194304 Workers=8 Sec=0.0815835 Mflops=5655.23
Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.0774795 Mflops=41575.2

The c7g.xlarge has Raspberry Pi ratio=165.519
Making pie charts...done.
$ ./pichart-serial -t c7g.xlarge
pichart -- Raspberry Pi Performance Serial version 36

Prime Sieve          P=14630843 Workers=2 Sec=0.7451 Mops=1253.96
Merge Sort           N=16777216 Workers=2 Sec=1.58285 Mops=254.385
Fourier Transform    N=4194304 Workers=1 Sec=0.319812 Mflops=1442.64
Lorenz 96            N=32768 K=16384 Workers=2 Sec=0.186274 Mflops=17292.9

The c7g.xlarge has Raspberry Pi ratio=47.1621
Making pie charts...done.
$ gcc --version
gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
and here are the Pi charts:

Image

Image

I find it reassuring, despite the other problems in the world, that there is still significant progress being made in cloud infrastructure and performance.

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Wed May 25, 2022 8:18 am

ejolson wrote:
Wed May 27, 2020 6:58 am
Following an updated version of the calculation given in

https://www.raspberrypi.org/forums/view ... 0#p1537398

implies the price I can charge my little brother for using the Pi 4B has been reduced from US$ 0.05712 to about US$ 0.04873 per hour.
Relative to the Graviton 3 the performance-normalized hourly charge for the Pi 4B would be

0.145*31.42/165.52 = US$ 0.02752 per hour.

Therefore, even after the recent money-printing and inflation, the new Graviton 3 instances are about twice as cost effective as the original Graviton. While that's not as cheap as the free 4-core ARM instances in the Oracle cloud, I'm sticking with Raspberry Pi since the included version of Mathematica more than offsets the cost of paying the electric bill.

ejolson
Posts: 10725
Joined: Tue Mar 18, 2014 11:47 am

Re: A Pi Pie Chart

Sun Jun 05, 2022 4:22 am

ejolson wrote:
Wed May 25, 2022 8:18 am
ejolson wrote:
Wed May 27, 2020 6:58 am
Following an updated version of the calculation given in

https://www.raspberrypi.org/forums/view ... 0#p1537398

implies the price I can charge my little brother for using the Pi 4B has been reduced from US$ 0.05712 to about US$ 0.04873 per hour.
Relative to the Graviton 3 the performance-normalized hourly charge for the Pi 4B would be

0.145*31.42/165.52 = US$ 0.02752 per hour.
Since the four-core Graviton 2 instance measured previously gave a Pi ratio of about 100.148 while the new Graviton 3 measured 165.52, the new processor is 1.65 times faster, assuming the differences between Ubuntu 20.04 and 22.04 are not significant.

Another comparison between Graviton 2 and 3 processors appears in

https://www.daemonology.net/blog/2022-0 ... ton-3.html

That study found a 1.4 times speedup when compiling a number of open-source projects using the Graviton 3.

It's possible some of the speedup is due to differences in how the storage was provisioned on the respective instances. On the other hand 1.4 lies between the extremes of the speedups observed for the individual tests in the Pi chart program, which makes it a plausible result for a compute-bound task.

I'd provide a detailed analysis except the local web server here is offline due to increased security on account of the war. There is also the problem of currently being surrounded by mosquitos that are attracted to the LCD display.

Although the BARK™ may be worse than the BYTE

Image
https://wiki.theretrowagon.com/wiki/BYT-8

those insect bites are more irritating than any bark I've known.

lurk101
Posts: 2165
Joined: Mon Jan 27, 2020 2:35 pm
Location: Cumming, GA (US)

Re: A Pi Pie Chart

Sun Jun 26, 2022 2:18 pm

The links to the Pi Pie Chart source code from the original post all seem to be dead!

Anyone have a copy?
History doesn’t repeat itself, it rarely even rhymes.
