Pi 5 L1 cache size
I keep seeing mention of the Pi 5 having 64 KB L1 cache per core, but then I see the specs for the Cortex-A76 showing separate L1 data and instruction caches, each of 64 KB. I tried using tools like lscpu, which worked on the Pi 4, but the cache info is missing for the Pi 5. So is it 64 KB combined D and I L1 cache per core, or 64 KB D + 64 KB I?
Re: Pi 5 L1 cache size
I think it is 64kB data + 64kB instructions. Install package cpuinfo:
Code: Select all
$ cache-info
Max cache size (upper bound): 4194304 bytes
L1 instruction cache: 4 x 64 KB, 4-way set associative (256 sets), 64 byte lines, shared by 1 processors
L1 data cache: 4 x 64 KB, 4-way set associative (256 sets), 64 byte lines, shared by 1 processors
L2 data cache: 4 x 256 KB (inclusive), 8-way set associative (512 sets), 64 byte lines, shared by 1 processors
L3 data cache: 1 MB (exclusive), 16-way set associative (1024 sets), 64 byte lines, shared by 4 processors
Re: Pi 5 L1 cache size
Thanks for the reply. I wasn't aware of the cache-info command but there seems to be something strange going on. For the Pi 4 the info provided seems spot on.
Code: Select all
pi4$ cache-info
L1 instruction cache: 4 x 48 KB, 3-way set associative (256 sets), 64 byte lines, shared by 1 processors
L1 data cache: 4 x 32 KB, 2-way set associative (256 sets), 64 byte lines, shared by 1 processors
L2 data cache: 1 MB (inclusive), 16-way set associative (1024 sets), 64 byte lines, shared by 4 processors
This agrees with the data given for the BCM2711 SoC used in the Pi 4, to be found at https://www.raspberrypi.com/documentati ... ssors.html
Code: Select all
Caches: 32kB data + 48kB instruction L1 cache per core. 1MB L2 cache.
However, for the Pi 5 the cache-info command output looks a bit strange.
Code: Select all
pi5$ cache-info
L1 instruction cache: 4 x 64 KB, 4-way set associative (256 sets), 64 byte lines, shared by 1 processors
L1 data cache: 4 x 64 KB, 4-way set associative (256 sets), 64 byte lines, shared by 1 processors
L2 data cache: 4 x 256 KB (inclusive), 8-way set associative (512 sets), 64 byte lines, shared by 1 processors
L3 data cache: 1 MB (exclusive), 16-way set associative (1024 sets), 64 byte lines, shared by 4 processors
Whereas this shows the separate 64 KB data and instruction caches (which I believe to be the case), the L2 and L3 values disagree with the data provided for the BCM2712 used in the Pi 5 in the same document, which gives figures that double the sizes of both cache levels.
Code: Select all
512KB L2 per core, 2MB shared L3
The reason I asked the question in the first place is because there seems to be a lot of conflicting information out there, such as that given in this review: https://www.pcmag.com/reviews/raspberry-pi-5.
Re: Pi 5 L1 cache size
I was also under the impression the L3 cache size is 2MB. The cache-info output seems to disagree with the published specifications.
In the end, performance is what's important. I wonder if Chips and Cheese (https://chipsandcheese.com/) has a test that could confirm the L3 cache size.
Re: Pi 5 L1 cache size
cache-info is only accurate for specific SoC models. It doesn't probe the hardware; it uses the CPU part number shown in /proc/cpuinfo to look up the cache configuration in an internal database. The data for Cortex-A76 is incomplete, as it assumes L2 is always 256K and L3 is always 1M. There is a special case handler for the Kirin 980 to adjust those values. Everything else, including the BCM2712, gets the default values.
There are cache description registers but I don't know how to access them under Linux. cleverca22 has got code to display them but it's bare metal.
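To illustrate the kind of lookup described above, here is a minimal Python sketch that pulls the "CPU implementer" and "CPU part" fields out of /proc/cpuinfo text and maps them to a core name. It is not cache-info's actual code; the function names and the tiny KNOWN_PARTS table are made up for illustration. The point is that nothing here touches the cache hardware, so any sizes attached to such a table are only as good as the table.

```python
import re

# Hypothetical, deliberately tiny lookup table: (implementer, part) -> core name.
# 0x41 is Arm; 0xd08 is Cortex-A72 (Pi 4) and 0xd0b is Cortex-A76 (Pi 5).
KNOWN_PARTS = {
    (0x41, 0xD08): "Cortex-A72",
    (0x41, 0xD0B): "Cortex-A76",
}


def parse_cpu_ids(cpuinfo_text):
    """Return one (implementer, part) integer pair per CPU entry in the text."""
    implementers = re.findall(
        r"^CPU implementer\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text, re.M)
    parts = re.findall(
        r"^CPU part\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text, re.M)
    return [(int(i, 16), int(p, 16)) for i, p in zip(implementers, parts)]


def core_names(cpuinfo_text):
    """Map each CPU entry to a core name, or 'unknown' if not in the table."""
    return [KNOWN_PARTS.get(ids, "unknown") for ids in parse_cpu_ids(cpuinfo_text)]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(core_names(f.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

A table-driven tool can identify the core this way, but the L2 and L3 sizes are chosen by the SoC designer, so they cannot be derived from the part number alone.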
Re: Pi 5 L1 cache size
It looks like cache-info is used to tune PyTorch for various processors. In that case, it is better to err on the side of too small than too large.
Re: Pi 5 L1 cache size
It looks more like it is incomplete to me than intentionally aiming low. Cortex-A76 L2 is 128kB-512kB and L3 is 0kB-4MB yet it picks 256kB and 1MB.
cleverca22
Re: Pi 5 L1 cache size
viewtopic.php?p=2130828&hilit=MIDR#p2130828
Unfortunately these OS commands can return hardcoded values for compatibility purposes.
You can check the CPU by directly reading its registers. Here is an example program that reads some ID registers:
when i run that on my pi5, i get:
Code: Select all
MIDR_EL1: 0x414fd0b1
Implementer: 0x41
Variant: 0x4
Architecture: 0xf
PartNum: 0xd0b
Revision: 0x1
[Reserved]: 0x0
VPIDR_EL2: SIGILL
REVIDR_EL1: 0
ID_AA64ISAR0_EL1: 0x100010211120
ID_AA64ISAR1_EL1: 0x100001
MVFR0_EL1: 0x200
MVFR1_EL1: 0x10011100
MVFR2_EL1: 0
and attempting to read CCSIDR_EL1 results in SIGILL
this would likely have to be moved into the kernel to function?
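For reference, the field breakdown of MIDR_EL1 shown above can be reproduced from the raw value with a few shifts and masks. This is only a sketch of the decoding step in Python, not the program that produced the output; decode_midr is a name chosen here for illustration.

```python
def decode_midr(midr):
    """Split a 32-bit MIDR_EL1 value into its architectural fields."""
    return {
        "Implementer": (midr >> 24) & 0xFF,   # bits [31:24]
        "Variant": (midr >> 20) & 0xF,        # bits [23:20]
        "Architecture": (midr >> 16) & 0xF,   # bits [19:16]
        "PartNum": (midr >> 4) & 0xFFF,       # bits [15:4]
        "Revision": midr & 0xF,               # bits [3:0]
    }


if __name__ == "__main__":
    # Raw value taken from the Pi 5 output above.
    for name, value in decode_midr(0x414FD0B1).items():
        print(f"{name}: {value:#x}")
```

Implementer 0x41 is Arm and PartNum 0xd0b is the Cortex-A76, which is the same part number that shows up as "CPU part" in /proc/cpuinfo.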
Re: Pi 5 L1 cache size
CCSIDR_EL1 got taken out of the kernel. That is why sysfs no longer reports size.
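To see what sysfs does and does not report on a given board, something like the following Python sketch can help. It reads the standard per-CPU cache attributes under /sys/devices/system/cpu/cpu0/cache/index*/ and records None for any attribute the kernel does not export. The function name read_cache_sysfs is just an illustration.

```python
from pathlib import Path

# Standard Linux cacheinfo attribute names under .../cache/indexN/
FIELDS = (
    "level",
    "type",
    "size",
    "ways_of_associativity",
    "number_of_sets",
    "coherency_line_size",
)


def read_cache_sysfs(cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Return one dict per cache index; missing attributes are set to None."""
    cache_dir = Path(cpu_dir) / "cache"
    if not cache_dir.is_dir():
        return []
    entries = []
    for index_dir in sorted(cache_dir.glob("index*")):
        entry = {}
        for field in FIELDS:
            try:
                entry[field] = (index_dir / field).read_text().strip()
            except OSError:
                entry[field] = None  # attribute not exported by this kernel
        entries.append(entry)
    return entries


if __name__ == "__main__":
    for entry in read_cache_sysfs():
        print(entry)
```

If the level and type entries are present but size, number_of_sets and friends come back as None, that would be consistent with the kernel no longer filling them in from CCSIDR_EL1.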
I kludged together a kernel module to read it. Arm notes that it may not resemble the real cache config but it looks okay to me.
L1 data cache 64 kB, 4-way set associative (256 sets), 64 byte lines. Attributes=WriteBack ReadAllocate WriteAllocate
L1 instruction cache 64 kB, 4-way set associative (256 sets), 64 byte lines. Attributes=ReadAllocate
L2 unified cache 512 kB, 8-way set associative (1024 sets), 64 byte lines. Attributes=WriteBack ReadAllocate WriteAllocate
L3 unified cache 2048 kB, 16-way set associative (2048 sets), 64 byte lines. Attributes=WriteBack ReadAllocate WriteAllocate
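As a sanity check on these numbers, cache size should equal ways x sets x line size. The Python sketch below checks that for the figures above and also shows how the three geometry fields are packed in CCSIDR_EL1, assuming the 32-bit layout used when FEAT_CCIDX is not implemented. The raw register values are constructed here from the reported geometry purely for illustration; they were not read from hardware, and the function names are made up.

```python
def decode_ccsidr(value):
    """Decode (ways, sets, line_bytes) from a CCSIDR_EL1 value.

    Assumes the 32-bit format without FEAT_CCIDX:
    LineSize [2:0] = log2(line bytes) - 4,
    Associativity [12:3] = ways - 1, NumSets [27:13] = sets - 1.
    """
    line_bytes = 1 << ((value & 0x7) + 4)
    ways = ((value >> 3) & 0x3FF) + 1
    sets = ((value >> 13) & 0x7FFF) + 1
    return ways, sets, line_bytes


def encode_ccsidr(ways, sets, line_bytes):
    """Inverse of decode_ccsidr; used only to build illustrative values."""
    return ((sets - 1) << 13) | ((ways - 1) << 3) | (line_bytes.bit_length() - 5)


def cache_kb(ways, sets, line_bytes):
    """Cache capacity in kB from its geometry."""
    return ways * sets * line_bytes // 1024


# (name, ways, sets, line bytes, reported size in kB) from the output above.
REPORTED = [
    ("L1 data", 4, 256, 64, 64),
    ("L1 instruction", 4, 256, 64, 64),
    ("L2 unified", 8, 1024, 64, 512),
    ("L3 unified", 16, 2048, 64, 2048),
]

if __name__ == "__main__":
    for name, ways, sets, line, size_kb in REPORTED:
        raw = encode_ccsidr(ways, sets, line)  # illustrative, not from hardware
        computed = cache_kb(*decode_ccsidr(raw))
        status = "ok" if computed == size_kb else "MISMATCH"
        print(f"{name}: {computed} kB computed, {size_kb} kB reported ({status})")
```

The same arithmetic applied to the cache-info defaults (8 ways x 512 sets and 16 ways x 1024 sets, both with 64 byte lines) gives 256 kB and 1024 kB, so both sets of numbers are internally consistent and differ only in the number of sets.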