Posts: 1
Joined: Tue Nov 05, 2013 11:15 pm

valgrind unhandled instruction 0xF1010200

Tue Nov 05, 2013 11:29 pm

When running valgrind on a compiled C++ binary that executes statements such as:
cout << 6;
cout << endl;
or any cout << <integer>

valgrind will report
unhandled instruction: 0xF1010200

my environment:
Linux raspberrypi 3.6.11+ #538 PREEMPT Fri Aug 30 20:42:08 BST 2013 armv6l GNU/Linux
gcc 4.7.2 (Debian 4.7.2-5+rpi1)

This error has been discussed in the valgrind bug list:

and they respond:
This is "SETEND BE" (encoding A1), which means "switch to big-endian mode".
So (a) this program is doing something pretty weird and (b) I'm not surprised valgrind isn't supporting it.
Looking at the backtrace I suspect this is the following memcmp-for-rpi implementation: ... cmp.S#L214
which ultimately ends with this comment:
On further consideration it's not merely a question of "little enthusiasm", but more like "would require major rework of the ARM-level JIT machinery to fix". So I'm going to WONTFIX this.
I have the same issue as the valgrind bug OP.

Before I submit a formal bug, I thought I'd ask: is this being encountered by anyone else?

That memcmp may be a lark, but it does seem strange to switch to big-endian there.

Raspberry Pi Engineer & Forum Moderator
Posts: 31342
Joined: Sat Jul 30, 2011 7:41 pm

Re: valgrind unhandled instruction 0xF1010200

Wed Nov 06, 2013 1:48 pm

I've been told....

On ARM11, the SETEND instruction takes one cycle, so it is very quick (it is much slower on Cortex), and the memcmp is about twice as fast with SETEND as without it. It could be changed to a slower version without a huge hit, but SETEND is also used in another place, somewhere in the H264 code, and removing it there would make that unusable.

SETEND missing in QEMU and Valgrind is indeed a bug in those apps.

Principal Software Engineer at Raspberry Pi Ltd.
Working in the Applications Team.

Posts: 3
Joined: Wed Nov 06, 2013 1:57 pm

Re: valgrind unhandled instruction 0xF1010200

Wed Nov 06, 2013 2:19 pm

memcmp has got to be close to the ideal candidate for the use of SETEND. If you look at its definition, it returns <0, =0 or >0 depending upon how the data blocks compare, on an unsigned byte-by-byte basis, with bytes at lower addresses more significant than those at higher addresses. Thus if you interpret 4 consecutive bytes as a big-endian word, a single 32-bit unsigned compare effectively acts as four SIMD unsigned byte compares in a single cycle. When in SETEND BE mode, the words read from memory are loaded into registers already in big-endian order, for zero cycles of overhead. These two factors combined lead to a very tight, fast inner loop. There is a REV instruction in ARM11 for explicitly swapping the endianness of a value already in a register, but it can't compete against zero cycles for speed.
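To make the word-compare idea concrete, here is a minimal portable sketch in C. It is not the Raspberry Pi assembly routine: the names load_be32 and memcmp_be_words are made up for illustration, and the big-endian word is assembled with shifts instead of being loaded in SETEND BE mode, so it shows the ordering argument only, not the speed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build a 32-bit word with the lowest address as the
   most significant byte, i.e. the value a word load would give in
   big-endian mode. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical name; illustrates the technique only. */
int memcmp_be_words(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    /* Compare four bytes at a time as big-endian words. */
    while (n >= 4) {
        uint32_t wa = load_be32(pa);
        uint32_t wb = load_be32(pb);
        if (wa != wb)
            return wa < wb ? -1 : 1;  /* one unsigned compare orders all 4 bytes */
        pa += 4;
        pb += 4;
        n -= 4;
    }

    /* Compare the remaining 0 to 3 bytes one at a time. */
    for (; n > 0; n--, pa++, pb++) {
        if (*pa != *pb)
            return *pa < *pb ? -1 : 1;
    }
    return 0;
}
```

Because the byte at the lowest address ends up most significant, the first differing byte decides the unsigned word comparison, which is exactly the ordering memcmp is defined to return.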

Switching endian modes is indeed very fast on ARM11, and even on CPUs where it isn't, it's only performed once on each enter/exit of memcmp.

Unfortunately it does seem to be a tough instruction for the emulator people to implement though, and because ARM defined it as a mandatory instruction, there's no capabilities register by which they can indicate that their virtual CPU doesn't provide it.

Posts: 6
Joined: Fri May 08, 2015 6:45 pm

Re: valgrind unhandled instruction 0xF1010200

Fri May 08, 2015 6:49 pm

Just wondering if there's been any change to the status of this. This bug makes valgrind unusable on Raspberry Pi, which is a real shame. Unfortunately, it looks like the valgrind folks have chosen to make this bug a 'won't fix'.

Posts: 19132
Joined: Tue Jul 17, 2012 3:02 pm

Re: valgrind unhandled instruction 0xF1010200

Sat May 09, 2015 10:56 am

"memcmp has got to be close to the ideal candidate for the use of SETEND"

I don't buy this explanation for using SETEND at all.

If you are comparing blobs of memory, clearly doing it a 32-bit word at a time is going to be much faster than doing it byte by byte.

So what if you are making the comparison with big endian or little endian reads and compares?

Two things can happen:

1) Those 32 bit reads and compares don't fail. Your data is the same. This takes the same time no matter if done in big or little endian mode.

2) At some point the 32 bit comparison fails. Now you have to find out which of those 32 bit quantities is bigger, or smaller than the other. For this you will need to get the bytes into the right order for comparison as 32 bits. Or perhaps make the last comparison again, this time byte by byte.

My contention is that the little complication at the end of a failed comparison when in little endian mode is not a significant overhead. Probably hardly noticeable in the scale of the run time of the vast majority of programs.
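The two cases above can be sketched in C roughly like this. It is only an illustration of the argument, not a tuned routine: the name memcmp_word_then_byte is made up, words are compared in whatever byte order the CPU is already using, and only the final mismatching word is re-examined byte by byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; illustrates the technique only. */
int memcmp_word_then_byte(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    /* Case 1: words are equal, so the byte order of the load is irrelevant. */
    while (n >= 4) {
        uint32_t wa, wb;
        memcpy(&wa, pa, 4);  /* native-endian load, safe for unaligned data */
        memcpy(&wb, pb, 4);
        if (wa != wb)
            break;           /* Case 2: the difference is within these 4 bytes */
        pa += 4;
        pb += 4;
        n -= 4;
    }

    /* Resolve the mismatching word (or the short tail) byte by byte. */
    for (; n > 0; n--, pa++, pb++) {
        if (*pa != *pb)
            return *pa < *pb ? -1 : 1;
    }
    return 0;
}
```

The byte loop runs over at most four bytes before it hits the difference, so the extra cost is paid once per call and only when the blocks differ.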

As such I side with the valgrind guys. It's a bug in memcmp that should be fixed. Having working code analysis tools is preferable to having a 0.1% speed up in memcmp.

My guess is that when you want to use valgrind you can link in your own version of memcmp that does not use SETEND BE, perhaps even a simple memcmp written in C. This should be possible as library functions are usually specified with "weak" linkage, so if the same function exists in your application there is no error.
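A simple memcmp written in C could look like the sketch below. It is given the made-up name simple_memcmp here so it can be tried alongside the library version. To use it as the replacement you would rename it to memcmp in your application; whether your copy actually takes precedence over the library one depends on how the program is linked, which I have not verified on the Pi.

```c
#include <stddef.h>

/* Hypothetical name; rename to memcmp to try it as a replacement. */
int simple_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    /* Plain byte-by-byte compare: no word loads, no SETEND. */
    for (; n > 0; n--, pa++, pb++) {
        if (*pa != *pb)
            return *pa < *pb ? -1 : 1;
    }
    return 0;
}
```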

This will of course be a bit slower. But then valgrind is renowned for being slow anyway.

The other option is to use the clang compiler and its analysis tools, MemorySanitizer, AddressSanitizer, LeakSanitizer and ThreadSanitizer, which are far superior to valgrind anyway.
Slava Ukrayini.
