EABI/OABI incompatibility
I tracked down a strange bug last week that I think folks should know
about. It involves an incompatibility when running OABI binaries on an
EABI+OABI kernel. I know of three EABI/OABI incompatibilities so far.

The first is that structure packing differs between EABI and OABI,
which is well documented on the wiki
(http://wiki.debian.org/ArmEabiPort). The result of this difference is
that user space binaries that pass structures to the kernel via ioctl
calls must be handled carefully. Specifically, both the kernel and the
user space binary need to be compiled with compilers that use the same
structure packing rules. ALSA is perhaps the most famous offender here.
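
To make the packing difference concrete, here is a minimal C sketch.
The struct and function names are made up for illustration; they are
not taken from ALSA or any real driver. It relies on the documented
rule that OABI aligns 64-bit members to 4 bytes while EABI aligns them
to 8, so the same declaration should come out as offset 4 / size 12
under OABI and offset 8 / size 16 under EABI.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical ioctl argument; not taken from ALSA or any real driver. */
struct demo_ioctl_arg {
    uint32_t flags;
    uint64_t position;  /* 64-bit member: 4-byte aligned on OABI, 8 on EABI */
};

/* Offset of the 64-bit member as laid out by the compiler building this file. */
size_t demo_position_offset(void)
{
    return offsetof(struct demo_ioctl_arg, position);
}

/* Total size of the struct, including any padding the ABI inserts. */
size_t demo_arg_size(void)
{
    return sizeof(struct demo_ioctl_arg);
}

/* Print the layout so builds from two toolchains can be compared. */
void demo_print_layout(void)
{
    printf("offsetof(position) = %lu, sizeof = %lu\n",
           (unsigned long)demo_position_offset(),
           (unsigned long)demo_arg_size());
}
```

Building this once with each toolchain and calling demo_print_layout()
from main should show whether kernel and user space agree on the layout.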

The second is that some versions of EABI compilers, notably the gcc-3
based EABI compilers from CodeSourcery, used a set of "shims" in
glibc. Instead of using the EABI kernel trap mechanism, they used the
OABI kernel trap mechanism, despite being "EABI". This means that
"EABI" binaries produced using these compilers will not run on EABI-only
kernels; they require a kernel that supports OABI.

The third, the new one, is strange. Using gcc-3 and glibc-2.3.5,
I've run into a situation where something is apparently composing a
trampoline on the stack that includes an "swi 0". This is an OABI
toolchain and environment, yet it is using "swi 0", which is the EABI
trap mechanism. On an EABI+OABI kernel, an "swi 0" trap means that
the actual trap number is in r7, but in this case the contents of r7
are essentially random garbage. This leads the kernel to send SIGILL
to the process, which looks like spurious SIGILLs to a programmer.
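
For anyone unfamiliar with the two trap conventions, the following is
a simplified model in C, not the kernel's actual entry code. All names
are hypothetical. It assumes the usual ARM Linux conventions: an OABI
trap encodes the syscall number in the swi immediate as 0x900000 + nr,
while an EABI trap is "swi 0" with the number in r7. Condition codes
and Thumb mode are ignored.

```c
#include <stdint.h>

#define ARM_SWI_OPCODE_MASK 0x0f000000u  /* bits 27..24 all set means swi */
#define ARM_SWI_IMM_MASK    0x00ffffffu  /* 24-bit immediate field */
#define OABI_SYSCALL_BASE   0x00900000u  /* OABI: immediate = base + nr */

enum trap_kind { TRAP_NOT_SWI, TRAP_OABI, TRAP_EABI, TRAP_UNKNOWN };

struct trap_info {
    enum trap_kind kind;
    uint32_t nr;        /* syscall number the kernel would act on */
};

/*
 * Classify one 32-bit ARM instruction word. r7 is the register value
 * at the time of the trap; it only matters for the EABI convention.
 */
struct trap_info classify_swi(uint32_t insn, uint32_t r7)
{
    struct trap_info t = { TRAP_NOT_SWI, 0 };
    uint32_t imm;

    if ((insn & ARM_SWI_OPCODE_MASK) != ARM_SWI_OPCODE_MASK)
        return t;                       /* not an swi at all */

    imm = insn & ARM_SWI_IMM_MASK;
    if (imm == 0) {
        t.kind = TRAP_EABI;             /* "swi 0": number comes from r7 */
        t.nr = r7;
    } else if ((imm & 0x00f00000u) == OABI_SYSCALL_BASE) {
        t.kind = TRAP_OABI;             /* number comes from the immediate */
        t.nr = imm - OABI_SYSCALL_BASE;
    } else {
        t.kind = TRAP_UNKNOWN;
    }
    return t;
}
```

With this model, classify_swi(0xef900004, 0) is an OABI trap for syscall
4, while classify_swi(0xef000000, r7) takes whatever happens to be in r7,
which is exactly the problem when OABI code never loaded r7.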

Using an OABI-only kernel solves this problem, but wherever the
trampoline is coming from needs to be checked and perhaps updated. I'm
posting here because this represents a form of EABI/OABI incompatibility
that I hadn't seen before. If you find yourself staring at a process
that appears to be getting spurious SIGILL signals, and you're
debugging an OABI binary, you might want to check whether you still see
that behavior using an OABI-only kernel.
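
If it helps with that kind of debugging, here is a rough sketch of a
SIGILL probe using plain POSIX sigaction with SA_SIGINFO. It is not
what I used to find the bug, and the function and variable names are
made up. The idea is only to print the address reported with the
signal; I'd assume, without having verified it on every kernel, that
an address inside the stack would be consistent with a trampoline
like the one described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t sigill_seen;   /* set once the handler has run */

/* Async-signal-safe hex printer: formats by hand and uses write(2). */
static void put_hex(unsigned long v)
{
    char buf[2 + 2 * sizeof v + 1];
    size_t i = sizeof buf;

    buf[--i] = '\n';
    do {
        buf[--i] = "0123456789abcdef"[v & 0xf];
        v >>= 4;
    } while (v != 0);
    buf[--i] = 'x';
    buf[--i] = '0';
    (void)write(STDERR_FILENO, buf + i, sizeof buf - i);
}

static void sigill_probe(int signo, siginfo_t *info, void *ctx)
{
    static const char msg[] = "SIGILL from a fault, reported address: ";

    (void)signo;
    (void)ctx;
    sigill_seen = 1;

    /* si_code > 0 means the kernel raised the signal, not kill()/raise(). */
    if (info->si_code > 0) {
        (void)write(STDERR_FILENO, msg, sizeof msg - 1);
        put_hex((unsigned long)info->si_addr);
        _exit(1);   /* returning would just re-run the faulting instruction */
    }
}

/* Install the probe; returns 0 on success, -1 on failure (as sigaction does). */
int install_sigill_probe(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = sigill_probe;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGILL, &sa, NULL);
}
```

Call install_sigill_probe() early in main of the affected program, then
compare the printed address against the process's stack range.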
--rich