
Re: armhf SIGILL, Illegal Instruction



On Wed, Sep 29, 2021 at 4:06 PM Ash Hughes <sehguh.hsa@gmail.com> wrote:
>
> Hi,
>
> I've been getting some programs terminated with SIGILL today, and I'm
> trying to find out if this is a package issue or if Debian (Bullseye) is
> no longer compatible with my ARM machine. I first got an error with
> onedrive, with gdb output:
>
> #0  0xb6948ca8 in gc.impl.conservative.gc.Gcx.fullcollect(bool) ()
>     from /usr/lib/arm-linux-gnueabihf/libdruntime-ldc-shared.so.94
>
> which is "vldr    d18, [pc, #216] ;".
>
> I then tried to run ldc2, and I got something similar:
>
> Core was generated by `ldc2 -c --output-o -conf= -w -mattr=-neon -O3
> -release -relocation-model=pic -d'.
> Program terminated with signal SIGILL, Illegal instruction.
> #0  0x0089e15c in
> dmd.parse.Parser!(dmd.astcodegen.ASTCodegen).Parser.parsePrimaryExp() ()
>
> which is also a vldr instruction ("vldr    d16, [r6, #80]  ; 0x50")
>
> Finally, I tried to compile ldc2 myself and running it I got:
>
> #0  0xb4a6eabc in ?? () from /usr/lib/arm-linux-gnueabihf/libLLVM-11.so.1
>
> also vldr ("vldr        d16, [sp, #8]")
>
> It looks like the vldr instruction is being used in several LLVM
> packages, in a way my CPU doesn't like. Here's my cpuinfo:
>
> processor       : 0
> model name      : ARMv7 Processor rev 1 (v7l)
> BogoMIPS        : 37.39
> Features        : half thumb fastmult vfp edsp thumbee vfpv3 vfpv3d16 tls idivt
> CPU implementer : 0x56
> CPU architecture: 7
> CPU variant     : 0x1
> CPU part        : 0x581
> CPU revision    : 1
>
> Hardware        : Marvell Armada 370/XP (Device Tree)
> Revision        : 0000
> Serial          : 0000000000000000
>
> I don't have neon, although I think armhf doesn't require it, unless
> this has changed for Bullseye? If neon isn't required for Debian armhf,
> does this mean some LLVM related packages could be built differently to
> improve compatibility?

I think John Paul Adrian Glaubitz (with the help of others) on the
PowerPC mailing list determined that Autotools is the problem. An M4
macro used at configure time detects the CPU of the build machine and
ends up selecting the wrong platform or features for the package.
This is new behavior.
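
One thing that fits with this: the three faulting instructions you
quoted all touch d16 or d18. As far as I know, a VFPv3-D16 unit (the
vfpv3d16 flag in your cpuinfo) only has d0 through d15, so code built
for a 32-register FPU would trap exactly there. Here is a rough Python
sketch of that check. The function names are made up for illustration,
and the mapping from flags to register count is my assumption, so treat
it as a sanity check rather than a diagnosis.

```python
import re

def d_register_count(features):
    """Guess how many 64-bit D registers the cpuinfo flags imply (0 if no VFP)."""
    flags = set(features.split())
    if "neon" in flags or "vfpd32" in flags:
        return 32
    # Assumption: without neon/vfpd32, conservatively assume only d0-d15.
    if "vfpv3d16" in flags or "vfp" in flags:
        return 16
    return 0

def highest_d_register(instruction):
    """Return the largest dN index named in a disassembled line, or -1 if none."""
    indices = [int(n) for n in re.findall(r"\bd(\d+)\b", instruction)]
    return max(indices, default=-1)

def is_supported(instruction, features):
    """True if every D register the instruction names exists on this CPU."""
    return highest_d_register(instruction) < d_register_count(features)

# Flags and instructions copied from the quoted report.
FEATURES = "half thumb fastmult vfp edsp thumbee vfpv3 vfpv3d16 tls idivt"
FAULTS = ["vldr d18, [pc, #216]", "vldr d16, [r6, #80]", "vldr d16, [sp, #8]"]

for insn in FAULTS:
    verdict = "ok" if is_supported(insn, FEATURES) else "needs d16-d31"
    print(insn, "->", verdict)
```

The same helper could be pointed at lines of objdump output to see
which other packages reference the upper registers.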

Also see Bug #995223: libffi: SIGILL on powerpc and ppc64 systems
since libffi8, https://lists.debian.org/debian-powerpc/2021/09/msg00051.html.
In particular, from a followup at
https://lists.debian.org/debian-powerpc/2021/09/msg00077.html:

<QUOTE>
It turns out that m4/ax_gcc_archflag.m4 contains code to detect the
baseline of the host system and sets the GCC architecture accordingly.

Thus, a libffi compiled on a POWER8 machine will not work on a POWER5
machine as the compiler is emitting POWER8 instructions in this case.

Since the m4 script contains such a host enviroment detection for aarch64
as well [1], this bug can potentially affect arm64 which is a release
architecture.

We should therefore pass "--enable-portable-binary" in debian/rules.

[1] https://github.com/libffi/libffi/blob/master/m4/ax_gcc_archflag.m4#L209
</QUOTE>
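
To make the failure mode in that quote concrete, here is a toy Python
model. It is NOT the logic of ax_gcc_archflag.m4; the function names
and the simple "newer CPU is a superset of older CPU" ordering are my
own simplifications. It only shows why a flag derived from the build
host breaks older machines, and why a portable build avoids that.

```python
# Toy model only: not the macro's real logic, just the failure mode.
POWER_LEVELS = ["power5", "power6", "power7", "power8"]  # oldest to newest

def cpu_flag(build_host, portable):
    """Flag a host-detecting configure step might pass to GCC (hypothetical)."""
    return None if portable else f"-mcpu={build_host}"

def can_run(flag, target):
    """True if a binary built with `flag` should run on `target`."""
    if flag is None:
        # Assumption: with no CPU flag, the compiler default baseline
        # is at or below every machine we care about.
        return True
    built_for = flag.split("=", 1)[1]
    return POWER_LEVELS.index(built_for) <= POWER_LEVELS.index(target)

# Built on a POWER8 buildd, run on a POWER5 user machine.
print(can_run(cpu_flag("power8", portable=False), "power5"))
print(can_run(cpu_flag("power8", portable=True), "power5"))
```

In this model the first call reports a mismatch and the second does
not, which is the effect "--enable-portable-binary" is meant to have.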

This thread is also of interest:
https://lists.debian.org/debian-powerpc/2021/09/msg00048.html. There's
a lot of back-and-forth, but it is where the problem is revealed.

I could be mistaken, so take it with a grain of salt.

Jeff

