
Re: armhf: abel.d.o hardware status ?



On Wed, Jun 29, 2022 at 11:34 AM Wookey <wookey@wookware.org> wrote:
>
> On 2022-06-29 15:13 +0200, Mathieu Malaterre wrote:
> > On Wed, Jun 29, 2022 at 2:48 PM Wookey <wookey@wookware.org> wrote:
>
> > > What exactly is going wrong when you try to use valgrind?
> >
> > Well you should see something like this on abel.d.o:
> >
> > * https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=928224#27
> >
> > Basically anytime you build valgrind using gcc-11 or gcc-12 (debian
> > sid package), you get this weird illegal instruction:
> >
> > ```
> > % ./vg-in-place
> > Illegal instruction
> > ```
>
> I have a strong suspicion that this is neon-itis. The issue generally
> manifests as 'illegal instruction' (i.e. a neon instruction is issued on
> hardware that isn't able to execute it). It has always been the case
> that software should not assume neon is present on v7 (because it
> isn't on all hardware), and most code gets this right, but I've
> recently seen gcc putting those instructions into the startup code
> (where the C-environment is set up and variables allocated) which gets
> executed _before_ any functions checking for which HWCAPS to enable,
> and thus which code to run.
> ...
> Also if you run the program under gdb (on abel) and when it barfs do:
> (gdb) disassemble
> and look for instructions that start with 'v', like 'vmov.i32'
> that will confirm which instruction is tripping it up.
>
> This bug has an example of the problem:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=998043
>
> I got partway through a long followup with some details of possible
> fixes some months ago but got sidetracked (and oh look it's been
> pending for 6 months already).
>
> The reason this has broken appears to be that gcc has changed the way
> the fpu is specified/defaulted, so neon _and_ fp are enabled by
> default if no specific fpu option is given (i.e. we just set
> -march=armv7). It used to be that -march=armv7 implied +nosimd (or
> something like that - I never quite got to the bottom of it enough to
> be sure exactly what the right general or specific fix was).
>
> If you rebuild with
> -march=armv7-a+nosimd+nofp
> or
> -march=armv7-a+nosimd+fp
> you should be able to determine if being more explicit about the fp
> and simd (neon) instructions used makes it behave.
>
> It seems likely that you have hit this problem.
> I think this is the same thing too: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=982794
> (Firefox dying with illegal instruction on non-neon hardware)
>
> I _suspect_ that debian needs to change the default flags to actually
> say 'armv7+fp+nosimd' by default so that we get what we expect (and
> define as the base ISA) and it doesn't depend on what hardware the
> build was done on.

Also see GCC Bug 104455, where you can't specify just -march=armv7-a
with GCC 11 (and probably above).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104455
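
One way to see what a given compiler and flag set takes as its baseline is
to ask the preprocessor. A minimal sketch, assuming the ACLE feature
macros __ARM_NEON (older compilers: __ARM_NEON__) and __ARM_FP; the
function names are made up for illustration and are not from either bug
report:

```c
#include <assert.h>
#include <stdio.h>

/* 1 if the compiler was told it may emit NEON for this translation unit.
 * Relies on the ACLE macro __ARM_NEON; __ARM_NEON__ is the older spelling. */
int compiler_assumes_neon(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* 1 if the compiler was told hardware floating point is available
 * (ACLE macro __ARM_FP). */
int compiler_assumes_fp(void)
{
#if defined(__ARM_FP)
    return 1;
#else
    return 0;
#endif
}

/* Print the compile-time baseline; says nothing about the CPU it runs on. */
void report_baseline(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "compiler may emit NEON: %s\n",
            compiler_assumes_neon() ? "yes" : "no");
    fprintf(out, "compiler may emit FP:   %s\n",
            compiler_assumes_fp() ? "yes" : "no");
}
```

Building this once with the distribution default flags and once with an
explicit +nosimd variant, then calling report_baseline(stdout), should
show whether NEON is part of the baseline in each case. It only reports
what the compiler was allowed to assume, not what the hardware supports.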

GCC really screwed folks by requiring them to declare the ISA at
compile time (like -march=armv7-a -mfpu=neon). You have to pass the
options to use the ISA in your own code, but then GCC thinks it can use
that ISA too, anywhere in the translation unit. Meanwhile, your code is
guarded at runtime while GCC's generated code dies with SIGILL. It's
been a constant source of problems for me on x86, ARM and PowerPC.
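
The usual workaround is to keep the guarded code in its own file and
build only that file with the extra ISA flags. A rough sketch, assuming
Linux on 32-bit ARM with getauxval() and HWCAP_NEON available; sum_neon
and the HAVE_SUM_NEON macro are hypothetical names for a NEON routine
that would live in a separate file compiled with -mfpu=neon:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_NEON */
#endif

typedef uint32_t (*sum_fn)(const uint32_t *data, size_t n);

/* Baseline path. This file must be built with baseline flags only,
 * otherwise the compiler is free to put NEON in here as well. */
uint32_t sum_scalar(const uint32_t *data, size_t n)
{
    uint32_t total = 0;
    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Ask the kernel whether the CPU has NEON. On anything that is not
 * 32-bit ARM Linux this sketch simply answers "no". */
int cpu_has_neon(void)
{
#if defined(__linux__) && defined(__arm__) && defined(HWCAP_NEON)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    return 0;
#endif
}

#ifdef HAVE_SUM_NEON
/* Hypothetical: defined in its own file, the only one built with NEON. */
uint32_t sum_neon(const uint32_t *data, size_t n);
#endif

/* Pick an implementation at runtime; fall back to the scalar path. */
sum_fn select_sum(void)
{
#ifdef HAVE_SUM_NEON
    if (cpu_has_neon())
        return sum_neon;
#endif
    return sum_scalar;
}
```

A caller would do something like select_sum()(buf, len). This only
protects code you control: if the startup code or the baseline files
were themselves built with NEON enabled, the SIGILL happens before the
check ever runs.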

I also think Debian got it wrong recently when they tied NEON to
ARMv7-a. Making the leap that ARMv7 includes NEON was simply a
mistake. But I understand why they did it for their standard build
configuration. They wanted to get rid of armel and ARMv5 support.

Microsoft compilers got it right. You can use any ISA the compiler
supports without options. It is up to you to guard the code properly
at runtime. And when you use an option like /arch:AVX, that tells
the compiler it can use instructions up to that ISA itself.

Jeff

