
Re: Getting rid of alignment faults in userspace



On Sat, 18 Jun 2011, Arnaud Patard wrote:

> Dave Martin <dave.martin@linaro.org> writes:
> Hi,
> 
> > Hi all,
> >
> > I've recently become aware that a few packages are causing alignment
> > faults on ARM, and are relying on the alignment fixup emulation code in
> > the kernel in order to work.
> >
> > Such faults are very expensive in terms of CPU cycles, and can generally
> > only result from wrong code (for example, C/C++ code which violates the
> > relevant language standards, assembler which makes invalid assumptions,
> > or functions called with misaligned pointers due to other bugs).
> >
> > Currently, on a natty Ubuntu desktop image I observe no faults except
> > from firefox and mono-based apps (see below).
> >
> > As part of the general effort to make open source on ARM better, I think 
> > it would be great if we can disable the alignment fixups (or at least
> > enable logging) and work with upstreams to get the affected packages
> > fixed.
> >
> > For release images we might want to be more forgiving, but for development
> > we have the option of being more aggressive.
> >
> > The number of affected packages and bugs appears small enough for the
> > fixing effort to be feasible, without temporarily breaking whole
> > distros.
> >
> >
> > For ARM, we can achieve the goal by augmenting the default kernel command-
> > line options: either
> >
> >     alignment=3
> >         Fix up each alignment fault, but also log the faulting address
> >         and name of the offending process to dmesg.
> >
> >     alignment=5
> >         Pass each alignment fault to the user process as SIGBUS (fatal
> >         by default) and log the faulting address and name of the
> >         offending process to dmesg.
> 
> iirc, someone sent a patch some months/years ago to change the default

That was me.

> but it was rejected because some libcs, including glibc, do (or did?)
> some unaligned accesses [1], and this can happen early in the boot
> process. In that kind of case, getting a SIGBUS would hurt.

This is only partly true.

Rewind about 15 years, to when ARMv3 was all that Linux supported.  On 
ARMv3 there is no instruction for doing half-word loads/stores, and no 
instruction to sign-extend a loaded byte.

In those days, the compiler relied on a documented and architecturally 
defined behavior of misaligned loads/stores: the bytes of the otherwise 
aligned word are rotated, with the rotation amount defined by the 
sub-word offset.  This allowed certain optimizations that avoid extra 
shifts and masks.
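
To make the rotation concrete, here is a minimal Python sketch of such a 
misaligned word load next to the result most code expects.  It assumes a 
little-endian core and a rotate-right of 8 bits per byte of misalignment; 
the function names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def rotated_load(memory: bytes, addr: int) -> int:
    """Model of the old behavior (assumed little-endian, rotate right):
    fetch the aligned word containing addr, then rotate it by 8 bits
    for each byte of misalignment."""
    base = addr & ~3                      # enclosing aligned word
    word = int.from_bytes(memory[base:base + 4], "little")
    rot = 8 * (addr & 3)                  # sub-word offset picks the rotation
    return ((word >> rot) | (word << (32 - rot))) & MASK32

def byte_addressed_load(memory: bytes, addr: int) -> int:
    """What is commonly expected: the four bytes starting at addr."""
    return int.from_bytes(memory[addr:addr + 4], "little")

memory = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])

# Same address, two different answers: only the top byte differs here,
# which is exactly the kind of quiet corruption that goes unnoticed.
print(hex(rotated_load(memory, 1)))         # 0x11443322
print(hex(byte_addressed_load(memory, 1)))  # 0x55443322
```

Note that at offset 2 the rotated word carries the half-word stored at 
that address in its low 16 bits (0x4433 in this example), which is the 
sort of result those tricks exploited.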

Then a bunch of binaries were built with a version of GCC making use of 
those misaligned access tricks.

Then came along ARMv4 with its LDRH, LDRSH, and LDRSB instructions, 
making those misaligned tricks unnecessary.  Hence GCC deprecated those 
optimizations.  Today only the old farts amongst us still remember 
this.

So for quite a while now, a misaligned access on ARM before ARMv6 has 
been quite likely not to produce the commonly expected result.  That's 
why there is code in the kernel to trap and fix up misaligned accesses.  
However, it is turned off by default for user space.  Why?

Turns out that a prominent ARM developer still has binaries from the 
ARMv3 era around, and the default of not fixing up misaligned user space 
accesses exists to remain compatible with them.

So if your glibc is not from 15 years ago (one that old would have to 
be a.out rather than ELF), you do not want to let misaligned accesses 
go through unfixed; otherwise you'll simply have latent data corruption 
somewhere.
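
For reference, the two values quoted above (alignment=3 and alignment=5) 
read naturally as a bitmask.  The sketch below assumes bit 0 means "log", 
bit 1 "fix up" and bit 2 "send SIGBUS", which is consistent with both 
descriptions; the names are illustrative, not taken from the kernel.

```python
# Assumed bit layout, inferred from the two documented values:
#   3 = fix up + log, 5 = SIGBUS + log
ALIGNMENT_FLAGS = (
    (1, "log"),     # report faulting address and process name in dmesg
    (2, "fixup"),   # emulate the access in the kernel
    (4, "sigbus"),  # deliver SIGBUS to the offending process
)

def decode_alignment_mode(mode: int) -> list:
    """Return the names of the behaviors enabled by a mode value."""
    return [name for bit, name in ALIGNMENT_FLAGS if mode & bit]

print(decode_alignment_mode(3))  # ['log', 'fixup']
print(decode_alignment_mode(5))  # ['log', 'sigbus']
```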

> Also, as noted by someone else in the thread, you do want to test on
> something like armv5* or v4*, because there is a high chance that the
> trap used by the alignment fixup won't be triggered at all on >= armv6.

Given that Linaro is working only with Thumb2-compiled user space, that 
implies ARMv6 and above only.

> [1] See commit log of commit d944d549aa86e08cba080396513234cf048fee1f.

And note the "if not fixed up, results in segfaults" in that log, 
meaning that the current default is wrong for that case.


Nicolas

