
Re: Bug#738575: pthread: segfault in libpthread on Intel Galileo board



Hi,

On 2015-05-11 20:43, Kinsella, Ray wrote:
> Package: libc6-i386
> Version: 2.19-17
> 
> Hi all,
> 
> I thought that the glibc mailing list was a better home for this
> discussion than Debian-devel; if I have erred, please redirect me.
> 
> I am following up on Bug#738575: 
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=738575
> 
> Put simply, the Intel X1000 (Pentium ISA) has a bug in the LOCK prefix
> that cannot be fixed with updated microcode and cannot be worked
> around by trapping in the kernel. The only workable solution is to
> simply not use the LOCK prefix on the Intel X1000.

Is Intel planning to release new chips without this bug?


> The most obvious way to fix this bug is to strip the LOCK prefix from
> glibc on X1000. I have prototyped the following on wheezy.

Are you sure this prefix is only present in the libc, and not in other
binaries?

> 1. Kernel being aware it is running on a platform with the bug and
> indicating this in proc cpuinfo.
> 2. The Kernel hinting to the runtime linker that the Microprocessor has
> the "nolock" bug through a hwcap, in much the same way as the Xen
> "nosegneg" hinting works.
> 3. The runtime linker then loads a specific version of libc with the
> LOCK prefix stripped, using the omit-lock-prefix assembly option.
> (appending -Wa,momit-lock-prefix=yes to extra_cflags).
> 
> The trouble with this fix is that we assume that i386 Libc is the lowest
> common denominator - it should always work, but on X1000 it is broken by
> virtue of containing the LOCK prefix.
> 
> This is where the X1000 differs from Xen: on Xen you can use the default
> i386 Libc at a performance penalty. Xen just works faster if you have
> the "nosegneg" libc version installed.
> 
> If there is always a "nolock" version of the Glibc installed there is no
> issue. The alternative strategy is to forget about a version of libc
> specifically to tackle the X1000 bug and strip the default i386 libc
> version of the lock prefix - I considered this problematic.

I guess that's a wrong assumption. The hardware capabilities mechanism is
sometimes disabled (for example during an upgrade, to avoid mixing two
incompatible versions), and in that case the default libc is used.
This would break on the X1000. Note also that it would not fix the issue
of static binaries.
 
> In theory this might work as all 32bit non-pae Kernels are non-SMP,
> however my concern is this would break the i386 Libc usefulness as
> failover on 32bit 686-pae.
> 
> Hoping the maintainer could comment on this - 
> 
> 1. Are we safe stripping the i386 version of libc of the LOCK prefix, or
> am I correct that this would break 686-pae in failover situations (i.e.
> situations where the cmov optimized version is missing for some reason).

Not everybody has the optimized version installed at all times, and
as explained above the default libc might be used at some point. This
will therefore break existing systems.

> or
> 
> 2. Should we do as I describe above, creating a version of glibc
> specifically with the LOCK prefix removed. 

I would rather not add a new glibc build pass just for the X1000 and as
said above I don't think it would work.

> Trying to balance complexity (a new glibc target) versus not breaking
> any existing platforms. 

If the problem is really limited to the libc, one solution would be to
use STT_GNU_IFUNC to provide different versions of the affected
functions. I guess that should then be implemented directly upstream.
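
To make the idea concrete, here is a rough sketch of the selection logic
such a resolver would perform. It is plain, portable C rather than a real
IFUNC resolver, and everything in it is hypothetical: the function names,
the "increment" routine standing in for an affected libc function, and the
assumption that the X1000 identifies itself as GenuineIntel, family 5,
model 9.

```c
#include <string.h>

/* Signature shared by both variants of the (hypothetical) affected routine. */
typedef void (*increment_fn)(int *);

/* Default variant: with GCC/Clang on x86 this builtin compiles to a
 * LOCK-prefixed read-modify-write instruction. */
void increment_locked(int *counter)
{
    __atomic_fetch_add(counter, 1, __ATOMIC_SEQ_CST);
}

/* X1000 variant: no LOCK prefix. This is only a stand-in; plain C does not
 * guarantee a single instruction, so a real version would be written in
 * assembly and would rely on the CPU being single-core. */
void increment_unlocked(int *counter)
{
    *counter += 1;
}

/* Assumption: the X1000 reports vendor "GenuineIntel", family 5, model 9. */
int cpu_is_quark_x1000(const char *vendor, unsigned family, unsigned model)
{
    return strcmp(vendor, "GenuineIntel") == 0 && family == 5 && model == 9;
}

/* The decision an IFUNC resolver would make once, at symbol resolution time.
 * The CPU identification is passed in here so the logic can be exercised on
 * any machine; a real resolver would query CPUID itself. */
increment_fn select_increment(const char *vendor, unsigned family,
                              unsigned model)
{
    return cpu_is_quark_x1000(vendor, family, model) ? increment_unlocked
                                                     : increment_locked;
}
```

With a GNU toolchain the selector would instead be a zero-argument resolver
attached to the public symbol with __attribute__((ifunc("..."))), so that
callers bind directly to the chosen variant. Unlike the hwcap approach, this
does not depend on which libc directory the runtime linker picks.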

Alternatively, as it triggers a segmentation fault, this could probably
be trapped and emulated in the kernel, just like it's done for the FPU
on some architectures.

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

