Bug#762195: libc6: libpthread: hardware-assisted lock elision hazardous on x86



On Fri, Sep 19, 2014 at 10:09:24AM -0300, Henrique de Moraes Holschuh wrote:
> Package: libc6
> Version: 2.19-0experimental0
> Severity: grave
> Justification: causes non-serious data loss
> 
> libpthread-2.19 has HLE (hardware-assisted lock elision) support.
> Unfortunately, on Intel-based x86 processors, the use of HLE is currently
> hazardous.
> 
> Summary:  Use of HLE on all current Intel Haswell processors (the only x86
> processors with HLE support so far) can cause unpredictable system
> behaviour, including the possibility of hangs and memory corruption.
> Updating the microcode on these Intel Haswell processors when Intel TSX is
> in use by libpthread will cause running processes linked to libpthread to be
> killed with SIGILL.
> 
> This issue is, AFAIK, impossible to work around in the kernel.  Since glibc
> uses the cpuid instruction directly, the kernel cannot prevent libpthread
> from attempting to use Intel TSX.
> 
> The intel-microcode package in non-free will work around the microcode
> update issue by enforcing that all microcode updates be done in the
> initramfs (i.e. require a reboot to apply, and require initramfs).
> 
> Unfortunately, this is not going to be enough as most users don't have
> intel-microcode installed in their Intel-based systems, and therefore would
> still be at risk of data loss or data corruption due to erratum HSD136.
> 
> Please disable hardware-assisted lock elision (HLE) on X86/X86-64 Intel
> Haswell Processors in libpthread.
> 
> 
> Details:
> 
> 
> On unpatched Intel processors, HLE will hit erratum HSD136:
> 
> HSD136.  Software Using Intel® TSX May Result in Unpredictable System
>          Behavior
> 
> Problem: Under a complex set of internal timing conditions and system
> 	 events, software using the Intel TSX (Transactional Synchronization
> 	 Extensions) instructions may result in unpredictable system
> 	 behavior.
> 
> (Erratum description from: "Desktop 4th Generation Intel Core Processor Family
> Specification Update", June 2013, #328899-001).
> 
> This erratum is serious enough for Intel to take the PR hit and withdraw the
> feature on all Haswell cores, including the just-launched Haswell-EP E5v3
> Xeons.  (ref:
> http://www.anandtech.com/show/8376/intel-disables-tsx-instructions-erratum-found-in-haswell-haswelleep-broadwelly
> ).
> 
> On patched Intel processors, Intel TSX will be disabled by the microcode.
> When disabled, any Intel TSX instructions will generate an illegal opcode
> trap.  Intel TSX support supposedly can be re-enabled *during system boot*
> by the UEFI firmware through an undisclosed method.
> 
> Unfortunately, the act of updating the microcode will immediately disable
> Intel TSX, causing all running processes linked to libpthread-2.19 to trap
> and crash with SIGILL:
> 
> [ 43.606830] microcode: CPU0 sig=0x306c3, pf=0x2, revision=0x1a
> [ 43.608466] microcode: CPU0 updated to revision 0x1c, date = 2014-07-03
> [ 43.608494] microcode: CPU1 sig=0x306c3, pf=0x2, revision=0x1a
> [ 43.609327] microcode: CPU1 updated to revision 0x1c, date = 2014-07-03
> [ 43.609352] do_trap: 267 callbacks suppressed
> [ 43.609354] traps: rs:main Q:Reg[1343] trap invalid opcode ip:7f32abd0b7ab
> sp:7f32a9062848 error:0
> [ 43.609355] microcode: CPU2 sig=0x306c3, pf=0x2, revision=0x1a
> [ 43.609358] in libpthread-2.19.so[7f32abcfa000+18000]
> [ 43.610204] microcode: CPU2 updated to revision 0x1c, date = 2014-07-03
> [ 43.610225] microcode: CPU3 sig=0x306c3, pf=0x2, revision=0x1a
> [ 43.611081] microcode: CPU3 updated to revision 0x1c, date = 2014-07-03
> [ 43.611507] traps: systemd[1] trap invalid opcode ip:7f844f84a7ab
> sp:7fff2ccf7e28 error:0 in libpthread-2.19.so[7f844f839000+18000]
> [...]
> 
> Ref: https://bugs.launchpad.net/intel/+bug/1370352
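
To see which processes were hit after a microcode update, the kernel log
quoted above can be scanned for invalid-opcode traps. The following is only a
rough sketch: the function name and the regular expression are my own, written
against the two trap lines shown in this report, and other kernel versions may
format these messages differently.

```python
import re

# Written against lines of this shape (taken from the log in this report):
#   traps: systemd[1] trap invalid opcode ip:7f844f84a7ab
# The process name may itself contain spaces or colons ("rs:main Q:Reg").
TRAP_RE = re.compile(
    r"traps: (?P<comm>.+?)\[(?P<pid>\d+)\] trap invalid opcode ip:(?P<ip>[0-9a-f]+)"
)


def invalid_opcode_traps(log_text):
    """Return a list of (process name, pid, instruction pointer) tuples,
    one for each invalid-opcode trap found in the given kernel log text."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("ip"), 16))
        for m in TRAP_RE.finditer(log_text)
    ]
```

Feeding it the output of dmesg would list the affected processes; whether the
trapping instruction lies inside libpthread still has to be checked against
the "in libpthread-2.19.so[...]" part of the message.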

It looks like Intel did crap there, and that the GNU libc has to handle
this crap. The microcode update could have stopped advertising the
instructions while still supporting them...
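
For anyone who wants to check whether a machine currently advertises the
instructions, here is a small sketch that looks for the "hle" and "rtm" flags
in /proc/cpuinfo. The function name is made up for illustration. Note the
assumption: this shows the kernel's view of the CPU features, which may not
match what the cpuid instruction returns to glibc right after a runtime
microcode update.

```python
def advertised_tsx_features(cpuinfo_text):
    """Return the subset of {"hle", "rtm"} found in the first "flags" line
    of the given /proc/cpuinfo contents (empty set if there is none)."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return {"hle", "rtm"} & set(value.split())
    return set()


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(sorted(advertised_tsx_features(f.read())))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

An empty result on a Haswell system would suggest that TSX is no longer
advertised, but it says nothing about processes that started before the
update.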

> It is unknown at this time what will happen on future microcode updates.  It
> is entirely possible that the act of updating the microcode will always
> reset Intel TSX to its default "disabled" state, regardless of whether the
> BIOS had force-enabled it or not at boot.   This is the reason why I will
> drop support for microcode updates outside of the initramfs in non-free.
> 
> 
> Therefore, due to erratum HSD136 and the lack of widespread use of microcode
> updates, libpthread-2.19 must stop using HLE on the problematic Intel
> processors.

I will try to work on a patch, but this won't be enough: until users
reboot their systems, it's very likely that some processes using the old
libpthread with HLE enabled will remain.
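
As a rough way to find such leftover processes after an upgrade, one can look
for processes that still map a libpthread file that has since been removed
from disk. This is only a sketch with invented function names. It assumes the
upgrade replaces the library file, so that the old mapping shows up with a
"(deleted)" suffix in /proc/<pid>/maps, and it cannot tell whether the mapped
copy actually had HLE enabled.

```python
import os


def maps_stale_libpthread(maps_text):
    """Return True if the given /proc/<pid>/maps contents reference a
    libpthread file that has been deleted from disk."""
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, device, inode, pathname.
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5]
            if "libpthread" in path and path.endswith("(deleted)"):
                return True
    return False


def stale_pids(proc_root="/proc"):
    """Return the sorted pids of processes that still map a deleted libpthread."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "maps")) as f:
                if maps_stale_libpthread(f.read()):
                    pids.append(int(entry))
        except OSError:
            continue  # the process exited, or we may not read its maps
    return pids and sorted(pids)
```

Running stale_pids() as root after the upgrade gives the processes that would
need a restart; without root, maps files of other users' processes are
skipped.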

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

