
Bug#800574: Final analysis for Broadwell



On Sun, 18 Oct 2015, Aurelien Jarno wrote:
> > Broadwell-H with a very recent microcode update (rev 0x12, from
> > 2015-06-04) was confirmed to have broken TSX-NI (RTM) and to _leave it
> > enabled_ in CPUID, causing glibc with lock elision enabled to SIGSEGV. 
> > An even more recent Broadwell-H microcode update, rev 0x13 from
> > 2015-08-03, is confirmed to (finally) disable the HLE and RTM CPUID
> > bits.  This should make blacklisting signature 0x40671 uncontroversial.

FWIW, over the last few days it has become clear that, so far, the
mobile Broadwell-H disables Intel TSX, but no desktop Broadwell-H with
RTM disabled has been found yet, not even with the latest microcode.
And both variants use the same microcode.
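
For anyone who wants to check which group a given box falls into, here
is a minimal sketch.  It assumes a Linux system that reports the CPUID
bits as the "hle" and "rtm" flags in /proc/cpuinfo, and it decodes a
CPUID signature such as 0x40671 using the standard family/model/stepping
layout.  The function names are made up for illustration.

```python
def decode_signature(sig):
    """Split a CPUID signature (leaf 1 EAX) into (family, model, stepping)."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF
    # The extended family is only added when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model is only used for base family 0x6 or 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


def tsx_flags(cpuinfo_text):
    """Report whether the first 'flags' line lists hle and rtm."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags = set(value.split())
            return {"hle": "hle" in flags, "rtm": "rtm" in flags}
    # No flags line found: treat both features as not reported.
    return {"hle": False, "rtm": False}
```

Feed tsx_flags() the contents of /proc/cpuinfo.  Note that a set flag
only means CPUID reports the feature, not that the feature works.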

It has also become clear a few days ago that it is very likely that the
BIOS can disable Intel TSX-NI (RTM) and HLE, and it doesn't need very
recent microcode to do that either.  If this is true, it should be
something like MSR 0x13c (bit 1 of that MSR disables AES-NI when set,
and bit 0 locks that MSR against writing when set).  Maybe the Intel
TSX-NI (HLE and RTM) disable switches are even on this very same MSR...
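
To look at that MSR on a test box, something like the sketch below
should do.  It assumes Linux with the "msr" kernel module loaded and
root access, and it decodes only the two bits described above; any TSX
disable bits there are pure speculation and are not decoded.  The
constant and function names are made up for illustration.

```python
import os
import struct

MSR_0X13C = 0x13C  # the MSR discussed above; constant name is made up


def read_msr(cpu, msr):
    """Read a 64-bit MSR through /dev/cpu/<cpu>/msr (root, msr module)."""
    fd = os.open("/dev/cpu/%d/msr" % cpu, os.O_RDONLY)
    try:
        # The MSR number is used as the file offset; a read returns 8 bytes.
        data = os.pread(fd, 8, msr)
    finally:
        os.close(fd)
    return struct.unpack("<Q", data)[0]


def decode_0x13c(value):
    """Decode the two known bits of MSR 0x13c."""
    return {
        "locked": bool(value & 0x1),          # bit 0: MSR is write-locked
        "aesni_disabled": bool(value & 0x2),  # bit 1: AES-NI disabled
    }
```

Usage would be decode_0x13c(read_msr(0, MSR_0X13C)), run as root.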

I've also since become aware of Debian bug #750792, and it describes the
same SIGSEGV observed by Broadwell and Skylake Arch Linux users on
lock-elision-enabled glibc.  From that bug report, it is clear that the
SIGSEGVs in __lll_unlock_elision can easily happen due to software bugs,
so they need not be linked to any Intel TSX processor errata.  And this
kind of defect is quite common, apparently.

However, since Intel's current public specification update states (as
errata) that Intel TSX-NI is not supposed to be usable in the Broadwell
and Broadwell-H cores, that it should not even be reported in CPUID by
these processors (but it is :p), and that this is not supposed to be
fixable or worked around, I still think we need to blacklist it.

I will keep tracking this issue, and report back any relevant
information that becomes available.  It would be _really_ nice if the
Intel team that works with Canonical were to shed some light on this,
though.

> Thanks for the patch, I have committed it to the jessie and the 2.21
> branches.

Thank you.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

