
Re: hdparm -t yields incorrect timings when Intel hyperthreading is enabled



On Mon, 05 May 2014, Paul Ausbeck wrote:
> I've attached the contents of /proc/cpuinfo below, two copies, one
> with hyperthreading disabled and one enabled.

As I told you, the *very first thing* you must do is to make sure you're
using the latest firmware for your motherboard (*especially* the BIOS/EFI).
If you're not, update it.  This bug reeks of a firmware issue.

cpuinfo looks normal for both cases, and the microcode is newer than
anything Intel ever published to the general public.

> I've also investigated things a bit further and now I'm thinking
> that the hyperthreading state affects the system as a whole, not
> just hdparm.

That's expected.

> First, I've attached hdparm output from the same machine booting to
> Windows 7. The reported disk speed is not affected by the
> hyperthreading state.  I've also attached boot speed measurements
> for the two states. Windows 7 boot time with hyperthreading enabled
> is 2/3 that when disabled. This would be expected if hyperthreading
> is actually worth anything.
> 
> Second, it turns out that the boot speed of linux is either
> unaffected by the state of hyperthreading, 3.2 kernel, or adversely
> affected by enabling hyperthreading, 3.12 kernel. I've attached

I believe you will need to take this to LKML, unfortunately.  One thing
that will help track down the issue is to test several kernel versions, to
pinpoint more precisely when things went bad.

LKML: linux-kernel mailing list.

> I'm thinking that the hdparm scenario is a good canary for a more
> fundamental problem with hyperthreading, at least on my dn2800mt
> machine. Perhaps the backports 3.12 kernel hasn't been fully vetted

Yes.  It makes it trivial to "reproduce the bug", so it would help track
the issue down immensely.

But you'll still need to do it with help from the LKML people, unless you
can handle the git bisecting yourself.


About git bisect (guide):
https://www.kernel.org/pub/software/scm/git/docs/git-bisect-lk2009.html

You can do this:
git bisect start

git bisect good <tag for good kernel version>
git bisect bad <tag for bad kernel version>

Repeat git bisect good/bad as required to enter all the data points you
already have or have manually tested.

You can move to any kernel version you want with "git reset --hard <tag>",
compile, test, and then mark it with "git bisect good" or "git bisect
bad".  git bisect will offer you a new test point when you do that.

Hint: when bisecting, for safety, you should first test released/stable
kernels (e.g. v3.12.8, v3.11.5, etc.) and mark them as "good" or "bad".
As described above, use "git reset --hard" to move to a different kernel
version, recompile, boot, test, "git bisect good"/"git bisect bad", rinse
and repeat.  Try to use a binary search pattern, to reduce the number of
kernels you will have to test.
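
If it helps, here is a rough sketch of what "binary search pattern" means
in practice: always test the release in the middle of the range that is
still unknown.  The next_candidate helper is hypothetical, and the version
list only assumes that 3.2 turns out good and 3.12 bad; adjust it to
whatever you have actually tested.

```shell
#!/bin/sh
# next_candidate: print the middle entry of a list of kernel versions.
# Arguments: the untested versions, ordered oldest to newest, that lie
# between the newest known-good and the oldest known-bad kernel.
next_candidate() {
    [ "$#" -gt 0 ] || return 1      # nothing left to test
    mid=$(( ($# + 1) / 2 ))         # 1-based position of the middle entry
    shift $(( mid - 1 ))            # drop everything before it
    printf '%s\n' "$1"
}

# Assumed starting point: v3.2 good, v3.12 bad, everything between untested.
next_candidate v3.3 v3.4 v3.5 v3.6 v3.7 v3.8 v3.9 v3.10 v3.11
```

If the candidate it prints tests good, drop it and everything older from
the list; if it tests bad, drop it and everything newer.  Nine releases
are narrowed down to one in about three or four boots this way.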

Only after you have gotten reasonably close to the issue using the above
should you let "git bisect" choose the test point, because it will usually
land you somewhere deep in the release-candidate kernels (or even worse,
inside the merge window), and those can be quite broken.

Therefore, also for safety, when testing these kernels boot to single-user
mode, run the hdparm test, note down what happened on paper somewhere, and
reboot to a known release/stable kernel.  Only do any real work (such as the
git bisect stuff, compiling, etc.) on a safe, known release/stable kernel.
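
A small sketch for turning the hdparm result into a good/bad verdict, so
there is less to judge by eye on each boot.  It assumes the usual
"... = NNN.NN MB/sec" result line from hdparm -t; the helper names, the
sample figures and the 100 MB/sec threshold are all made up, so pick a
threshold that clearly separates your own good and bad runs.

```shell
#!/bin/sh
# hdparm_rate: read `hdparm -t` output on stdin and print the MB/sec figure.
# Assumes the result line ends in "= <number> MB/sec".
hdparm_rate() {
    sed -n 's/.*= *\([0-9][0-9.]*\) MB\/sec.*/\1/p'
}

# verdict RATE THRESHOLD: print "good" if RATE >= THRESHOLD, else "bad".
# awk does the comparison because the rate is not an integer.
verdict() {
    awk -v r="$1" -v t="$2" \
        'BEGIN { if (r + 0 >= t + 0) print "good"; else print "bad" }'
}

# Sample line in the hdparm -t layout (figures invented for illustration).
sample=' Timing buffered disk reads: 340 MB in  3.01 seconds = 112.97 MB/sec'
rate=$(printf '%s\n' "$sample" | hdparm_rate)

# One record per boot: running kernel, measured rate, verdict.
printf '%s %s MB/sec %s\n' "$(uname -r)" "$rate" "$(verdict "$rate" 100)"
```

On the real machine, feed it the output of hdparm -t on your disk instead
of the sample line, and copy the printed record to your notes before
rebooting.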

Obviously, test single-user mode in your known release/stable kernel first,
just to make sure the bug doesn't disappear (or always appear) in
single-user mode :)

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

