
Bug#670398: linux-image-2.6.32-5-amd64: SSH logins hang while hpet interrupts multiply on Intel Nehalem CPUs



On Wed, 2012-04-25 at 10:36 +0200, Sven Hoexter wrote:
> Package: linux-image-2.6.32-5-amd64
> Version: 2.6.32-41squeeze2
> Severity: important
> 
> Hi,
> 
> since about December 2011 we've seen systems where SSH sessions suddenly hang and
> further logins on the physical TTY or via SSH are no longer possible. In some cases
> ssh logins still work and you see motd and maybe can even issue one or two commands.
> (I've brought this issue up on debian-user in March with a private reply from a
> fellow DD yesterday http://lists.debian.org/debian-user/2012/03/msg01204.html)
> 
> 
> Over time we observed that ssh logins without PTS (ssh -T) still work. Looking at
> other sessions, sshd was in state D and entries in /dev/pts/ were created correctly.
> Searching through munin graphs we could narrow down the starting point of this issue
> to the point when the hpet interrupts for one CPU core multiplied. Sometimes they
> multiplied by six. Looking further we've found the Kernel [events/$x] in state D
> where $x is the number of the CPU core which has the high number of hpet interrupts.
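
The per-core hpet counts described here can be read straight from
/proc/interrupts. The following is only a rough sketch: the function
name is made up, and it assumes that the per-CPU HPET timers show up as
rows whose last column starts with "hpet" (e.g. hpet2, hpet3), which may
differ between kernel versions. Since the report is about a change in
rate, two samples taken some time apart would need to be compared.

```python
def hpet_counts_per_cpu(interrupts_text):
    """Sum the interrupt counts of all hpet* rows, per CPU column.

    Assumption: rows for HPET timers end in a name starting with "hpet".
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()          # header row: CPU0 CPU1 ...
    totals = {cpu: 0 for cpu in cpus}
    for line in lines[1:]:
        fields = line.split()
        # Need an IRQ label, one count per CPU, and at least a name.
        if len(fields) < len(cpus) + 2:
            continue
        if not fields[-1].startswith("hpet"):
            continue
        for cpu, count in zip(cpus, fields[1:1 + len(cpus)]):
            totals[cpu] += int(count)
    return totals


if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as f:
            for cpu, count in hpet_counts_per_cpu(f.read()).items():
                print(cpu, count)
    except OSError as exc:
        print("cannot read /proc/interrupts:", exc)
```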
>
> When we started strace -f on the sshd master process everything works until you log out.
> Then you'll again see the forked sshd process hanging in state D.

This is strange, because D state means uninterruptible sleep (not
handling signals).  But perhaps the sshd process was repeatedly changing
between uninterruptible and interruptible state.
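
One way to test that guess would be to sample the task state repeatedly
rather than looking at it once. The sketch below is illustrative only
(the helper names are hypothetical); it relies on the documented layout
of /proc/<pid>/stat, where the state letter is the first field after the
parenthesised command name.

```python
import os
import time


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    lpar = stat_text.index("(")
    rpar = stat_text.rindex(")")     # comm itself may contain ')' or spaces
    pid = int(stat_text[:lpar])
    comm = stat_text[lpar + 1:rpar]
    state = stat_text[rpar + 1:].split()[0]
    return pid, comm, state


def sample_states(pid, samples=20, interval=0.05):
    """Read the state of one task several times; returns a list of letters."""
    seen = []
    for _ in range(samples):
        with open("/proc/%d/stat" % pid) as f:
            seen.append(parse_stat(f.read())[2])
        time.sleep(interval)
    return seen


if __name__ == "__main__":
    # List every task currently in uninterruptible sleep (state D).
    for entry in os.listdir("/proc") if os.path.isdir("/proc") else []:
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/stat" % entry) as f:
                pid, comm, state = parse_stat(f.read())
        except OSError:
            continue                 # task exited while we were looking
        if state == "D":
            print(pid, comm)
```

If the list of samples for the hung sshd contains a mix of D and S, the
process is not permanently stuck in uninterruptible sleep.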

> Up to that point we've seen this issue exclusively on Linux 2.6.32 based systems,
> most often on Debian/Squeeze and less often on Ubuntu 10.04 and once or twice on
> a RHEL 6.1 system.
> 
> Searching further I've seen posts on a Dell PowerEdge mailing list referencing
> RedHat BZ#750201 and Intel CPU erratum AAO67 for Nehalem (rapid C state switching).
> The RedHat bug is currently non-public but through our technical contact at RedHat I was
> able to receive a summary of this bug and other referenced bugs which describe more or
> less exactly our issue.
> 
> According to RedHat that should be fixed in their Kernel 2.6.32-220.7.1.el6
> citing the following in the changelog:
> - [x86] hpet: Disable per-cpu hpet timer if ARAT is supported (Prarit Bhargava) [772884 750201]

This is also in Linux 2.6.32.30 and therefore in Debian's version
2.6.32-31.
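
Whether that change applies to a given machine depends on the CPU
advertising ARAT, which appears as the "arat" flag in /proc/cpuinfo. A
minimal sketch for checking this (the function name is hypothetical, and
the parsing assumes the usual "processor" / "flags" layout of the x86
cpuinfo output):

```python
def cpus_with_flag(cpuinfo_text, flag="arat"):
    """Map each logical CPU number to True/False for the given flag."""
    result = {}
    cpu = None
    for line in cpuinfo_text.splitlines():
        key, _sep, value = line.partition(":")
        key = key.strip()
        if key == "processor":
            cpu = int(value)
        elif key == "flags" and cpu is not None:
            result[cpu] = flag in value.split()
    return result


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(cpus_with_flag(f.read()))
    except OSError as exc:
        print("cannot read /proc/cpuinfo:", exc)
```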

> - [x86] Improve TSC calibration using a delayed workqueue (Prarit Bhargava) [772884 750201]
> - [kernel] clocksource: Add clocksource_register_hz/khz interface (Prarit Bhargava) [772884 750201]
> - [kernel] clocksource: Provide a generic mult/shift factor calculation (Prarit Bhargava) [772884 750201]

These are not in Linux 2.6.32-longterm or Debian stable.

> (Maybe that helps to track down the relevant changes.)

Based on those summary lines, I think the upstream changes are:

commit 08ec0c58fb8a05d3191d5cb6f5d6f81adb419798 (v2.6.38-rc1~480^2~1)
Author: John Stultz <johnstul@us.ibm.com>
Date:   Tue Jul 27 17:00:00 2010 -0700

    x86: Improve TSC calibration using a delayed workqueue

commit f12a15be63d1de9a35971f35f06b73088fa25c3a (v2.6.36-rc1~514^2~4)
Author: John Stultz <johnstul@us.ibm.com>
Date:   Tue Jul 13 17:56:27 2010 -0700

    x86: Convert common clocksources to use clocksource_register_hz/khz

commit 7d2f944a2b836c69a9d260a0a5f0d1720d57fdff (v2.6.33-rc1~363^2~12)
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed Nov 11 14:05:29 2009 +0000

    clocksource: Provide a generic mult/shift factor calculation

The latter two are dependencies for the first, which is presumably the
really important change.  At a guess, better TSC calibration helps us to
avoid switching to the HPET.  John, do you know whether your changes
have that effect?
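
On an affected system, the clocksource currently in use can be read from
sysfs, which would show whether the kernel has in fact switched away
from the TSC. A small sketch, with a hypothetical helper name, reading
the standard clocksource0 files:

```python
import os

CLOCKSOURCE_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current clocksource, list of available clocksources)."""
    def read(name):
        with open(os.path.join(base, name)) as f:
            return f.read().strip()
    return read("current_clocksource"), read("available_clocksource").split()


if __name__ == "__main__":
    try:
        current, available = read_clocksource()
    except OSError as exc:
        print("cannot read clocksource state:", exc)
    else:
        print("current:", current)
        print("available:", ", ".join(available))
```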

> As a workaround it could work to disable C-states in the BIOS or on the Kernel commandline
> with intel_idle.max_cstate=0 processor.max_cstate=1.
> Since we run into that issue only from time to time on the same system we could not yet
> verify either workaround. Rumours indicate that sometimes disabling it in the BIOS did
> not help because the Kernel enabled C-states again.
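
To confirm that the proposed options were at least passed to the running
kernel, /proc/cmdline can be checked. This is only a sketch with
hypothetical names; it shows what was requested at boot, not whether the
kernel actually honoured it.

```python
# Values taken from the workaround proposed in the report.
WANTED = {"intel_idle.max_cstate": "0", "processor.max_cstate": "1"}


def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value-or-None} dict."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def missing_workaround_params(cmdline):
    """Return the wanted parameters that are absent or set differently."""
    params = parse_cmdline(cmdline)
    return sorted(k for k, v in WANTED.items() if params.get(k) != v)


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print("missing:", missing_workaround_params(f.read()))
    except OSError as exc:
        print("cannot read /proc/cmdline:", exc)
```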
> 
> My current guess is that it's somehow related to the Intel Nehalem CPU bug and only happens
> if you have a high single-threaded load which leads to one or more cores being switched into
> a C6 sleep state so that the CPU can overclock one core. The marketing name is TurboBoost.
> 
> Regarding the CPUs I know this happens with:
> - Intel X3430
> - Intel X3450
> - Intel L3426
> 
> We see it in almost all cases on Dell R210 with the X3430 CPUs.
> Rumours claim it also happens with other Dell models based on other CPUs from the
> Intel Nehalem series with TurboBoost. 
> 
> 
> Would be great if someone could track down the needed changes and incorporate those
> into a point release. In general I would be available for testing but we still have
> no way to reproduce it besides waiting a few months. :(

Waiting a few months for a recurrence really isn't a viable means of
testing a bug fix.

So we're going to need some sort of explanation of how these changes
fix an important bug (whether or not it's actually the bug you've run
into).

Ben.

-- 
Ben Hutchings
For every action, there is an equal and opposite criticism. - Harrison
