
Bug#670398: linux-image-2.6.32-5-amd64: SSH logins hang while hpet interrupts multiply on Intel Nehalem CPUs



On 04/25/2012 08:49 PM, Ben Hutchings wrote:
On Wed, 2012-04-25 at 10:36 +0200, Sven Hoexter wrote:
Package: linux-image-2.6.32-5-amd64
Version: 2.6.32-41squeeze2
Severity: important

Hi,

since about December 2011 we've seen systems where SSH sessions suddenly hang and
further logins on the physical TTY or via SSH are no longer possible. In some cases
ssh logins still work and you see the motd and maybe can even issue one or two commands.
(I brought this issue up on debian-user in March and got a private reply from a
fellow DD yesterday: http://lists.debian.org/debian-user/2012/03/msg01204.html)


Over time we observed that ssh logins without a PTS (ssh -T) still work. Looking at
other sessions, sshd was in state D and entries in /dev/pts/ were created correctly.
Searching through munin graphs we could narrow down the starting point of this issue
to the point when the hpet interrupts for one CPU core multiplied. Sometimes they
multiplied by six. Looking further we found the kernel thread [events/$x] in state D,
where $x is the number of the CPU core which has the high number of hpet interrupts.
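
The per-core hpet interrupt counts that munin graphed can also be read straight from /proc/interrupts. A minimal Python sketch, assuming the usual layout (a header row of CPU names, then one row per IRQ with one count per CPU); the function names are made up for illustration and are not part of any tool mentioned in this report:

```python
def hpet_counts(text):
    """Sum interrupt counts per CPU over all rows that mention 'hpet'.

    `text` is the content of /proc/interrupts. The first line is assumed
    to be the header listing the CPU columns (CPU0, CPU1, ...).
    """
    lines = text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()
    totals = {cpu: 0 for cpu in cpus}
    for line in lines[1:]:
        if "hpet" not in line.lower():
            continue
        fields = line.split()
        # fields[0] is the IRQ label (e.g. "72:"), then one count per CPU.
        counts = fields[1:1 + len(cpus)]
        for cpu, count in zip(cpus, counts):
            if count.isdigit():
                totals[cpu] += int(count)
    return totals


def read_hpet_counts(path="/proc/interrupts"):
    """Convenience wrapper that reads the live file (Linux only)."""
    with open(path) as f:
        return hpet_counts(f.read())
```

Calling read_hpet_counts() twice a few seconds apart and subtracting the results gives a per-core rate, which should make a core with multiplied hpet interrupts stand out.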

When we started strace -f on the sshd master process, everything works until you log out.
Then you'll again see the forked sshd process hanging in state D.
This is strange, because D state means uninterruptible sleep (not
handling signals).  But perhaps the sshd process was repeatedly changing
between uninterruptible and interruptible state.
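
To see which tasks are in state D at a given moment, the state field of /proc/<pid>/stat can be read directly. A rough Python sketch with hypothetical helper names; the command name sits in parentheses and may itself contain spaces or parentheses, so the state is taken as the first field after the last closing parenthesis:

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen])
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state


def tasks_in_state(state="D", proc="/proc"):
    """List (pid, comm) for every task currently in the given state."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, st = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited or line was unreadable; skip it
        if st == state:
            found.append((pid, comm))
    return found
```

Running tasks_in_state() in a loop would show whether sshd stays in D or only passes through it briefly, which is the question raised above.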

[snip]
Based on those summary lines, I think the upstream changes are:

commit 08ec0c58fb8a05d3191d5cb6f5d6f81adb419798 (v2.6.38-rc1~480^2~1)
Author: John Stultz <johnstul@us.ibm.com>
Date:   Tue Jul 27 17:00:00 2010 -0700

     x86: Improve TSC calibration using a delayed workqueue

commit f12a15be63d1de9a35971f35f06b73088fa25c3a (v2.6.36-rc1~514^2~4)
Author: John Stultz <johnstul@us.ibm.com>
Date:   Tue Jul 13 17:56:27 2010 -0700

     x86: Convert common clocksources to use clocksource_register_hz/khz

commit 7d2f944a2b836c69a9d260a0a5f0d1720d57fdff (v2.6.33-rc1~363^2~12)
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed Nov 11 14:05:29 2009 +0000

     clocksource: Provide a generic mult/shift factor calculation

The latter two are dependencies for the first, which is presumably the
really important change.  At a guess, better TSC calibration helps us to
avoid switching to the HPET.  John, do you know whether your changes
have that effect?
No, the TSC calibration just makes sure we get the same fine-grained cpukhz value on every bootup, which avoids time skew. It should not affect whether we switch to the HPET, which would be a sign of an unsynced or halting TSC.

When you can connect to the system that is having problems, do you see any problems with the time? ie: does date show the correct time, and does it increment normally?
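
One rough way to check whether the time increments normally is to sample the wall clock next to the monotonic clock and look for steps. A Python sketch with hypothetical function names; note the assumption that both clocks are driven by the same kernel clocksource, so this only catches jumps in the wall clock, not a clocksource that runs uniformly fast or slow (that needs an external reference):

```python
import time


def step_errors(samples):
    """For consecutive (wall, monotonic) samples, return how much more the
    wall clock advanced than the monotonic clock in each step."""
    errors = []
    for (w0, m0), (w1, m1) in zip(samples, samples[1:]):
        errors.append((w1 - w0) - (m1 - m0))
    return errors


def sample_clocks(count=5, interval=1.0):
    """Collect `count` (wall, monotonic) pairs, `interval` seconds apart."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), time.monotonic()))
        time.sleep(interval)
    return samples
```

On a healthy system step_errors(sample_clocks()) should stay very close to zero; a large value means the wall clock was stepped between two samples.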

It sounds like, if there is some HPET irq issue, it would likely be due to some sort of global wakeup to handle local APICs that halt in deep sleep modes. It's likely that getting /proc/timer_list output would help (both before and after the problem).
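
Capturing /proc/timer_list before and after the problem and comparing the two can be scripted. A small Python sketch using the standard difflib module; the helper names are illustrative, and reading the file may need root depending on the kernel:

```python
import difflib


def diff_snapshots(before, after):
    """Return a unified diff (list of lines) between two text snapshots."""
    return list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="before", tofile="after", lineterm=""))


def read_timer_list(path="/proc/timer_list"):
    """Read the current timer list (Linux only)."""
    with open(path) as f:
        return f.read()
```

The idea would be to save read_timer_list() to a file while the system is healthy, take a second snapshot once logins hang, and pass both to diff_snapshots(). The output changes on every read, so expect a noisy diff.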

Also I'd try to bring in tglx for his thoughts.

thanks
-john



