
Bug#670398: linux-image-2.6.32-5-amd64: SSH logins hang while hpet interrupts multiply on Intel Nehalem CPUs



On Thu, 2012-04-26 at 10:02 +0200, Sven Hoexter wrote:
> On Thu, Apr 26, 2012 at 04:49:56AM +0100, Ben Hutchings wrote:
> > On Wed, 2012-04-25 at 10:36 +0200, Sven Hoexter wrote:
> 
> Hi,
> 
> > > Searching through munin graphs, we narrowed down the start of this issue
> > > to the point when the hpet interrupts for one CPU core multiplied, sometimes
> > > by a factor of six. Looking further, we found the kernel thread [events/$x] in
> > > state D, where $x is the number of the CPU core with the high hpet interrupt count.
> > >
> > > When we ran strace -f on the sshd master process, everything worked until logout.
> > > After that, the forked sshd process was again hanging in state D.
> > 
> > This is strange, because D state means uninterruptible sleep (not
> > handling signals).  But perhaps the sshd process was repeatedly changing
> > between uninterruptible and interruptible state.
> 
> Is it possible to gather such data? I guess grep'ing through ps output
> is not the right tool here.
> 
> From a system currently suffering from this issue:
[...]

You can use 'echo w > /proc/sysrq-trigger' (as root) to write a traceback
for all the tasks in D state to the kernel log, which might provide some
clues.
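
To see which tasks sit in D state over time, without triggering a sysrq
dump on every check, something like the following Python sketch may help.
It assumes the usual Linux /proc/<pid>/stat layout; the helper names
parse_stat and tasks_in_state are made up for illustration. It only
samples the state field at one instant and only looks at top-level
/proc/<pid> entries, so it complements the sysrq traceback rather than
replacing it.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat record into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the first '(' and the last ')' instead of splitting on spaces.
    start = text.index("(")
    end = text.rindex(")")
    pid = int(text[:start])
    comm = text[start + 1:end]
    state = text[end + 1:].split()[0]
    return pid, comm, state


def tasks_in_state(wanted="D", proc="/proc"):
    """Return sorted (pid, comm) pairs for tasks whose state equals `wanted`."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/interrupts
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # the task exited between listdir() and open()
        if state == wanted:
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    # Print every task currently in uninterruptible sleep.
    for pid, comm in tasks_in_state("D"):
        print(pid, comm)
```

Running this in a loop with a short sleep would show whether a task such
as sshd or events/$x stays in D or flips between states.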

Ben.

-- 
Ben Hutchings
For every action, there is an equal and opposite criticism. - Harrison
