
Bug#517449: linux-image-2.6.26-2-vserver-amd64: still getting "task <task:pid> blocked for more than 120 seconds." kernel log messages



On Fri, 2010-03-26 at 08:35 +0100, Timo Veith wrote:
> Hi all,
> 
> sorry if my message from yesterday was noise too, but I haven't
> understood yet why those kernel log messages appear anyway. Somebody in
> this bug report pointed out that those "INFO: task xy blocked for more
> than 120 seconds." are only symptoms of the real problem. Could somebody
> explain the actual problem to me please?

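The problem is that a task has been in uninterruptible sleep (state 'D',
often reported as IO-wait) for a long time, which probably means it is
deadlocked due to a lock imbalance or incorrect lock ordering.  The
backtrace logged with the warning should point to which lock is
involved, but that does not tell us where the buggy code is, since the
bug is usually in some other code path that took or failed to release
that lock.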
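To see which processes are in that state right now, something like the
following rough sketch may help.  It reads /proc/<pid>/stat and reports
processes whose state field is 'D'.  The function names (parse_stat,
blocked_tasks) are made up for illustration, and it only looks at
processes, not at their individual threads.  A process that shows up
here briefly is normal; it only matters if it stays there.

```python
import os

def parse_stat(text):
    """Split the start of a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(proc="/proc"):
    """Return (pid, comm) for every process currently in state D."""
    try:
        entries = os.listdir(proc)
    except OSError:
        return []  # no procfs at this path
    found = []
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited (or had an odd name) while scanning
        if state == "D":
            found.append((pid, comm))
    return found
```

Calling blocked_tasks() a few times, some seconds apart, and comparing
the results gives a crude view of which processes are stuck rather than
just passing through.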

> What I have learned so far from this and from other similar bug
> reports is that it has something to do with the load on the machine.
> But that is not clear enough to me.

As the load increases there is likely to be greater lock contention and
more chance of a bug in lock ordering actually causing deadlock.
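As a userspace analogy only (this is not kernel code, and Account,
transfer and run_demo are hypothetical names), here is a minimal Python
sketch of what a lock ordering bug looks like and the usual way to avoid
it.  If transfer() simply took src.lock and then dst.lock, two threads
moving money in opposite directions could each hold one lock and wait
forever for the other.  The more often the two paths run concurrently,
the more likely that interleaving becomes, which is why load matters.

```python
import threading

class Account:
    def __init__(self, ident, balance):
        self.ident = ident
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Taking src.lock then dst.lock would be the buggy (ABBA) pattern.
    # Instead, always take the two locks in one global order (by ident),
    # regardless of the direction of the transfer.
    # Assumes src and dst are different accounts.
    first, second = sorted((src, dst), key=lambda acct: acct.ident)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount

def worker(src, dst, rounds):
    for _ in range(rounds):
        transfer(src, dst, 1)

def run_demo(rounds=10000):
    a, b = Account(1, 1000), Account(2, 1000)
    # Two threads transfer in opposite directions at the same time.
    t1 = threading.Thread(target=worker, args=(a, b, rounds))
    t2 = threading.Thread(target=worker, args=(b, a, rounds))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return a.balance, b.balance
```

With the consistent ordering, run_demo() completes however the threads
interleave.  With the naive ordering it might run fine many times and
then hang, which matches how these reports tend to show up only under
load.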

(There were also some bugs in tickless (aka 'NOHZ') scheduling which
could lead to this warning, but they have been fixed.  Those bugs
mostly affected very idle machines.)

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
