
Bug#548397: linux-image-2.6.30-2-686-bigmem: Random Lockup with automount, mysqld, and kswapd0 error messages.



Ben Hutchings wrote:
On Sat, 2009-09-26 at 05:41 -0400, Andres Salomon wrote:
On Fri, 25 Sep 2009 23:33:45 -0500
Paul Logasa Bogen II <plb@tamu.edu> wrote:
After a semi-random period of normal operation (anywhere from a few
hours to a week) the machine will suddenly get a series of page
allocation errors followed by a series of "soft lockup - CPU#[X]
stuck" messages after which the machine is completely non responsive
and has to be hard restarted.

Perhaps there's a resource leak of some type?  Can you try rebuilding
the kernels on these systems with CONFIG_DEBUG_KMEMLEAK?

This describes the process of inspecting for memory leaks:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/kmemleak.txt;h=34f6638aa5aceec30d290812fdc7fcebf3b86621;hb=HEAD

It would be useful to know the state of the system(s) prior to the crash
(perhaps with a "while sleep 600; do cat /sys/kernel/debug/kmemleak >>
log; done" or something, appending so that earlier snapshots are kept?)
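
A slightly more careful version of that loop might look like the sketch
below. It assumes the kernel was built with CONFIG_DEBUG_KMEMLEAK and that
debugfs is mounted at /sys/kernel/debug. The function name
snapshot_kmemleak and the default log name kmemleak.log are made up for
illustration. Each snapshot is appended with a UTC timestamp so the
reports leading up to a lockup can be told apart.

```shell
#!/bin/sh
# Hypothetical helper (name and defaults are illustrative, not from the thread).
# Appends one timestamped copy of the kmemleak report to a log file.
#   $1: report to read  (default: /sys/kernel/debug/kmemleak)
#   $2: log to append to (default: kmemleak.log)
snapshot_kmemleak() {
    src="${1:-/sys/kernel/debug/kmemleak}"
    log="${2:-kmemleak.log}"
    {
        # Header line marks when this snapshot was taken (UTC).
        printf '=== %s ===\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')"
        cat "$src"
    } >> "$log"
}
```

Run as root, something like "while sleep 600; do snapshot_kmemleak; sync;
done" would then take a snapshot every ten minutes. The sync is there in
the hope that the most recent entries reach the disk before a hard
restart.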

I think it's probably best to send this one upstream.

Paul, please enter a bug report at bugzilla.kernel.org.  Set the product
to 'Memory management' and component to 'Other'.  Let us know the bug
number so we can keep track of it.  Attach the log from your original
bug report <http://bugs.debian.org/548397>.

Ben.

Just a heads up: the professors in my lab desperately needed stability restored, as they are in the middle of grant proposals and paper deadlines. I downgraded the affected machines to 2.6.26-2-686-bigmem and stability has returned. I can go ahead and send it upstream, but I'm not sure how much help I'll be if I'm no longer running the affected version.

plb


