
Re: Server freezes sporadically



On 2009-09-05 06:59, Adrian Kirchner wrote:
Hi,

I have a small backup Server which runs on Debian 5.0 Lenny with all available stable updates.

For a month now, this server has not stayed up for more than 24 hours at a stretch, because of problems I cannot pin down. I checked the syslog, messages and kernel logs, and the only anomaly I could find was a segfault entry, normally immediately before the restart entry, like this:

> Aug 10 06:33:16 LXL-CGN-01 kernel: [2041002.927632] perl[11972]: segfault at 157040 ip b7d563a2 sp bf803470 error 6 in libc-2.7.so[b7d40000+155000]

I thought it was a problem with libc, and probably with the main memory. But for the last few days, the server has been freezing without those messages. Now the last log entries look like this:

> Sep  2 16:44:35 LXL-CGN-01 -- MARK --
> Sep  2 17:04:35 LXL-CGN-01 -- MARK --
> Sep  2 17:24:35 LXL-CGN-01 -- MARK --
> Sep  2 17:30:01 LXL-CGN-01 rsnapshot[17080]: /usr/bin/rsnapshot daily: completed successfully
> Sep  2 17:44:35 LXL-CGN-01 -- MARK --
> Sep  2 18:04:34 LXL-CGN-01 -- MARK --
> Sep  2 18:24:34 LXL-CGN-01 -- MARK --
> Sep  5 12:53:24 LXL-CGN-01 syslogd 1.5.0#5: restart.
> Sep  5 12:53:24 LXL-CGN-01 kernel: klogd 1.5.0#5, log source = /proc/kmsg started.
> Sep  5 12:53:24 LXL-CGN-01 kernel: [ 0.000000] Initializing cgroup subsys cpuset
> Sep  5 12:53:24 LXL-CGN-01 kernel: [ 0.000000] Initializing cgroup subsys cpu

To me this looks like "nothing to do, and then a freeze without a loggable reason". Does anybody have an idea what I can do to track down the problem?
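One way to pin down when each freeze began is to scan the log for silences longer than the MARK interval. The following is only a rough sketch: the function name `find_gaps`, the 30-minute threshold and the hard-coded year are assumptions for illustration (syslog timestamps carry no year, and the MARK entries above arrive every 20 minutes).

```python
from datetime import datetime, timedelta


def find_gaps(lines, year=2009, threshold=timedelta(minutes=30)):
    """Return (last_seen, next_seen) pairs where the log was silent
    for longer than `threshold`."""
    gaps = []
    prev = None
    for line in lines:
        # Drop mail-style quote markers and surrounding whitespace.
        line = line.lstrip("> ").rstrip()
        parts = line.split(None, 3)  # month, day, time, rest of line
        if len(parts) < 3:
            continue
        try:
            stamp = datetime.strptime(
                f"{year} {parts[0]} {parts[1]} {parts[2]}",
                "%Y %b %d %H:%M:%S",
            )
        except ValueError:
            continue  # not a syslog-style line
        if prev is not None and stamp - prev > threshold:
            gaps.append((prev, stamp))
        prev = stamp
    return gaps


if __name__ == "__main__":
    # Hypothetical usage against the file mentioned in the post.
    with open("/var/log/messages", errors="replace") as fh:
        for last_seen, next_seen in find_gaps(fh):
            print(f"silent from {last_seen} until {next_seen}")
```

Each reported pair brackets one outage: the first timestamp is the last sign of life, the second is the reboot. Comparing the first timestamps against cron schedules or backup runs may show whether the freezes cluster around some activity.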

The whole /var/log/messages can be found here: http://pastebin.com/f194c41e2
The whole /var/log/syslog can be found bzipped here: http://www.box.net/shared/azgde0yqfg

I'd run memtest over the weekend, and if that doesn't show any errors, boot from a Live CD and fsck the / (and, if separate, /usr) partitions.
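While the memory test runs, the segfault line quoted in the original post can be decoded a bit further. This is a minimal sketch, assuming the usual x86 kernel message layout (all numbers in hex, error code bits: 1 = page was present, 2 = write access, 4 = user mode); the name `decode_segfault` is made up for illustration.

```python
import re

# Matches the kernel's "segfault at ..." message on x86.
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel segfault line into its fields, or return None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)
    return {
        "program": m["prog"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        "object": m["obj"],
        # Instruction pointer relative to where the object is mapped.
        "offset_in_object": int(m["ip"], 16) - int(m["base"], 16),
        "page_present": bool(err & 1),
        "write_access": bool(err & 2),
        "user_mode": bool(err & 4),
    }


LINE = (
    "Aug 10 06:33:16 LXL-CGN-01 kernel: [2041002.927632] perl[11972]: "
    "segfault at 157040 ip b7d563a2 sp bf803470 error 6 "
    "in libc-2.7.so[b7d40000+155000]"
)

if __name__ == "__main__":
    info = decode_segfault(LINE)
    print(info["program"], hex(info["offset_in_object"]), info)
```

For the quoted line this gives offset 0x163a2 inside libc-2.7.so, and error 6 means a user-mode write to a page that was not mapped. As a rule of thumb, not a proof: if every crash lands on the same offset, a software bug is more likely; offsets that jump around point more toward bad RAM.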

Also, check whether the case's air filters are clogged, and install a sensor package (lm-sensors, for example) to see whether the CPUs are overheating.
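If you'd rather log temperatures yourself than watch the `sensors` output, something like the sketch below might do. It assumes the kernel's hwmon sysfs interface, where `temp*_input` files hold millidegrees Celsius, and that the right sensor driver modules are already loaded; the directory layout differs between kernel versions, so both common locations are tried. The function name `read_temperatures` is made up for illustration.

```python
import glob
import os


def read_temperatures(root="/sys/class/hwmon"):
    """Return {path: degrees Celsius} for every temp*_input file under root."""
    readings = {}
    patterns = [
        os.path.join(root, "hwmon*", "temp*_input"),
        os.path.join(root, "hwmon*", "device", "temp*_input"),  # older layout
    ]
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            try:
                with open(path) as fh:
                    # hwmon reports temperatures in millidegrees Celsius.
                    readings[path] = int(fh.read().strip()) / 1000.0
            except (OSError, ValueError):
                continue  # sensor unreadable or returned garbage
    return readings


if __name__ == "__main__":
    for path, celsius in read_temperatures().items():
        print(f"{path}: {celsius:.1f} C")
```

Run from cron every minute or so and appended to a file, this leaves a temperature trail on disk, so the last readings before a freeze are still there after the reboot.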

BTW, you should really fix all those MTA errors, since all that redundant crud is hiding any other problems.

--
Brawndo's got what plants crave.  It's got electrolytes!

