
Bug#309308: Similar kernel oops without the snmpwalk



I do not have any SNMP-related software installed on my servers, and the
servers sit behind a NATing load balancer (ServerIron), so I am
reasonably sure that no such packet could have been sent from outside.
Nevertheless, I have just had a very similar crash that took the server
down. I am running sarge. Here is the relevant information:

Package: kernel-image-2.6.8-2-686-smp
Version: 2.6.8-16

Relevant excerpt from the logs:
Aug 23 17:12:00 server1 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000064
Aug 23 17:12:00 server1 kernel: printing eip:
Aug 23 17:12:00 server1 kernel: c0198f96
Aug 23 17:12:00 server1 kernel: *pde = 00000000
Aug 23 17:12:00 server1 kernel: Oops: 0000 [#1]
Aug 23 17:12:00 server1 kernel: PREEMPT SMP
Aug 23 17:12:00 server1 kernel: Modules linked in: nfs lockd sunrpc ipv6 serverworks sworks_agp agpgart ohci_hcd usbcore tg3 firmware_class tsdev mousedev evdev dm_mod capability commoncap psmouse ide_generic ide_disk ide_cd ide_core cdrom genrtc ext3 jbd mbcache cciss scsi_mod unix font vesafb cfbcopyarea cfbimgblt cfbfillrect
Aug 23 17:12:00 server1 kernel: CPU:    2
Aug 23 17:12:00 server1 kernel: EIP:    0060:[<c0198f96>]    Not tainted
Aug 23 17:12:00 server1 kernel: EFLAGS: 00010286   (2.6.8-2-686-smp)
Aug 23 17:12:00 server1 kernel: EIP is at proc_pid_stat+0x1c6/0x770
Aug 23 17:12:00 server1 kernel: eax: 00000000   ebx: f7ece500   ecx: e585e000   edx: cb2bce00
Aug 23 17:12:00 server1 kernel: esi: ef224000   edi: c035fe20   ebp: f78901b0   esp: ef225e20
Aug 23 17:12:00 server1 kernel: ds: 007b   es: 007b   ss: 0068
Aug 23 17:12:00 server1 kernel: Process monit (pid: 3297, threadinfo=ef224000 task=f78f4030)
Aug 23 17:12:00 server1 kernel: Stack: f78901b0 ef225f2c ef225f24 f7890f6a 00000053 00005e97 00005e97 00005e3b
Aug 23 17:12:00 server1 kernel: 00000000 ffffffff 00000100 000003cb 00000000 00000000 ef224000 d2805c90
Aug 23 17:12:00 server1 kernel: f78901b0 d2805c80 c0196224 f78901b0 d2805c90 d11bd170 ef225f60 f7fbb480
Aug 23 17:12:00 server1 kernel: Call Trace:
Aug 23 17:12:00 server1 kernel: [<c0196224>] pid_revalidate+0x64/0xf0
Aug 23 17:12:00 server1 kernel: [<c0179f11>] dput+0x31/0x270
Aug 23 17:12:00 server1 kernel: [<c017006e>] link_path_walk+0xc2e/0x1020
Aug 23 17:12:00 server1 kernel: [<c016dbde>] pipe_wait+0x7e/0xa0
Aug 23 17:12:00 server1 kernel: [<c0142c06>] buffered_rmqueue+0x116/0x230
Aug 23 17:12:00 server1 kernel: [<c016dbde>] pipe_wait+0x7e/0xa0
Aug 23 17:12:00 server1 kernel: [<c019555a>] proc_info_read+0x4a/0x120
Aug 23 17:12:00 server1 kernel: [<c015fc1d>] vfs_read+0xed/0x160
Aug 23 17:12:00 server1 kernel: [<c015fef1>] sys_read+0x51/0x80
Aug 23 17:12:00 server1 kernel: [<c01061fb>] syscall_call+0x7/0xb
Aug 23 17:12:00 server1 kernel: Code: 8b 50 64 8b 70 68 8b 41 08 c1 e2 14 09 f2 01 c2 89 d0 c1 e8
Aug 23 17:12:00 server1 kernel: <6>note: monit[3297] exited with preempt_count 1
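
As a cross-check of the log, the first bytes on the Code: line can be decoded by hand. The sketch below assumes those bytes begin at the faulting EIP (the line carries no marker saying otherwise) and handles only the one instruction form needed here; the helper name decode_mov_disp8 is made up for illustration. If the assumption holds, the faulting instruction reads from eax+0x64, and with eax at 00000000 that matches the reported address 00000064.

```python
# Minimal decoder for the 32-bit x86 form "mov r32, [r32 + disp8]"
# (opcode 0x8b, ModRM mod=01, no SIB byte). Illustrative only.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_disp8(code):
    """Return (dest_reg, base_reg, displacement) for a 3-byte 8b /r instruction."""
    opcode, modrm, disp = code[0], code[1], code[2]
    if opcode != 0x8B:
        raise ValueError("not an 8b mov")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only the [reg + disp8] form without SIB is handled")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REGS[reg], REGS[rm], disp


# First six bytes of the Code: line from the oops above.
code = bytes.fromhex("8b 50 64 8b 70 68")
eax = 0x00000000  # register value taken from the oops register dump

for offset in (0, 3):
    dest, base, disp = decode_mov_disp8(code[offset:offset + 3])
    print(f"mov {dest}, [{base} + {disp:#x}]")

dest, base, disp = decode_mov_disp8(code[:3])
print(f"address read by the first instruction: {eax + disp:#010x}")
```

Both leading instructions load through eax, which would be consistent with a NULL structure pointer being dereferenced inside proc_pid_stat, though the log alone does not say which pointer that is.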

Since monit (a service monitoring daemon) was the process that triggered
the oops, here is my monit config file:

set daemon 30
set logfile syslog facility log_daemon
set mailserver <mailserver>
set mail-format { from: <email> }
set alert <email>

set httpd port 2812
allow <user:passwd>

# start monitor descriptions

check process apache with pidfile /var/run/apache.pid
        start program = "/etc/init.d/apache start"
        stop program  = "/etc/init.d/apache stop"

        if failed host marvin port 80 protocol http request /test.php timeout 5 seconds then alert
        if failed host marvin port 443 type tcpssl protocol http request /test.php timeout 5 seconds then alert
        if cpu is greater than 90% for 5 cycles then alert
        if children > 300 for 5 cycles then alert
        if loadavg(5min) greater than 10 for 8 cycles then alert
        if 3 restarts within 5 cycles then timeout
        mode passive
        group server

check process sshd with pidfile /var/run/sshd.pid
        start program  "/etc/init.d/ssh start"
        stop program  "/etc/init.d/ssh stop"
        if failed port 22 protocol ssh then alert
        if 5 restarts within 5 cycles then timeout
        group server

check device root with path /dev/cciss/c0d0p1
        if space usage > 90% then alert
        mode passive
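
For context on why monit ends up in proc_pid_stat: the call trace shows a read() going through proc_info_read, which is the path taken when a process reads /proc/<pid>/stat. The sketch below is an assumption about what a monitor does with that file for checks such as cpu and children, not monit's actual code; parse_stat_line is a made-up helper and every value in the sample line except the pid and process name is invented.

```python
def parse_stat_line(line):
    """Parse a few fields from one /proc/<pid>/stat line (layout per proc(5))."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' rather than on whitespace.
    head, _, tail = line.rpartition(")")
    pid_str, _, comm = head.partition("(")
    fields = tail.split()
    return {
        "pid": int(pid_str),
        "comm": comm,
        "state": fields[0],       # field 3
        "ppid": int(fields[1]),   # field 4
        "utime": int(fields[11]),  # field 14, user-mode clock ticks
        "stime": int(fields[12]),  # field 15, kernel-mode clock ticks
    }


# Hypothetical sample line; on a live Linux system this would come from
# open(f"/proc/{pid}/stat").read()
sample = "3297 (monit) S 1 3297 3297 0 -1 256 971 0 0 0 83 41 0 0 16 0 1 0 12345 1000000 200"
info = parse_stat_line(sample)
print(info["comm"], info["state"], info["utime"] + info["stime"])
```

With "set daemon 30", a read like this would happen for each monitored process every 30 seconds, so any race in the kernel's stat path gets exercised regularly.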




