
tracking down kernel-panic cause?



This morning, I discovered to my great astonishment that my machine had
kernel-panicked in the middle of the night.

My main question is: how do you go about tracking something like this
down?  What utilities should I use to make sure my hardware is properly
configured?  And if it happens again, should I write down all the
output from the panic?

On to the specifics: does anyone have any idea what might have happened?

The last time this machine was rebooted was about a month ago, to add a
USB 2 card and an internal drive.  I also recompiled my kernel at that
time to add USB 2 support.  I may have made other changes; I don't
quite remember.

Anyway, it's been chugging along quite happily since then, up until the
aforementioned kernel panic.

I rebooted and, a couple of minutes after the reboot, it panicked again,
implicating exim.

I rebooted again and it (LILO, I think? It was very early in the boot)
complained of a bad CRC check.

I rebooted it again and, a few seconds after finishing all startup
scripts, it panicked, implicating a totally different app.

I rebooted it again and wrangled it into single-user mode.  I don't know
whether it is significant that it didn't panic while I was in
single-user mode for half an hour or so.  I'd been running a kernel from
the 2.4.21-4 source; 2.4.21-5 was available, so I recompiled the kernel
using 2.4.21-5 and "make oldconfig".   Since I've rebooted with that
kernel, several hours ago, it's been well-behaved ... but geez, I'd
really like to track down the source of the problem, since the machine
was well-behaved for a month before having this problem, and for an
extremely long time before that.  Anyway, I have no way to know that
whatever caused these kernel panics won't happen again.

The only really odd thing I've noticed is the following in my boot log:
Mon Oct  6 10:27:52 2003: Starting SpamAssassin Mail Filter Daemon:
named-xfer[408]: can't make tmpfile (stub/fgm.com.fuRBwu): Inappropriate
ioctl for device

That's weird for a few reasons.
1) What does bind have to do with SA?
2) Inappropriate ioctl? Huh?
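
For what it's worth, here is a rough sketch of how lines like that could
be pulled out of a boot log: it looks for a "Starting <service>:" banner
that is followed on the same line by syslog-style "name[pid]:" output.
The line shape, the regex, and the function name find_mixed_lines are my
own guesses based on the one line quoted above, not a documented
bootlogd format, so treat it as an illustration only.

```python
import re

# Assumed line shape (inferred from the odd line above, not from any spec):
#   "<timestamp>: Starting <service>: <tag>[<pid>]: <message>"
MIXED = re.compile(
    r"Starting (?P<service>[^:]+): "
    r"(?P<tag>[\w.-]+)\[(?P<pid>\d+)\]: (?P<msg>.*)"
)


def find_mixed_lines(lines):
    """Return (service, tag, pid, message) tuples for lines where a
    'Starting X:' banner is immediately followed by tagged process output."""
    hits = []
    for line in lines:
        m = MIXED.search(line)
        if m:
            hits.append(
                (m.group("service"), m.group("tag"),
                 int(m.group("pid")), m.group("msg"))
            )
    return hits


# Sample lines copied from the boot log in this message.
sample = [
    "Mon Oct  6 10:27:52 2003: named[405]: Ready to answer queries.",
    "Mon Oct  6 10:27:52 2003: Starting SpamAssassin Mail Filter Daemon: "
    "named-xfer[408]: can't make tmpfile (stub/fgm.com.fuRBwu): "
    "Inappropriate ioctl for device",
    "Mon Oct  6 10:27:57 2003: spamd.",
]

for hit in find_mixed_lines(sample):
    print(hit)
```

All this does is flag the lines where a banner and another process's
output share a line; it says nothing about why they ended up together.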

Hrm, relevant data ...

My mount output:
/dev/hda4 on / type auto (rw,errors=remount-ro)
proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda1 on /boot type ext3 (rw)
/dev/hda3 on /var type ext3 (rw)
/dev/hdc1 on /var/mp3 type ext3 (rw)
usbfs on /proc/bus/usb type usbfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

My boot log:

Mon Oct  6 10:27:31 2003: bootlogd.
Mon Oct  6 10:27:31 2003: Activating swap.
Mon Oct  6 10:27:31 2003: Checking root file system...
Mon Oct  6 10:27:31 2003: fsck 1.35-WIP (21-Aug-2003)
Mon Oct  6 10:27:31 2003: /dev/hda4: clean, 185110/1909440 files, 1080140/3814902 blocks
Mon Oct  6 10:27:31 2003: modprobe: modprobe: Can't locate module char-major-10-135
Mon Oct  6 10:27:31 2003: System time was Mon Oct  6 16:27:28 UTC 2003.
Mon Oct  6 10:27:31 2003: Setting the System Clock using the Hardware Clock as reference...
Mon Oct  6 10:27:31 2003: modprobe: modprobe: Can't locate module char-major-10-135
Mon Oct  6 10:27:31 2003: modprobe: modprobe: Can't locate module char-major-10-135
Mon Oct  6 10:27:31 2003: System Clock set. System local time is now Mon Oct  6 16:27:30 UTC 2003.
Mon Oct  6 10:27:31 2003: Calculating module dependencies... done.
Mon Oct  6 10:27:31 2003: Loading modules: sis900 modprobe: Can't locate module ethernet
Mon Oct  6 10:27:31 2003: 
Mon Oct  6 10:27:31 2003: Checking all file systems...
Mon Oct  6 10:27:31 2003: fsck 1.35-WIP (21-Aug-2003)
Mon Oct  6 10:27:31 2003: /dev/hda1: clean, 38/16384 files, 9231/32728 blocks
Mon Oct  6 10:27:31 2003: /dev/hda3: clean, 14464/524288 files, 109897/1048572 blocks
Mon Oct  6 10:27:31 2003: /dev/hdc1: clean, 5371/10010624 files, 9661060/20010808 blocks
Mon Oct  6 10:27:31 2003: Setting kernel variables..
Mon Oct  6 10:27:31 2003: Mounting local filesystems...
Mon Oct  6 10:27:31 2003: mount: none already mounted or /dev/pts busy
Mon Oct  6 10:27:31 2003: mount: according to mtab, devpts is already mounted on /dev/pts
Mon Oct  6 10:27:31 2003: /dev/hda1 on /boot type ext3 (rw)
Mon Oct  6 10:27:31 2003: /dev/hda3 on /var type ext3 (rw)
Mon Oct  6 10:27:31 2003: /dev/hdc1 on /var/mp3 type ext3 (rw)
Mon Oct  6 10:27:31 2003: Starting hotplug subsystem: usbsync:[001  001  001  001  001  ] .
Mon Oct  6 10:27:49 2003: Cleaning: /etc/network/ifstate.
Mon Oct  6 10:27:49 2003: Setting up IP spoofing protection: rp_filter.
Mon Oct  6 10:27:49 2003: Configuring network interfaces... Starting mail retrieval agent: system-wide fetchmail not configured.
Mon Oct  6 10:27:49 2003: Starting mail retrieval agent: system-wide fetchmail not configured.
Mon Oct  6 10:27:49 2003: done.
Mon Oct  6 10:27:49 2003: Portmap disabled per monique
Mon Oct  6 10:27:49 2003: Loading the saved-state of the serial devices... 
Mon Oct  6 10:27:50 2003: /dev/ttyS0 at 0x03f8 (irq = 4) is a 16550A
Mon Oct  6 10:27:50 2003: /dev/ttyS1 at 0x02f8 (irq = 3) is a 16550A
Mon Oct  6 10:27:50 2003: ^[[9;30]^[[14;30]
Mon Oct  6 10:27:50 2003: Setting the System Clock using the Hardware Clock as reference...
Mon Oct  6 10:27:50 2003: modprobe: modprobe: Can't locate module char-major-10-135
Mon Oct  6 10:27:51 2003: System Clock set. Local time: Mon Oct  6 10:27:51 MDT 2003
Mon Oct  6 10:27:51 2003: 
Mon Oct  6 10:27:51 2003: Cleaning: /tmp /var/lock /var/run.
Mon Oct  6 10:27:51 2003: Initializing random number generator... done.
Mon Oct  6 10:27:51 2003: Recovering nvi editor sessions... done.
Mon Oct  6 10:27:51 2003: Setting up X server socket directory /tmp/.X11-unix...done.
Mon Oct  6 10:27:51 2003: INIT: Entering runlevel: 2
Mon Oct  6 10:27:52 2003: Starting hotplug subsystem:.... already started. process pending events.
Mon Oct  6 10:27:52 2003: Starting kernel log daemon: klogd.
Mon Oct  6 10:27:52 2003: exiting kerneld
Mon Oct  6 10:27:52 2003: Starting domain name service: namednamed[385]: starting (/etc/bind/named.conf).  named 8.4.1-P1-REL-NOESW Tue Aug 26 17:32:39 MDT 2003
Mon Oct  6 10:27:52 2003: 	lamont@whatone:/usr/local/src/Packages/bind/bind-8.4.1.0/src/bin/named
Mon Oct  6 10:27:52 2003: named[385]: hint zone "" (IN) loaded (serial 0)
Mon Oct  6 10:27:52 2003: named[385]: master zone "localhost" (IN) loaded (serial 1)
Mon Oct  6 10:27:52 2003: named[385]: master zone "127.in-addr.arpa" (IN) loaded (serial 1)
Mon Oct  6 10:27:52 2003: named[385]: master zone "0.in-addr.arpa" (IN) loaded (serial 1)
Mon Oct  6 10:27:52 2003: named[385]: master zone "255.in-addr.arpa" (IN) loaded (serial 1)
Mon Oct  6 10:27:52 2003: named[385]: listening on [127.0.0.1].53 (lo)
Mon Oct  6 10:27:52 2003: named[385]: listening on [192.168.1.102].53 (eth0)
Mon Oct  6 10:27:52 2003: named[385]: AF_INET6: address family not supported
Mon Oct  6 10:27:52 2003: named[385]: Forwarding source address is [0.0.0.0].53
Mon Oct  6 10:27:52 2003: .
Mon Oct  6 10:27:52 2003: named[405]: Ready to answer queries.
Mon Oct  6 10:27:52 2003: Starting SpamAssassin Mail Filter Daemon: named-xfer[408]: can't make tmpfile (stub/fgm.com.fuRBwu): Inappropriate ioctl for device
Mon Oct  6 10:27:53 2003: 
Mon Oct  6 10:27:57 2003: unix dgram connect: No such file or directory at /usr/sbin/spamd line 281
Mon Oct  6 10:27:57 2003: spamd.
Mon Oct  6 10:27:58 2003: WARN: Bastille-firewall is not configured yet
Mon Oct  6 10:27:58 2003: Please create /etc/Bastille/bastille-firewall.cfg to enable it.
Mon Oct  6 10:27:58 2003: (hint: use InteractiveBastille)
Mon Oct  6 10:27:58 2003: Enabling additional executable binary formats: binfmt-support.
Mon Oct  6 10:27:58 2003: 
Mon Oct  6 10:27:58 2003: WARNING: portmapper inactive - RPC services unavailable!
Mon Oct  6 10:27:58 2003:          (Commenting out the rpc services in inetd.conf will
Mon Oct  6 10:27:58 2003:          disable this message)
Mon Oct  6 10:27:58 2003: 
Mon Oct  6 10:27:58 2003: Starting internet superserver: inetd.
Mon Oct  6 10:27:58 2003: Starting internet superserver: inetd.
Mon Oct  6 10:27:58 2003: Starting system log daemon: syslogd.
Mon Oct  6 10:27:59 2003: Starting printer spooler: lprng.
Mon Oct  6 10:28:00 2003: Starting Samba daemons: nmbd smbd.
Mon Oct  6 10:28:03 2003: Starting slimp3: slimp3.
Mon Oct  6 10:28:05 2003: Starting OpenBSD Secure Shell server: sshd.
Mon Oct  6 10:28:06 2003: Running ntpdate to synchronize clock.
Mon Oct  6 10:28:07 2003: Starting NTP server: ntpd.
Mon Oct  6 10:28:07 2003: Starting deferred execution scheduler: atd.
Mon Oct  6 10:28:07 2003: Starting periodic command scheduler: cron.
Mon Oct  6 10:28:07 2003: Started folding.home client
Mon Oct  6 10:28:08 2003: Starting web server: apache.
Mon Oct  6 10:28:11 2003: Starting mail retrieval agent: system-wide fetchmail not configured.
Mon Oct  6 10:28:11 2003: Stopping Bootlog daemon: 

My dmesg:
Linux version 2.4.21 (root@home.bounceswoosh.org) (gcc version 3.3.2 20030908 (Debian prerelease)) #2 Mon Oct 6 10:24:49 MDT 2003
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000001fff0000 (usable)
 BIOS-e820: 000000001fff0000 - 000000001fff8000 (ACPI data)
 BIOS-e820: 000000001fff8000 - 0000000020000000 (ACPI NVS)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ffee0000 - 00000000fff00000 (reserved)
 BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
511MB LOWMEM available.
On node 0 totalpages: 131056
zone(0): 4096 pages.
zone(1): 126960 pages.
zone(2): 0 pages.
Kernel command line: auto BOOT_IMAGE=linux ro root=304
Initializing CPU#0
Detected 1659.627 MHz processor.
Console: colour VGA+ 80x43
Calibrating delay loop... 3316.12 BogoMIPS
Memory: 516000k/524224k available (1358k kernel code, 7836k reserved, 505k data, 104k init, 0k highmem)
Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
Inode cache hash table entries: 32768 (order: 6, 262144 bytes)
Mount cache hash table entries: 512 (order: 0, 4096 bytes)
Buffer-cache hash table entries: 32768 (order: 5, 131072 bytes)
Page-cache hash table entries: 131072 (order: 7, 524288 bytes)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU:     After generic, caps: 0383f9ff c1c3f9ff 00000000 00000000
CPU:             Common caps: 0383f9ff c1c3f9ff 00000000 00000000
CPU: AMD Athlon(tm) XP 2000+ stepping 02
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
PCI: PCI BIOS revision 2.10 entry at 0xfdb01, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Using IRQ router SIS [1039/0008] at 00:02.0
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
Starting kswapd
Journalled Block Device driver loaded
pty: 256 Unix98 ptys configured
Serial driver version 5.05c (2001-07-08) with MANY_PORTS SHARE_IRQ SERIAL_PCI ISAPNP enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
sis900.c: v1.08.06 9/24/2002
PCI: Assigned IRQ 11 for device 00:03.0
eth0: Realtek RTL8201 PHY transceiver found at address 1.
eth0: Using transceiver found at address 1 as default
eth0: SiS 900 PCI Fast Ethernet at 0xd400, IRQ 11, 00:0a:e6:52:7e:e7.
Linux agpgart interface v0.99 (c) Jeff Hartmann
agpgart: Maximum main memory to use for agp memory: 439M
agpgart: Detected SiS 735 chipset
agpgart: AGP aperture is 64M @ 0xd0000000
Uniform Multi-Platform E-IDE driver Revision: 7.00beta4-2.4
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
hda: Maxtor 5T020H2, ATA DISK drive
hdb: CREATIVE CD2423E, ATAPI CD/DVD-ROM drive
hdc: Maxtor 6Y080P0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: attached ide-disk driver.
hda: host protected area => 1
hda: 40021632 sectors (20491 MB) w/2048KiB Cache, CHS=2491/255/63
hdc: attached ide-disk driver.
hdc: host protected area => 1
hdc: 160086528 sectors (81964 MB) w/7936KiB Cache, CHS=158816/16/63
hdb: attached ide-cdrom driver.
hdb: ATAPI 20X CD-ROM drive, 382kB Cache
Uniform CD-ROM driver Revision: 3.12
Partition check:
 hda: hda1 hda2 hda3 hda4
 hdc: hdc1
SCSI subsystem driver Revision: 1.00
kmod: failed to exec /sbin/modprobe -s -k scsi_hostadapter, errno = 2
kmod: failed to exec /sbin/modprobe -s -k scsi_hostadapter, errno = 2
usb.c: registered new driver usbdevfs
usb.c: registered new driver hub
PCI: Found IRQ 5 for device 00:0d.2
ehci-hcd 00:0d.2: NEC Corporation USB 2.0
ehci-hcd 00:0d.2: irq 5, pci mem e0816f00
usb.c: new USB bus registered, assigned bus number 1
PCI: 00:0d.2 PCI cache line size set incorrectly (32 bytes) by BIOS/FW.
PCI: 00:0d.2 PCI cache line size corrected to 64.
ehci-hcd 00:0d.2: USB 2.0 enabled, EHCI 0.95, driver 2003-Jan-22
hub.c: USB hub found
hub.c: 5 ports detected
host/usb-uhci.c: $Revision: 1.275 $ time 10:24:55 Oct  6 2003
host/usb-uhci.c: High bandwidth mode enabled
host/usb-uhci.c: v1.275:USB Universal Host Controller Interface driver
Initializing USB Mass Storage driver...
usb.c: registered new driver usb-storage
USB Mass Storage support registered.
NET4: Linux TCP/IP 1.0 for NET4.0
IP: routing cache hash table of 4096 buckets, 32Kbytes
TCP: Hash tables configured (established 32768 bind 65536)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 104k freed
Adding Swap: 524152k swap-space (priority -1)
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide0(3,4), internal journal
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide0(3,1), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide0(3,3), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide1(22,1), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SiS router pirq escape (99)
SiS router pirq escape (99)
usb-ohci.c: USB OHCI at membase 0xe085f000, IRQ 12
usb-ohci.c: usb-00:02.3, Silicon Integrated Systems [SiS] 7001 (#2)
usb.c: new USB bus registered, assigned bus number 2
hub.c: USB hub found
hub.c: 3 ports detected
PCI: Found IRQ 5 for device 00:02.2
PCI: Sharing IRQ 5 with 00:0d.1
usb-ohci.c: USB OHCI at membase 0xe0861000, IRQ 5
usb-ohci.c: usb-00:02.2, Silicon Integrated Systems [SiS] 7001
usb.c: new USB bus registered, assigned bus number 3
hub.c: USB hub found
hub.c: 3 ports detected
PCI: Found IRQ 5 for device 00:0d.1
PCI: Sharing IRQ 5 with 00:02.2
usb-ohci.c: USB OHCI at membase 0xe0863000, IRQ 5
usb-ohci.c: usb-00:0d.1, NEC Corporation USB (#2)
usb.c: new USB bus registered, assigned bus number 4
hub.c: USB hub found
hub.c: 2 ports detected
PCI: Found IRQ 11 for device 00:0d.0
PCI: Sharing IRQ 11 with 00:02.7
usb-ohci.c: USB OHCI at membase 0xe0865000, IRQ 11
usb-ohci.c: usb-00:0d.0, NEC Corporation USB
usb.c: new USB bus registered, assigned bus number 5
hub.c: USB hub found
hub.c: 3 ports detected
eth0: Media Link On 100mbps full-duplex 

-- 
monique
Please respond to the group OR to my email, but not both.  (Group preferred.)


