
amazingly, debian server keeps crashing



debian intel 845 chipset pentium 4, 400 fsb.
3ware sata 4-hd raid 5.


we're having trouble finding evidence anywhere
that shows what's going south--

from "kern.log":

Apr 12 19:53:11 server kernel: Unable to handle kernel NULL pointer dereference at virtual address 0000008c
Apr 12 19:53:11 server kernel:  printing eip:
Apr 12 19:53:11 server kernel: f8899f39
Apr 12 19:53:11 server kernel: *pde = 00000000
Apr 12 19:53:11 server kernel: Oops: 0000 [#1]
Apr 12 19:53:11 server kernel: PREEMPT
Apr 12 19:53:11 server kernel: Modules linked in: ipv6 genrtc dm_mod capability commoncap 3c59x usbkbd usbcore ext3 jbd mbcache sd_mod 3w_xxxx scsi_mod unix font vesafb cfbcopyarea cfbimgblt cfbfillrect
Apr 12 19:53:11 server kernel: CPU:    0
Apr 12 19:53:11 server kernel: EIP:    0060:[__crc_xfrm_state_alloc+4074054/4557196]    Not tainted
Apr 12 19:53:11 server kernel: EFLAGS: 00010292   (2.6.8-2-686)
Apr 12 19:53:11 server kernel: EIP is at journal_blocks_per_page+0x9/0x20 [jbd]
Apr 12 19:53:11 server kernel: eax: 00000000   ebx: 00000000   ecx: 0000000c   edx: f88e66a0
Apr 12 19:53:11 server kernel: esi: 00000000   edi: 00000000   ebp: c14da860   esp: e6e91d88
Apr 12 19:53:11 server kernel: ds: 007b   es: 007b   ss: 0068
Apr 12 19:53:11 server kernel: Process exim3 (pid: 18709, threadinfo=e6e90000 task=ec984130)
Apr 12 19:53:11 server kernel: Stack: f88cf533 00000000 00000231 f88cca7a 00000000 f6085314 400180c3 00000018
Apr 12 19:53:11 server kernel:        f60853b0 c0135f9c f60853b0 00000018 00000231 00000018 c14da860 0000016e
Apr 12 19:53:11 server kernel:        c0137d19 ee7f5680 c14da860 0000016e 00000231 c18e0230 00000000 00000065
Apr 12 19:53:11 server kernel: Call Trace:
Apr 12 19:53:11 server kernel:  [__crc_xfrm_state_alloc+4292672/4557196] ext3_writepage_trans_blocks+0x13/0x80 [ext3]
Apr 12 19:53:11 server kernel:  [__crc_xfrm_state_alloc+4281735/4557196] ext3_prepare_write+0x1a/0x140 [ext3]
Apr 12 19:53:11 server kernel:  [find_lock_page+44/224] find_lock_page+0x2c/0xe0
Apr 12 19:53:11 server kernel:  [generic_file_aio_write_nolock+969/2912] generic_file_aio_write_nolock+0x3c9/0xb60
Apr 12 19:53:11 server kernel:  [do_anonymous_page+312/432] do_anonymous_page+0x138/0x1b0
Apr 12 19:53:11 server kernel:  [do_no_page+96/848] do_no_page+0x60/0x350
Apr 12 19:53:11 server kernel:  [generic_file_aio_write+120/176] generic_file_aio_write+0x78/0xb0
Apr 12 19:53:11 server kernel:  [__crc_xfrm_state_alloc+4270833/4557196] ext3_file_write+0x44/0xd0 [ext3]
Apr 12 19:53:11 server kernel:  [do_sync_write+128/176] do_sync_write+0x80/0xb0
Apr 12 19:53:11 server kernel:  [sys_wait4+459/640] sys_wait4+0x1cb/0x280
Apr 12 19:53:11 server kernel:  [vfs_write+237/352] vfs_write+0xed/0x160
Apr 12 19:53:11 server kernel:  [sys_write+81/128] sys_write+0x51/0x80
Apr 12 19:53:11 server kernel:  [syscall_call+7/11] syscall_call+0x7/0xb
Apr 12 19:53:11 server kernel: Code: 8b 80 8c 00 00 00 0f b6 40 14 29 c1 b8 01 00 00 00 d3 e0 c3

the items PRIOR to this set of entries are:

Apr 11 14:22:22 server kernel: Generic RTC Driver v1.07
Apr 11 14:22:22 server kernel: NET: Registered protocol family 10
Apr 11 14:22:22 server kernel: Disabled Privacy Extensions on device c02ff020(lo)
Apr 11 14:22:22 server kernel: IPv6 over IPv4 tunneling driver
Apr 11 14:22:33 server kernel: eth0: no IPv6 routers present
Apr 11 14:31:56 server kernel: 3w-xxxx: scsi0: AEN: INFO: Initialization started: Unit #0.
Apr 11 17:16:02 server kernel: 3w-xxxx: scsi0: AEN: INFO: Initialization complete: Unit #0.

there were NO entries in "kern.log" between 5:16pm 11 apr and 7:53pm 12 apr.
but "messages" shows:

Apr 13 05:42:24 server -- MARK --
Apr 13 06:02:24 server -- MARK --
Apr 13 06:22:24 server -- MARK --
Apr 13 06:25:03 server syslogd 1.4.1#16: restart.
Apr 13 06:42:24 server -- MARK --
Apr 13 07:02:24 server -- MARK --
Apr 18 14:27:05 server syslogd 1.4.1#16: restart.

so it apparently died between 7:02 and 7:22 am on the 13th: the MARK lines
come every 20 minutes, and the one due at 07:22 never showed up.
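
for anyone who wants to check the same thing on their own logs, here is a rough python sketch of how that window can be found mechanically. it is only an illustration: `parse_stamp` and `find_gaps` are names made up for this example, the year 2005 is an assumption (classic syslog stamps carry no year; 2005 comes from the kernel build date), and it assumes english month names in the log.

```python
from datetime import datetime, timedelta

# syslogd writes "-- MARK --" every 20 minutes in the excerpt above
MARK_INTERVAL = timedelta(minutes=20)

def parse_stamp(line, year=2005):
    """Parse the 15-character syslog stamp ("Apr 13 05:42:24") at the start of a line.

    The year is NOT in the log; 2005 is an assumption for this example.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

def find_gaps(lines, limit=MARK_INTERVAL):
    """Return (last_seen, next_seen) pairs where the log was silent longer than limit."""
    stamps = [parse_stamp(l) for l in lines if l.strip()]
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > limit]

# the "messages" excerpt quoted above
messages = [
    "Apr 13 05:42:24 server -- MARK --",
    "Apr 13 06:02:24 server -- MARK --",
    "Apr 13 06:22:24 server -- MARK --",
    "Apr 13 06:25:03 server syslogd 1.4.1#16: restart.",
    "Apr 13 06:42:24 server -- MARK --",
    "Apr 13 07:02:24 server -- MARK --",
    "Apr 18 14:27:05 server syslogd 1.4.1#16: restart.",
]

gaps = find_gaps(messages)
for last_seen, next_seen in gaps:
    # the box went quiet somewhere between last_seen and last_seen + MARK_INTERVAL
    print("last entry:", last_seen, "| next entry:", next_seen)
    print("latest possible time of death:", last_seen + MARK_INTERVAL)
```

on a real log the MARK stamps can drift by a second or so, so a slightly larger `limit` (say 21 minutes) is probably safer than the exact interval.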



$ uname -a
Linux server 2.6.8-2-686 #1 Mon Jan 24 03:58:38 EST 2005 i686 GNU/Linux
# cat /proc/version
Linux version 2.6.8-2-686 (dilinger@toaster.hq.voxel.net) (gcc version 3.3.5 (Debian 1:3.3.5-6)) #1 Mon Jan 24 03:58:38 EST 2005

should we be using kernel 2.4? other pointers welcome...



# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping        : 9
cpu MHz         : 2857.073
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid
bogomips        : 5652.48



