
e2fsck+raid+linux 2.2.16 = crash?



Today one of my systems crashed due to SCSI disk errors.

When it came back up it crashed again, so I had to go to the site (a
2-hour drive!) and take a look at it.

It seems that when the system tries to run e2fsck on the RAID set, it
crashes with the message:

Jul 14 13:36:30 galactica kernel: VFS: grow_buffers: size = 16384
Jul 14 13:36:31 galactica last message repeated 665 times
Jul 14 13:36:31 galactica kernel: 384
Jul 14 13:36:31 galactica kernel: VFS: grow_buffers: size = 16384
Jul 14 13:36:31 galactica last message repeated 814 times
Jul 14 13:36:31 galactica kernel: 384
Jul 14 13:36:31 galactica kernel: VFS: grow_buffers: size = 16384
Jul 14 13:36:31 galactica last message repeated 468 times             

It goes on and on (the people on site left it in that state for a good
hour and a half until I got there).
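
To get a feel for how fast the message is being logged, the excerpt above
can be totalled with a small script. This is only a rough sketch: it
assumes each "last message repeated N times" line counts repeats in
addition to the message logged just before it, and the function name
count_occurrences is made up for illustration.

```python
import re

# Lines copied from the log excerpt above.
SAMPLE = """\
Jul 14 13:36:30 galactica kernel: VFS: grow_buffers: size = 16384
Jul 14 13:36:31 galactica last message repeated 665 times
Jul 14 13:36:31 galactica kernel: 384
Jul 14 13:36:31 galactica kernel: VFS: grow_buffers: size = 16384
Jul 14 13:36:31 galactica last message repeated 814 times
Jul 14 13:36:31 galactica kernel: 384
Jul 14 13:36:31 galactica kernel: VFS: grow_buffers: size = 16384
Jul 14 13:36:31 galactica last message repeated 468 times
"""

REPEAT = re.compile(r"last message repeated (\d+) times")


def count_occurrences(lines, needle):
    """Total how often a message containing `needle` was logged,
    including syslog's collapsed "last message repeated" counts."""
    total = 0
    previous_matched = False
    for line in lines:
        repeat = REPEAT.search(line)
        if repeat:
            # Assumption: the repeat count refers to the previous message.
            if previous_matched:
                total += int(repeat.group(1))
            continue
        previous_matched = needle in line
        if previous_matched:
            total += 1
    return total


print(count_occurrences(SAMPLE.splitlines(), "grow_buffers"))  # prints 1950
```

So the excerpt alone accounts for roughly 1950 copies of the message
within about two seconds of timestamps.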

This did not happen with 2.2.15, but this is also the first time I have
run e2fsck on the RAID array since upgrading to 2.2.16. As a result I was
forced to mount the RAID array unclean just to get the system up and
running so the rest of the staff at the site could go home (it's
co-located). I was in a real rush and didn't think to reboot into 2.2.15
and run e2fsck on it before I left.

System configuration:

Asus P2L97-DS (this board does not have SCSI despite the "DS" model code)
Dual P2-233
128MB PC100 SDRAM x 3 (384MB total)
Matrox G100 AGP Video
Adaptec AHA 2940UW PCI
3Com 3C905C w/drivers from www.3com.com
Generic ATAPI 32x CDROM
Root device: IBM DDRS-34560D 4.3GB
Raid system: IBM-DNES-30917OW 9.1GB (x2) Software raid mode 1
Additional Storage: QUANTUM VIKING 4.5WSE 4.5GB (<- the cause of the
initial crash)


Linux Distribution: Debian GNU/Linux 2.1r4
Installed: mid-March
Installed with kernel: 2.2.14+ow1 (www.openwall.com/linux)
Currently running: 2.2.16+ow1

Raid version: 0.36.4

This happened again and again. I finally traced it to e2fsck by disabling
automounting of /dev/md0 in /etc/fstab, after which the system booted
fine. ckraid was run multiple times with no crash (it ran every time the
system crashed for other reasons). As soon as I ran e2fsck on /dev/md0,
the errors above flooded the screen until I pressed Ctrl-Alt-Del.
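
For anyone wanting to reproduce the fstab change, here is a rough sketch
of one way to do it. It is an assumption about what "disabling
automounting" looks like, not the exact edit that was made: it adds the
noauto option and sets the sixth field (the fsck pass number) to 0 so the
boot scripts neither mount nor check the device. The helper name
disable_boot_mount, the /data mount point and the /dev/sda1 root entry
are made up for illustration.

```python
def disable_boot_mount(fstab_text, device):
    """Return fstab text with `device` marked noauto and fsck pass 0."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Leave comments, blank/short lines and other devices untouched.
        if line.lstrip().startswith("#") or len(fields) < 4 or fields[0] != device:
            out.append(line)
            continue
        options = fields[3].split(",")
        if "noauto" not in options:
            options.append("noauto")
        fields[3] = ",".join(options)
        # Make sure the dump and pass fields exist, then disable fsck.
        fields = (fields + ["0", "0"])[:6]
        fields[5] = "0"
        out.append("\t".join(fields))
    return "\n".join(out) + "\n"


# Hypothetical example entries; the real fstab is not shown in this post.
example = (
    "/dev/sda1 / ext2 defaults 0 1\n"
    "/dev/md0 /data ext2 defaults 0 2\n"
)
print(disable_boot_mount(example, "/dev/md0"), end="")
```

The array can then still be mounted by hand once the machine is up, and
e2fsck is only run when started manually.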

I'm not a kernel hacker, but I'm curious whether this is a known issue. I
do read Kernel Traffic every week but I haven't seen anything specifically
related to RAID mentioned (that I can remember). I think I will either
copy the data off the RAID set and reformat it so it can be "clean" again,
or reboot into 2.2.15 and run e2fsck on it.

I can try to provide more info if needed, just let me know. I just don't
want to have to drive up there again if I can avoid it :)

help!

nate

:::
http://www.aphroland.org/
http://www.linuxpowered.net/
aphro@aphroland.org
3:24pm up 1:30, 1 user, load average: 0.00, 0.01, 0.00


