
RE: Does anyone have software RAID 1 mirroring of / set up on Sparc?



Hmm, now I'm getting a kernel panic as the system prepares to reboot (i.e. while Linux shuts down services and unmounts filesystems before restarting).  In effect, every time we reboot the server we would have to use LOM to reset the chassis, which we really do not want.

I've noticed that although the panic occurs when the kernel tries to mount root *read-only* before restarting, the data on root gets out of sync between reboots.  I can tell because a minute or so after rebooting I see this on the console:

    md: md2: sync done.
    RAID1 conf printout:
     --- wd:2 rd:2
     disk 0, wo:0, o:1, dev:sda4
     disk 1, wo:0, o:1, dev:sdb4
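
Rather than waiting for the "sync done" message on the console, the resync could also be spotted by reading /proc/mdstat shortly after boot. The following is only a minimal sketch: the helper name `resyncing_arrays` and the sample text are made up for illustration (the sample is not captured from this machine), and it assumes the usual /proc/mdstat layout in which a progress line containing "resync =" or "recovery =" follows the array's header line.

```python
import re

def resyncing_arrays(mdstat_text):
    """Return the names of md arrays that show a resync/recovery progress line."""
    busy = []
    current = None
    for line in mdstat_text.splitlines():
        # Array header lines look like "md2 : active raid1 sdb4[1] sda4[0]".
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
        elif current and re.search(r"\b(resync|recovery)\s*=", line):
            busy.append(current)
    return busy

# Illustrative sample only (hypothetical sizes and speeds).
SAMPLE = """\
Personalities : [raid1]
md2 : active raid1 sdb4[1] sda4[0]
      30716160 blocks [2/2] [UU]
      [=>...................]  resync =  8.5% (2620160/30716160) finish=12.3min speed=38000K/sec
md0 : active raid1 sdb1[1] sda1[0]
      96256 blocks [2/2] [UU]
unused devices: <none>
"""

if __name__ == "__main__":
    print(resyncing_arrays(SAMPLE))
```

On the real system the text would come from `open("/proc/mdstat").read()` instead of `SAMPLE`.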


Does anyone have any idea what might be causing this, and whether there's a workaround for it?

Since it may be related to migrating root to md and adding LVM, here's what my mount table looks like:

/dev/md2 on / type ext3 (rw,errors=remount-ro)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
usbfs on /proc/bus/usb type usbfs (rw)
tmpfs on /dev/shm type tmpfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/md0 on /boot type ext3 (rw,errors=remount-ro)
/dev/mapper/volgroup-lvol0 on /var type ext3 (rw)
/dev/mapper/volgroup-lvol1 on /home type ext3 (rw)
tmpfs on /dev type tmpfs (rw,size=10M,mode=0755)


Below is the panic output:
-------------------------

Will now deactivate swap.
swapoff on /dev/md1
Done deactivating swap.
Will now unmount local filesystems.
/dev/mapper/volgroup-lvol0 umounted
/dev/mapper/volgroup-lvol1 umounted
Could not find /dev/.static/dev in mtab
/dev/.static/dev umounted
/dev/md0 umounted
Done unmounting local filesystems.
Shutting down LVM Volume Groups...
  0 logical volume(s) in volume group "volgroup" now active
Mounting root files              \|/ ____ \|/
 ystem read-only             "@'/ .. \`@"
 ..             /_| \__/ |_\
                 \__U_/
swapper(0): Kernel bad sw trap 5 [#1]
TSTATE: 0000000811f09603 TPC: 0000000000527c08 TNPC: 0000000000527c0c Y: 0000000
0    Not tainted
TPC: <U3memcpy+0x8/0x500>
g0: 0000000000000003 g1: 00000000000000a4 g2: 00000001ffffffff g3: fffffffffffff
f5c
g4: fffff8103f48a4a0 g5: fffff8001bf98000 g6: fffff8103eff8000 g7: fffff8103eca4
018
o0: fffff8103eca4538 o1: fffff8103ec5cc18 o2: fffffffffffffae0 o3: 0000000000000
000
o4: 0000000000000010 o5: 0000000000000000 sp: fffff8103effa831 ret_pc: 000000000
047a7fc
RPC: <cache_flusharray+0x5c/0xb4>
l0: ffffffffffffff5c l1: fffff8103ec5cc00 l2: fffff8103eca2ec0 l3: fffff8103eca2
ee0
l4: 0000000000000000 l5: 0000000000000001 l6: 0729000004000000 l7: 0000000000000
000
i0: fffff8103f4aa9c0 i1: fffff8103ecbd380 i2: fffff8103eca4000 i3: 0000000000000
000
i4: fffff8103f492c18 i5: 0000000000000000 i6: fffff8103effa8f1 i7: 000000000047a
318
I7: <kmem_cache_free+0x4c/0x7c>
Caller[000000000047a318]: kmem_cache_free+0x4c/0x7c
Caller[0000000000462d74]: mempool_free+0x88/0x98
Caller[0000000000482f00]: bio_put+0x40/0x50
Caller[000000000047f9c0]: end_bio_bh_io_sync+0x50/0x64
Caller[00000000004832a0]: bio_endio+0x7c/0x8c
Caller[00000000005f2c28]: raid_end_bio_io+0x34/0xa0
Caller[00000000005f5b20]: raid1_end_write_request+0x264/0x2d4
Caller[00000000004832a0]: bio_endio+0x7c/0x8c
Caller[000000000050f088]: __end_that_request_first+0x1a8/0x4f4
Caller[00000000005b5670]: scsi_end_request+0x18/0xe0
Caller[00000000005b5908]: scsi_io_completion+0x1d0/0x474
Caller[00000000005c903c]: sd_rw_intr+0x2d8/0x304
Caller[00000000005af848]: scsi_finish_command+0xbc/0xcc
Caller[000000000050e178]: blk_done_softirq+0x78/0x9c
Caller[0000000000444b70]: __do_softirq+0x4c/0xf8
Caller[0000000000444c58]: do_softirq+0x3c/0x50
Caller[0000000000408894]: tl0_irq4+0x14/0x20
Caller[000000000040f098]: cpu_idle+0x34/0x58
Caller[000000000042483c]: do_unlock+0x12c/0x140
Caller[0000000040000000]: 0x40000000
Instruction DUMP: 01000000  8532b01f  80a0a000 <93d03005> 98100008  80a2a000  02
600123  96120009  80a2a010
Kernel panic - not syncing: Aiee, killing interrupt handler!
TSTATE: 0000008080f09606 TPC: 0000000000672fcc TNPC: 0000000000672fc0 Y: 0000000
0    Not tainted
TPC: <_write_unlock_irq+0x7c/0x124>
g0: ffffffffffffffff g1: 00000000000000ff g2: 0000000000001000 g3: 0000000000000
197
g4: fffff8103ecb9a00 g5: fffff8001bfa0000 g6: fffff8103f470000 g7: 0000000000000
080
o0: fffff8103ecbd3cc o1: fffff8103ecbd400 o2: 0000000000000000 o3: 0000000000000
000
o4: fffff8103ee75018 o5: 0000000000000000 sp: fffff8103f473331 ret_pc: 000000000
047a618
RPC: <cache_reap+0x4c/0x1d4>
l0: 0000000000000001 l1: fffff8103f4aab28 l2: fffff8103ecbd3cc l3: 0000000000000
001
l4: 0000000000000000 l5: 0000000000000001 l6: 0000000000000000 l7: 00000000ffff3
eea
i0: fffff8103f4aa9c0 i1: fffff8103ecbd380 i2: 0000000000000001 i3: 0000000000000
008
i4: 0000000000000000 i5: 0000000000200200 i6: fffff8103f4733f1 i7: 0000000000450
370
I7: <run_workqueue+0xa8/0x108>
 <0>Press Stop-A (L1-A) to return to the boot prom
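
For anyone skimming the first trace, the Caller[...] lines can be boiled down to a plain list of function names, which makes it easier to see that the trap is hit in the slab free path while a RAID1 write completion (raid1_end_write_request) is being handled in softirq context. This is a rough sketch, assuming the trace has been saved as text; `caller_chain` is a hypothetical helper name, and the excerpt below is copied from the trace above.

```python
import re

# Matches lines such as "Caller[00000000004832a0]: bio_endio+0x7c/0x8c".
CALLER_RE = re.compile(
    r"^Caller\[[0-9a-fA-F]+\]:\s+(\S+?)\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+"
)

def caller_chain(trace_text):
    """Return the function names from the Caller[...] lines, innermost first."""
    names = []
    for line in trace_text.splitlines():
        match = CALLER_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names

# Excerpt from the panic output; the last line has no symbol and is skipped.
EXCERPT = """\
Caller[00000000004832a0]: bio_endio+0x7c/0x8c
Caller[00000000005f2c28]: raid_end_bio_io+0x34/0xa0
Caller[00000000005f5b20]: raid1_end_write_request+0x264/0x2d4
Caller[0000000040000000]: 0x40000000
"""

if __name__ == "__main__":
    for name in caller_chain(EXCERPT):
        print(name)
```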
