
Re: Problems with software RAID on SATA



Quoting Stephen Tait <tait@digitallaw.co.uk>:

I'm just in the process of setting up a Sarge server to be used as a sort of backup server. The main PATA discs are used to boot the OS off software RAID1, with the rest of the disc space used in JBOD for not-so-important backups. However, I'm having problems getting the new disc array up and running.

We've put a SATA controller in the box, a cheap-as-chips PCI Adaptec 1210SA which, according to lspci, uses the Silicon Image SI3112 chipset to provide two SATA channels. Connected to this are two 320GB drives which I want to turn into a RAID1 array. When the system first booted, I used mdadm to create the RAID1 array md2 (mdadm --create /dev/md2 --level=1 --raid-disks=2 /dev/sda1 /dev/sdb1), checked /proc/mdstat to wait for the array to finish syncing, and then formatted it ext3 and mounted it. Everything seemed to work fine until I rebooted, whereupon the mount failed with the report that it wasn't a valid ext[2|3] superblock; fsck confirmed this, and on further inspection it seemed that it wasn't a RAID device any more either.

...and booted with that instead after editing GRUB's menu.lst. The exact same error occurred, and I'm now at a bit of a loss to explain what's happening. If I try to mount the discs on their own (i.e. mount /dev/sdX /mnt/somedir) then they work just fine, so the hardware works fine - so I'm almost certain it's a problem with initialising the RAID arrays at boot. At the moment I'm just rebuilding the array to see what happens when I mount it only after the OS has finished booting rather than at boot, but of course that'll only be a temporary workaround. If it's any help, here are my fstab and mdadm.conf:

pika@zaphod2:~$ cat /etc/fstab
# /etc/fstab: static file system information.
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    defaults        0       0
/dev/md1        /               ext3    defaults,errors=remount-ro 0       1
/dev/md0        /boot           ext2    defaults        0       2
/dev/hdb9       /home           ext3    defaults        0       2
/dev/hdb4       /mnt/avj-backup ext3    defaults        0       2
/dev/hda7       /mnt/dcj-backup ext3    defaults        0       2
/dev/hdb8       /tmp            ext3    defaults        0       2
/dev/md4        /usr            ext3    defaults        0       2
/dev/md3        /var            ext3    defaults        0       2
/dev/hdb7       none            swap    sw              0       0
/dev/hdc        /media/cdrom0   iso9660 ro,user,noauto  0       0
#/dev/md2       /mnt/dcj-archive        ext3    defaults        0       2

===============================================

pika@zaphod2:~$ cat /etc/mdadm/mdadm.conf
DEVICE partitions
ARRAY /dev/md4 level=raid1 num-devices=2 UUID=b8093124:a6d6f876:a29eecb7:e1b332f3
   devices=/dev/hda6,/dev/hdb6
ARRAY /dev/md3 level=raid1 num-devices=2 UUID=1973b0c3:e38869d2:ffef0cde:92048042
   devices=/dev/hda5,/dev/hdb5
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=78a3be5a:f0838fe2:4d4ce7ed:3a969954
   devices=/dev/sda1,/dev/sdb1
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=51d55d28:3e653dce:631dd682:8dd52a37
   devices=/dev/hda2,/dev/hdb2
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=56e09876:a751356e:b86535d0:95091b5b
   devices=/dev/hda1,/dev/hdb1

As you can see, most of the important directories are mounted in software RAID1 on the two PATA discs with unimportant stuff on JBOD, although of course this shouldn't make any difference. All the usual dmesg etc. stuff doesn't seem to tell me anything I don't already know. If anyone has experienced this before or has any pointers as to how I can troubleshoot it, I'd be much obliged!

I have had some trouble getting a RAID array to initialize on boot in the past.
My fix was to remove its entry from the mdadm.conf file and re-cfdisk the disks with the auto-detect-raid setting. Then I created the RAID array and rebooted, and it came up just fine.
Other than that, I'm not sure what else could be wrong.
Hopefully someone else on the list has some better ideas.
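
For what it's worth, here is a rough sketch of those steps in shell. It is only an illustration under a few assumptions: strip_array_entry is a made-up helper name, I am reading "auto-detect-raid setting" as partition type fd (Linux raid autodetect), and the device names are simply the ones from your mdadm.conf. The destructive steps are left as comments on purpose.

```shell
# Hypothetical helper: print the config file without the ARRAY stanza for
# one md device. A stanza is the "ARRAY <dev> ..." line plus the indented
# continuation lines that follow it (e.g. "   devices=...").
strip_array_entry() {
    conf=$1
    dev=$2
    awk -v dev="$dev" '
        $1 == "ARRAY" { skip = ($2 == dev) }      # a new stanza starts here
        /^[^ \t]/ && $1 != "ARRAY" { skip = 0 }   # any other top-level line ends it
        !skip                                     # print everything not being skipped
    ' "$conf"
}

# Possible usage, run as root and only after checking the output by eye:
#   strip_array_entry /etc/mdadm/mdadm.conf /dev/md2 > /tmp/mdadm.conf.new
#   cp /etc/mdadm/mdadm.conf /etc/mdadm/mdadm.conf.bak
#   cp /tmp/mdadm.conf.new /etc/mdadm/mdadm.conf
#
# Then set the partition type of /dev/sda1 and /dev/sdb1 to fd in cfdisk
# (interactive), recreate the array with the command from the original post,
# and reboot:
#   mdadm --create /dev/md2 --level=1 --raid-disks=2 /dev/sda1 /dev/sdb1
```

The helper only writes to standard output, so nothing in /etc changes until you copy the result over yourself.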

Cheers,
Mike


