
Re: RAID Questions




On 06/07/2010 16:16, Kent West wrote:
> I am a RAID newb.
>
> My goal is to have a redundant Debian (Stable) system, such that the
> second drive is a mirror of the first drive. I would think RAID1 would
> be the route to go.
>
> However, being a RAID newbie, I'm running into all sorts of problems,
> not least of which is that I simply don't understand some of the basic
> concepts.

Well, perhaps you should read up on the basic concepts.  I'd start with:

http://en.wikipedia.org/wiki/RAID
http://en.wikipedia.org/wiki/RAID_1

Then, for Linux (software) RAID:
https://raid.wiki.kernel.org/index.php/Linux_Raid
or in a somewhat more readable (but older) form: http://tldp.org/HOWTO/Software-RAID-HOWTO.html

Then, for Debian RAID: Read the sections on RAID in the Debian installation manual.

Basic summary of steps:

1.  start installer, go through initial steps (keyboard, network, etc.)

2.  start up disk partitioner
-- create partitions - here's what I go with (on servers) - do this for each drive (personally, I find it easier to do this with fdisk)
---- part1 boot primary Linux RAID 2G
---- part2         primary Linux RAID 3G for swap
---- part3         primary Linux RAID <rest of the space> for root
-- set up RAIDs
---- /dev/md0 - 2G - ext3 (or ext4) file system, mount point is /boot
---- /dev/md1 - 3G - swap
---- /dev/md2 - <REMAINDER OF DRIVE> - ext3(4) file system, mount point /
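For reference, the arrays above correspond roughly to the following mdadm commands if you ever build them by hand instead of through the installer's partitioner. Treat this as a sketch only: the drive names /dev/sda and /dev/sdb are my assumption, and the script just prints the commands unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: build the three RAID1 arrays from step 2 by hand.
# ASSUMPTION: the two drives are /dev/sda and /dev/sdb, partitioned identically.
DISK_A=${DISK_A:-/dev/sda}
DISK_B=${DISK_B:-/dev/sdb}
DRY_RUN=${DRY_RUN:-1}

# Print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

make_arrays() {
    n=0
    for part in 1 2 3; do
        # Partition N of each drive becomes one mirrored array.
        run mdadm --create "/dev/md$n" --level=1 --raid-devices=2 \
            "${DISK_A}${part}" "${DISK_B}${part}"
        n=$((n + 1))
    done
    run mkfs.ext3 /dev/md0   # /boot
    run mkswap /dev/md1      # swap
    run mkfs.ext3 /dev/md2   # /
}

make_arrays
```

Running it as-is is harmless and shows what would be done; these commands wipe whatever is on the partitions, so read the output before setting DRY_RUN=0.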

3. rest of the installation procedure

4. after you've rebooted and are into your newly set up system:

Make sure that grub is installed on the MBRs of both drives, so the machine can still boot if the first drive fails (do some googling on boot +RAID1 to understand, and also read up on grub):

shell> grub-install hd0
shell> grub-install hd1

and then edit /boot/grub/menu.lst:

#add these to boot off your second drive if the first fails
default     0
fallback <n> -- n depends on where you duplicate your boot clause

#duplicate your first boot clause, but change the line "root (hd0,0)" to "root (hd1,0)"

##test this - reboot, TAB to get into the boot screen, select the fallback boot, see if it works
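The duplicated boot clause can be generated rather than typed. A minimal sketch, assuming a GRUB legacy stanza whose root line reads "root (hd0,N)"; the function name and the example stanza (including the <version> placeholders) are made up for illustration, so feed it the real stanza from your own menu.lst.

```shell
#!/bin/sh
# Sketch: print a copy of a boot stanza that points at the second drive.
# Reads a stanza on stdin, rewrites "root (hd0,N)" to "root (hd1,N)",
# and tags the title so the two entries can be told apart in the menu.
fallback_stanza() {
    sed -e 's/^\(root[[:space:]]*\)(hd0,/\1(hd1,/' \
        -e 's/^\(title.*\)$/\1 (fallback, second drive)/'
}

# Example input. ASSUMPTION: the title and kernel version are placeholders;
# copy the real stanza from your own /boot/grub/menu.lst instead.
fallback_stanza <<'EOF'
title   Debian GNU/Linux, kernel <version>
root    (hd0,0)
kernel  /vmlinuz-<version> root=/dev/md2 ro
initrd  /initrd.img-<version>
EOF
```

GRUB legacy numbers menu entries from 0, so if the copy ends up as the second entry in menu.lst, the matching line is "fallback 1".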

5. One serious gotcha to watch out for, down the road: if one drive starts failing, the first symptom is often long delays in access time (the drive's internal firmware keeps retrying the read, and if it eventually succeeds, it returns the data). Unfortunately, Linux software RAID treats this as perfectly ok behavior - it will keep the drive in the array. But... your entire machine will slow to a crawl. Confusing as hell, until you realize what's going on. (On some laptops, a shorted battery causes the same symptom.)
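One rough way to spot such a drive from the shell, without extra tools, is to sample /proc/diskstats twice and see how much of the interval each disk spent busy. This is only a sketch: the function names are mine, and sda/sdb in the usage comment are assumed device names.

```shell
#!/bin/sh
# Sketch: estimate how busy a disk is from two samples of /proc/diskstats.
# Column 13 of /proc/diskstats is the total time (ms) the device spent doing I/O.

# io_ticks DEVICE -> prints the cumulative busy time in ms
io_ticks() {
    awk -v dev="$1" '$3 == dev { print $13 }' /proc/diskstats
}

# busy_pct BEFORE_MS AFTER_MS INTERVAL_MS -> prints an integer percentage
busy_pct() {
    echo $(( ($2 - $1) * 100 / $3 ))
}

# sample_disk DEVICE SECONDS -> prints a line of the form "DEVICE NN% busy"
sample_disk() {
    before=$(io_ticks "$1")
    sleep "$2"
    after=$(io_ticks "$1")
    echo "$1 $(busy_pct "$before" "$after" $(( $2 * 1000 )))% busy"
}

# Usage (ASSUMPTION: the array members are sda and sdb):
#   sample_disk sda 5
#   sample_disk sdb 5
```

If one member of the mirror sits near 100% while the other is mostly idle under the same load, that's the drive to look at more closely.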

Some things to do:

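- set up smartmontools, keep an eye on the Raw_Read_Error_Rate value - if it's anything other than 0, start worrying (some drive vendors report large raw values here even on healthy drives, so learn what's normal for yours)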

- install something like atop (or precisely like atop) - if it shows that one of your drives is near 100% busy, it's probably failing

- if a drive is failing, use mdadm to "fail" the drive - things will start performing a lot better; then replace the drive

- be VERY careful during recovery, it's pretty easy to destroy the good copy of your data (I learned this the hard way, the first time I ran into this particular failure mode)

- as someone pointed out, RAID is not a substitute for backup - the first time I had to rebuild from a disk crash, despite RAID, I was glad all my user data was backed up. Either set up an external drive or subscribe to something like CrashPlan.
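To make the SMART check and the "fail the drive" step above a bit more concrete, here's a small sketch. It assumes smartmontools and mdadm are installed; the function names are made up, the device and array names are examples only, and the second function deliberately only prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch: check one SMART attribute and print the usual mdadm replacement steps.
# ASSUMPTIONS: smartmontools and mdadm are installed; the device and array
# names in the usage comment are examples only.

# raw_read_errors < output of `smartctl -A /dev/sdX`
# Prints the raw value (column 10) of the Raw_Read_Error_Rate attribute.
raw_read_errors() {
    awk '$2 == "Raw_Read_Error_Rate" { print $10 }'
}

# replace_member ARRAY FAILING_PARTITION NEW_PARTITION
# Only prints the commands. Double-check which device is the bad one before
# typing them in; recovery is where the good copy tends to get destroyed.
replace_member() {
    echo "mdadm $1 --fail $2"
    echo "mdadm $1 --remove $2"
    echo "# ...swap the drive and partition it like the surviving one, then:"
    echo "mdadm $1 --add $3"
    echo "cat /proc/mdstat   # watch the resync"
}

# Usage:
#   smartctl -A /dev/sda | raw_read_errors
#   replace_member /dev/md2 /dev/sdb3 /dev/sdb3
```

With the layout from step 2, a failing drive is a member of all three arrays, so the same sequence has to be repeated for md0, md1 and md2.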

Miles Fidelman


--
In theory, there is no difference between theory and practice.
In<fnord>  practice, there is.   .... Yogi Berra


