
Re: Software RAID and drive failures



nate wrote:
Juhan Kundla said:


I buy two smaller (and cheaper) IDE disks and use them in RAID-1 array. I
hope that this gives me good protection against hardware failures. If one
disk fails, then other will still have my data intact, right? The main
question is, that how good is the software RAID, when one drive is not
lost completely, but it starts to have more and more bad blocks? Will the
RAID-1 protect me from data corruption in that case? Any
comments?

I've experimented for a few weeks with software RAID (kernel 2.4 and raid code 0.9), making some systems crash, and when bad blocks showed up the system stopped working on that disk.
Using mdadm you can have a mail sent when the system detects a disk failure.
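
To illustrate the kind of check behind such a notification, here is a minimal Python sketch that looks for degraded arrays in the text of /proc/mdstat. It is not how mdadm itself works (mdadm has its own monitor mode and can send the mail for you); the function name degraded_arrays and the sample text are made up for illustration, and the sample is not captured from a real system.

```python
import re

# Illustrative /proc/mdstat contents (an assumption, not real output).
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 hdc1[1] hda1[0]
      19542976 blocks [2/2] [UU]
md1 : active raid1 hdc2[1](F) hda2[0]
      521984 blocks [2/1] [U_]
unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            # Start of a new array entry, e.g. "md0 : active raid1 ...".
            current = header.group(1)
            continue
        # The status field reads "[UU]" when all members are up and
        # contains "_" (e.g. "[U_]") when a member is missing or failed.
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded


if __name__ == "__main__":
    # On a real system you would read open("/proc/mdstat").read() instead.
    print(degraded_arrays(SAMPLE_MDSTAT))
```

Run from cron, a script like this could send a mail whenever the returned list is not empty.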

Sort of. I've only set up a couple of RAID-1 arrays with IDE disks that I can
think of using software RAID. One was using 2x20GB IBM 75GXP disks, the
most unreliable disks available at the moment (in my experience). About

Two years ago I bought three IBM Deskstar 45GB disks: so far two have crashed ;)

One of the points about software RAID is that it is software: if you use IDE drives, don't expect 99.999% uptime. For example, IDE disks are not hot-swappable, so if one disk fails badly it will make the system stop (I tried this by unplugging the power cable from a running disk).

So the biggest risk for data corruption is if the system crashes due
to a disk failure. I have had Linux systems hobble along with a failed
disk for weeks on end (the system gets increasingly loaded and less responsive).
I've even had a system run when the root disk failed (though logins
were impossible and no new processes would start). But that's the
price you pay. You will not eliminate the chance of data corruption or
data loss, even with hardware RAID. You can only reduce it to a
level where the chance of it happening is acceptable (maybe a 0.0001%
chance on some higher-end systems).

Agreed, but software RAID provides more protection than no RAID.
So far in my experience (four sw RAID production systems for 6 months) I've had no problems, especially no problems due to the use of sw RAID itself, and this is important (at least for me). In the past I administered 5 MS Windows clusters and had tons of problems with the cluster technology (hw and sw).

This is a good idea too. You can use rsync with the hard link option
to reduce the amount of space (& time) needed for storing multiple
backups of the same data on the same filesystem (I haven't tried this
myself but hear it's good).
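
As a rough sketch of the hard-link idea, the Python function below builds an rsync command line that uses --link-dest, so that files unchanged since the previous snapshot become hard links instead of new copies. That --link-dest is the option meant above is an assumption, and the function name, directory layout and snapshot names are invented for illustration.

```python
import os


def rsync_snapshot_command(source, backup_root, new_name, previous_name=None):
    """Build an rsync argument list for a hard-linked snapshot.

    Files unchanged since the previous snapshot are hard-linked into the
    new one, so each snapshot looks complete but only changed files use
    extra space. All paths and names here are illustrative assumptions.
    """
    cmd = ["rsync", "-a", "--delete"]
    if previous_name is not None:
        # Use an absolute path here: rsync interprets a relative
        # --link-dest path relative to the destination directory.
        cmd.append("--link-dest=" + os.path.join(backup_root, previous_name))
    # The trailing slash on the source means "copy the directory contents".
    cmd.append(source.rstrip("/") + "/")
    cmd.append(os.path.join(backup_root, new_name) + "/")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is modified.
    print(" ".join(rsync_snapshot_command(
        "/home", "/backup", "2004-02-02", previous_name="2004-02-01")))
```

The very first snapshot is made without previous_name and is a plain full copy.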

Even if you choose to use tar, cpio or whatever, perform incremental or (better) differential backups. A daily full backup sounds useless.
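
To make the difference concrete: a differential backup saves everything changed since the last full backup, while an incremental one saves only what changed since the last backup of any kind. A minimal Python sketch of that selection rule follows; the function name and the plain-number timestamps are assumptions for illustration, not how tar or cpio do it.

```python
def files_to_back_up(mtimes, last_full, last_backup, mode):
    """Select the files a differential or incremental run would save.

    mtimes maps each path to its modification time; last_full and
    last_backup are the times of the last full backup and of the most
    recent backup of any kind. All values use the same time unit.
    """
    if mode == "differential":
        # Everything changed since the last FULL backup.
        cutoff = last_full
    elif mode == "incremental":
        # Only what changed since the most recent backup of any kind.
        cutoff = last_backup
    else:
        raise ValueError("mode must be 'differential' or 'incremental'")
    return sorted(path for path, mtime in mtimes.items() if mtime > cutoff)


if __name__ == "__main__":
    mtimes = {"/etc/fstab": 5, "/home/a.txt": 15, "/home/b.txt": 25}
    print(files_to_back_up(mtimes, last_full=10, last_backup=20, mode="differential"))
    print(files_to_back_up(mtimes, last_full=10, last_backup=20, mode="incremental"))
```

A restore from differential backups needs only the last full backup plus the latest differential, which is why it is the more convenient of the two.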

I would avoid EVMS, or probably even LVM, for a really critical system;
the code isn't all that great (from what I've read). In fact the LVM
stuff in 2.4.x is being totally ripped out and replaced from scratch.
If you really need the functionality I guess it wouldn't hurt too
badly, but it's just one more thing that could go wrong on the box.

I have not tried EVMS yet. As for LVM, my systems use it and it's really useful:
- No need to plan file system sizes in advance.
- If you use reiserfs you can hot resize file systems (with kernel 2.6 this should also be available for ext2/ext3).
- File systems can span multiple disks.

Again I've had no problems so far.

IMHO one important thing: if you decide to use sw RAID or LVM, know what you are doing! You are installing a very particular system that will need particular care for recovery, along with a dedicated rescue disk.
You'll need to practice with system recovery before going to production.
Don't do such a thing lightly or you could find yourself in big trouble when it's recovery time.

I don't want to scare you. I advise the use of sw RAID and LVM, but again, know what you are doing.

There are lots of documents about sw RAID and LVM; two targeted at Debian are here:

http://www.midhgard.it/docs/index_en.html
http://karaolides.com/computing/HOWTO/lvmraid/lvmraid.html

Best regards
Massimiliano

--

Massimiliano Ferrero
Midhgard s.r.l.
C/so Re Umberto 23
10128 - Torino
tel. +39-0112301400 - fax +39-0112301422
e-mail: m.ferrero@midhgard.it
sito web: http://www.midhgard.it


