
Re: sarge freezes after failure of raid disk, incurring fs corruption on unrelated disk



Douglas Allan Tutty wrote:
> Personally, I run something like samhain, not so much to check for
> intrusion as to monitor data integrity.
> 
> I wonder if the failed /dev/hdb took out the controller (ide0) and so /dev/hda
> got corrupted.

No idea how to figure that out.
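For the data-integrity side of this, a checksum manifest is one simple way to see which files on a disk have changed since a known-good state. The following is only a minimal sketch of that general idea, not how samhain itself works; the function names (`file_sha256`, `build_manifest`, `compare_manifests`) are hypothetical and chosen for illustration. It would not tell you whether the controller was at fault, only which files differ.

```python
import hashlib
import os


def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file under root (by relative path) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; hash regular files only.
            if os.path.isfile(full) and not os.path.islink(full):
                manifest[os.path.relpath(full, root)] = file_sha256(full)
    return manifest


def compare_manifests(old, new):
    """Return (changed, missing, added) relative paths, each sorted."""
    changed = sorted(p for p in old if p in new and old[p] != new[p])
    missing = sorted(p for p in old if p not in new)
    added = sorted(p for p in new if p not in old)
    return changed, missing, added
```

A manifest built while the system is known to be healthy (and stored on another disk) can later be compared against a fresh one; files listed as changed that nobody modified are candidates for corruption.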

> It's too bad that your system (as opposed to data) wasn't also protected
> by raid1.
> 
> If it were me and I had solid backups and could afford the downtime to
> reinstall, I'd reinstall.  Etch.  LVM on Raid1 for the system at least.
> And I'd only have one drive on each controller channel.  E.g. no hdb or
> hdd unless it's for CD/DVD or something.

Ironically, I already have some new hardware with higher capacity that
will provide that functionality. I plan to migrate the system to the
new hardware as soon as I have some time (and upgrade to etch in the
process).

> Before a total reinstall, I'd really stress-test the ide0 controller to
> ensure that it wasn't damaged.
> 
> Then again, I'm paranoid.  I've only had one drive failure (bearing
> seize, hard head crash).  Pulled the plug within 10 seconds.  No other
> damage.

Maybe I should have pulled the plug. But then it might have been more
difficult to spot the failed drive.

> I really hope you have good backups.

I do, for all the data. I also have backups of /etc/ and of the package
selection information, so a reinstall shouldn't require too much downtime.

Thanks for your advice.

Johannes


