
Re: Replacing failed drive in software RAID



Hello,

On Thu, Oct 31, 2013 at 05:06:33PM -0500, Stan Hoeppner wrote:
> On 10/31/2013 3:41 PM, Bob Proulx wrote:
> > Is this a BIOS boot ordering boot system booting from sda?  In which
> > case replacing sda won't have an MBR to boot from.  You can probably
> > use your BIOS boot to select a different disk to boot from.  And then
> > after having booted install grub on the other disk.  (Sometimes the
> > BIOS boot order will be quite different from the Linux kernel drive
> > ordering.)

I think it is BIOS boot ordering. I don't remember how I installed the MBR. Is
it even possible to have an MBR on both sda and sdb? I think I was considering
that option but can't remember whether I did it.

> > I am unfamiliar with the sgdisk backup and load-backup operation.  I
> > am not sure that will restore the grub boot sector.  This isn't too
> > scary because you can always boot one of the other drives.  Or boot a
> > debian-install rescue media.  But after setting up the replacement
> > disk it will probably be necessary to install grub upon it in order
> > for it to be bootable as the first BIOS boot media.

I don't know either, but in case the boot sector is not copied, could I just
copy the first 446 bytes? That is where the MBR boot code is located, so the
partition table that follows it would not be touched. Could something like
this work:

dd if=/dev/sdb of=/dev/sda bs=446 count=1

This could be done from a live CD after the drive is replaced, in case it
won't boot.
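
To be safe, the copy could be rehearsed on scratch image files before touching
real disks. This is only a sketch under assumptions: src.img and dst.img are
hypothetical stand-ins for /dev/sdb and /dev/sda, and it checks nothing about
whether the copied boot code would actually boot.

```shell
#!/bin/sh
# Rehearse the 446-byte boot-code copy on image files, not real disks.
# src.img / dst.img are hypothetical stand-ins for /dev/sdb and /dev/sda.
set -eu
work=$(mktemp -d)
src="$work/src.img"
dst="$work/dst.img"

# Two 1 MiB images filled with (different) random data.
dd if=/dev/urandom of="$src" bs=512 count=2048 2>/dev/null
dd if=/dev/urandom of="$dst" bs=512 count=2048 2>/dev/null

# Save the destination's whole first sector before changing anything.
dd if="$dst" of="$work/dst-sector0.bak" bs=512 count=1 2>/dev/null

# Copy only the first 446 bytes. conv=notrunc stops dd from truncating
# the image file; it is harmless when the target is a block device.
dd if="$src" of="$dst" bs=446 count=1 conv=notrunc 2>/dev/null

# Check 1: the first 446 bytes of the destination now match the source.
head -c 446 "$src" > "$work/src.boot"
head -c 446 "$dst" > "$work/dst.boot"
cmp -s "$work/src.boot" "$work/dst.boot" && boot_copied=yes || boot_copied=no

# Check 2: bytes 446..511 (partition table and signature) are unchanged.
dd if="$dst" of="$work/dst.tail" bs=1 skip=446 count=66 2>/dev/null
dd if="$work/dst-sector0.bak" of="$work/bak.tail" bs=1 skip=446 count=66 2>/dev/null
cmp -s "$work/dst.tail" "$work/bak.tail" && table_intact=yes || table_intact=no

# Check 3: the image was not truncated by the copy.
dst_size=$(wc -c < "$dst")

echo "boot code copied: $boot_copied"
echo "partition table area intact: $table_intact"
echo "destination size: $dst_size bytes"
```

On a real disk the same idea would apply with the device names in place of the
image files, and the saved first sector gives a way back if something looks
wrong afterwards.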

> > And very often I have found that a second disk that I thought should
> > have had grub installed upon it did not and when removing sda I find
> > that the system won't grub boot from sdb.  Therefore I normally
> > restore sda, boot, install grub on sdb, then try again.  But if you
> > know ahead of time you can re-install grub on sdb and avoid the
> > possible hiccup there.  But if you are concerned about writes to sdb
> > then I would simply plan to boot from the debian-installer image in
> > rescue mode, assemble the raid, sync, then replace sdb, and repeat.
> > You can always install grub to the boot sectors after replacing the
> > suspect disks.  Hopefully this makes sense.
> 
> This is precisely why I use hardware RAID HBAs for boot disks (and most
> often for data disks as well).  The HBA's BIOS makes booting transparent
> after drive failure.  In addition you only have one array (hardware)
> instead of 3 (mdraid).  You have only 3 partitions to create instead of
> 9, these residing on top of the one array device, not used to build
> multiple software array devices.  So you have one /boot, root fs, and
> data, and only one MBR to maintain.  The RAID controller literally turns
> your 4 drives into one, unlike soft RAID.
> 
> The 4 port Adaptec is cheap, <$200 USD, and a perfect fit for 4 drives:
> http://www.adaptec.com/en-us/products/series/6e/
> http://www.newegg.com/Product/Product.aspx?Item=N82E16816103229
> 
> And because it has 128MB cache you get a small performance boost.

Indeed, hardware RAID makes life much simpler. I just contacted the guy from
the company we bought those drives from to check if he can find an Adaptec
from the 6E series. Not sure if the boss is willing to buy one, but it's worth
a try.


> >> I was also thinking about inserting one drive and copying data from
> >> RAID to it so I have backup if something goes wrong. Would that be
> >> right thing to do, or that would just load drives unnecessarily and
> >> accelerate their failure?
> > 
> > Are you asking about the one drive inserted being large enough to do a
> > full system backup?  If so then I think it is hard to argue against a
> > full backup.  I think I would do the full backup even with the extra
> > disk activity.  It is read, not write, and so not as bad as normal
> > read-write disk activity.
> 
> Agreed.

This is what I'm doing now. I inserted a 2TB drive to make the backup, but
after booting the machine I noticed that my data is 2.3TB in size. :( Since
most of the data is rsnapshot backups, I will copy just the newest one so I
have something if anything happens. I can't do a full backup.

One question: since most of my data is hard links, what would happen if I
just use
# cp -a /data /mnt/newdrive

Would this command copy every file more than once (every hard link as a
separate file), or would the "-d" implied by "-a" copy them as hard links on
the destination file system?

Could this be done with Midnight Commander too?
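
The cp question can be checked on a scratch directory before running it on the
real data. A minimal sketch, assuming GNU coreutils cp (where "-a" includes
"-d", which preserves links); the daily.0 and daily.1 directory names are made
up for the test and only imitate an rsnapshot layout.

```shell
#!/bin/sh
# Check whether "cp -a" keeps hard links as hard links on the destination.
# Assumes GNU coreutils cp; directory names are hypothetical.
set -eu
work=$(mktemp -d)
mkdir -p "$work/data/daily.0" "$work/data/daily.1" "$work/newdrive"

# One file, hard-linked into a second "snapshot" directory.
echo "payload" > "$work/data/daily.0/file"
ln "$work/data/daily.0/file" "$work/data/daily.1/file"

# Copy the whole tree in a single cp invocation.
cp -a "$work/data" "$work/newdrive/"

# Two names that share an inode number are hard links to the same file.
ino0=$(ls -i "$work/newdrive/data/daily.0/file" | awk '{print $1}')
ino1=$(ls -i "$work/newdrive/data/daily.1/file" | awk '{print $1}')

if [ "$ino0" = "$ino1" ]; then
    links_preserved=yes
else
    links_preserved=no
fi
echo "hard links preserved: $links_preserved"
```

Note that links can only be kept between files copied in the same cp run.
Copying each snapshot directory with a separate command would store every
file again, while copying only the newest snapshot stores each file once.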

> 
> > In which case you might consider that instead of replacing all disks
> > one by one that you could simply do a full backup, then create the new
> > system with lvm and raid as desired, and then restore the backup onto
> > the newly constructed partitions.  After you have the full backup then
> > your original drives would be shut off and available as a backup image
> > too in that case.  So that also seems a very safe operation.
> 
> This is my preferred method.  Cleaner, simpler.  Still not as simple as
> moving to hardware RAID though.

The problem is that I don't have another 3TB drive to do a full backup. Also,
this method requires another 4 SATA ports, which I don't have, and maybe a PSU
that can support 8 drives (this one is 500W, so maybe that's enough).

> > Or since you have four new drives go ahead and construct a new base
> > configuration with the four new drives with lvm+raid as desired.  And
> > then clone directly from the old system disks to the new system
> > disks.  Then boot the new system disks.  This has much more offline
> > time than the replace one disk at a time that you outlined above.  I
> > normally do the sync one disk at a time since the system is online and
> > running services normally during the sync.  But there are many ways to
> > accomplish the task.
> 
> And yes there is more down time with this method.

Again, the problem is SATA ports. This needs more than I have.


The sdd drive just failed and now I'm running without it. The good thing is
that its mirror pair, sdc, is the only drive SeaTools didn't find faulty. At
least, I think that's the way it works: sda mirrors to sdb, sdc to sdd, and
the first pair is striped with the other one.


Thanks for your advice and the time you took to review my problem. Much
appreciated.

Regards,
Veljko
 

