
Re: mdadm and fsck



18/09/2011 05:04, Joey L wrote:
> Thanks for the info and quick reply.
> 
> My /dev/md0 has rebuilt itself and is okay.
> My /dev/md1 is gone after a reboot - and is no longer showing -
> I get nothing when i run:
> 
> root@rider:~# mdadm --detail /dev/md1
> mdadm: cannot open /dev/md1: No such file or directory

You could examine each member of the array to see whether the metadata
is still there or whether everything has vanished. For example:
"mdadm --examine /dev/sdd1"

/dev/md1 not being found doesn't mean everything is lost. It just means
the array wasn't detected or assembled at boot time, so udev didn't
create the /dev/md? device for it. The array can perhaps be brought back
manually (at least in degraded mode). You are using 1.2 metadata on your
arrays; is your mdadm.conf up to date?
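
A rough sketch of that mdadm.conf check, under a couple of assumptions:
the config file is at /etc/mdadm/mdadm.conf (the usual Debian location),
and conf_lists_uuid is just a helper name I made up. It tests whether the
Array UUID reported by "--examine" appears in an ARRAY line of the file:

```shell
# Hypothetical helper: succeeds if the given array UUID appears in an
# ARRAY line of the given mdadm.conf file.
conf_lists_uuid() {
    uuid=$1
    conf=${2:-/etc/mdadm/mdadm.conf}   # assumed Debian path, adjust as needed
    grep -i '^[[:space:]]*ARRAY' "$conf" | grep -qiF "UUID=$uuid"
}

# Example (as root), with the UUID copied from "mdadm --examine /dev/sdc1":
#   conf_lists_uuid "<array uuid>" && echo listed || echo "missing from mdadm.conf"
```

If the array turns out to be missing, "mdadm --examine --scan" prints
ARRAY lines in mdadm.conf format that you can review before adding them.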

> 
> 
> Questions for you:
> 1. How do I figure out which drive that made up /dev/md1  DRIVES:
> /dev/sdc or /dev/sdc1 and /dev/sdd or /dev/sdd1  is okay to be the
> master where I would rebuild the second ??

There is no "master" in a raid1 array, otherwise the data wouldn't be
truly redundant. A drive becomes a temporary "gold master" only when it
is alone in the array (the other one having failed). For your "md1"
array, sdc1 is the last known working member, so that is what you should
use to try to start and rebuild the array.

When you "--examine" a member of an array, you'll be shown the UUID of
the array it belongs to and the device UUID (if the metadata is not
gone). Personally, after creating a raid I always take careful note of
this information (outside of the array) in case I need to do a "manual"
intervention. If you need to boot from a live CD to recover a damaged
system, there is no guarantee that the array names or member device
names will match what they were on the working system. But the UUIDs
will.
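
As a minimal sketch of that note-taking (extract_uuids is a name I made
up, and it assumes the "Array UUID" / "Device UUID" labels that
"--examine" prints for 1.x metadata), you can filter the output down to
just the two UUID lines:

```shell
# Hypothetical helper: keep only the UUID lines from "mdadm --examine"
# output read on stdin (1.x metadata prints "Array UUID" and "Device UUID").
extract_uuids() {
    sed -n -e 's/^[[:space:]]*\(Array UUID\)[[:space:]]*:[[:space:]]*/\1: /p' \
           -e 's/^[[:space:]]*\(Device UUID\)[[:space:]]*:[[:space:]]*/\1: /p'
}

# Example (as root); the notes path is a placeholder and should live
# somewhere outside the array:
#   mdadm --examine /dev/sdc1 | extract_uuids >> /path/outside/array/raid-notes.txt
```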

And don't mix up /dev/sd? (the physical disk device) with /dev/sd?? (the
partition which is included in the array). There could be only one
partition on the disk, but not necessarily, or the raid could be a
partitioned one. You have no business working at the physical device
level to troubleshoot a raid array; only if you want to image a damaged
device for later forensic work do you need to work at disk level.
Don't take my word for it, try:
mdadm --examine /dev/sdc
You will see no mdadm raid metadata on it.

> 
> 2. How do I "zero superblocks" a drive ?? I know how to fail, remove and
> add to a raid device.

mdadm --zero-superblock /dev/sdc1

You have to fail and remove the drive from the array first. You must do
this when you break an array and want to reuse its members later in a
new array. You should read "man mdadm" carefully. Unfortunately the raid
wiki seems to be down at the moment due to the kernel.org problems; I
hope it's back online soon.
https://www.raid.wiki.kernel.org/index.php/Linux_Raid
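
The fail / remove / zero sequence could be sketched like this. It is only
an illustration: run and retire_member are names I made up, and by
default the script only prints the commands instead of running them (set
DRY_RUN=0 to really execute, as root, once you are sure of the device
names).

```shell
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set (then runs it).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: retire one member from an array and wipe its md metadata.
retire_member() {
    md=$1; member=$2
    run mdadm "$md" --fail "$member"        # mark the member faulty
    run mdadm "$md" --remove "$member"      # detach it from the array
    run mdadm --zero-superblock "$member"   # erase the md superblock on it
}

# Example: print what would be run for sdd1 in md1 (nothing is executed):
#   retire_member /dev/md1 /dev/sdd1
```

If the array is not assembled at all, the member is not in use and only
the last step applies.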

> 
> 3. Should I use /dev/sdd1 as the master ?? the system marked it as a
> spare before.

/dev/sdc1 is the last known working member of the array, so I would
start with it. But you can try unplugging one device or the other to
prevent any further damage to it, then try to start the array in
degraded mode from either sdd1 or sdc1. You'll see what works and what
data you can find on it.
Be aware that if you restart the array and it is rebuilt from one
member, the data on the other will effectively be wiped out. The
"--assume-clean" flag could prevent that, but then you'll have no
guarantee that the data on the array is consistent. My advice is to work
on a degraded array with only one member for now.
You can try to force the assembly of a degraded or "dirty" array at
boot time with the boot option "md-mod.start_dirty_degraded=1" (see
"modinfo md_mod"), but don't do that with the two members plugged in: it
could damage the array further and compromise your chances of recovering
from one of the members. For recovery it is far better practice to boot
without assembling the array, then assemble and run it manually as
needed.
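
A manual degraded assembly from a single member might look like the
sketch below. The helper names (run, assemble_degraded) are mine, the
device names are the ones from this thread, and by default the commands
are only printed; set DRY_RUN=0 to really run them as root.

```shell
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set (then runs it).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: assemble an array from one member and show its state.
assemble_degraded() {
    md=$1; member=$2
    # --run asks mdadm to start the array even though members are missing
    run mdadm --assemble --run "$md" "$member"
    run cat /proc/mdstat
    run mdadm --detail "$md"
}

# Example: print the commands for starting md1 from sdc1 alone:
#   assemble_degraded /dev/md1 /dev/sdc1
```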

> thanks
> mjh
> 
> 
> 
> 

Hope it helps.

