Bug#567468: md homehost (was: Bug#567468: (boot time consequences of) Linux mdadm superblock) question.
On Tue, Feb 23, 2010 at 4:10 PM, Neil Brown <neilb@suse.de> wrote:
> On Tue, 23 Feb 2010 07:27:00 +0100
> martin f krafft <madduck@madduck.net> wrote:
>
>> also sprach Neil Brown <neilb@suse.de> [2010.02.23.0330 +0100]:
>> > The problem to protect against is any consequence of rearranging
>> > devices while the host is off, including attaching devices that
>> > previously were attached to a different computer.
>>
>> How often does this happen, and how grave/dangerous are the effects?
>
> a/ no idea.
> b/ it all depends...
> It is the sort of thing that happens when something has just gone
> drastically wrong and you need to stitch things back together again as
> quickly as you can. You aren't exactly panicking, but you are probably
> hasty and don't want anything else to go wrong.
>
> If the array from the 'other' machine with the same name has very different
> content, then things could go wrong in various different ways if we
> depended on that name.
> It is true that the admin would have to be physically present and could
> presumably get a console and 'fix' things. But it would be best if they
> didn't have to. They may not even know clearly what to do to 'fix' things
> - because it always worked perfectly before, but this time when in a
> particular hurry, something strange goes wrong. I've been there, I
> don't want to inflict it on others.
>
>>
>> > But if '/' is mounted by a name in /dev/md/, I want to be sure
>> > mdadm puts the correct array at that name no matter what other
>> > arrays might be visible.
>>
>> Of course it would be nice if this happened, but wouldn't it be
>> acceptable to assume that if someone swaps drives between machines
>> that they ought to know how to deal with the consequences, or at
>> least be ready to take additional steps to make sure the system still
>> boots as desired?
>
> No. We cannot assume that an average sys-admin will have a deep knowledge of
> md and mdadm. Many do, many don't. But in either case the behaviour must be
> predictable.
> After all, Debian is for "when you have better things to do than fixing
> systems"
>
>>
>> Even if the wrong array appeared as /dev/md0 and was mounted as root
>> device, is there any actual problem, other than inconvenience?
>> Remember that the person who has previously swapped the drives is
>> physically in front of (or behind ;)) the machine.
>>
>> I am unconvinced. I think we should definitely switch to using
>> filesystem-UUIDs over device names, and that is the only real
>> solution to the problem, no?
>>
>
> What exactly are you unconvinced of?
> I agree completely that mounting filesystems by UUID is the right way to go.
> (I also happen to think that assembling md arrays by UUID is the right way to
> go too, but while people seem happy to put fs uuids in /etc/fstab, they seem
> less happy to put md uuids in /etc/mdadm.conf).
>
> As you say in another email:
>
>> The only issue homehost protects against, I think, is machines that
>> use /dev/md0 directly from grub.conf or fstab.
>
> That is exactly correct. If no code or config file depends on a name like
> /dev/mdX or /dev/md/foo, then you don't need to be concerned about the whole
> homehost thing.
> You can either mount by fs-uuid, or mount e.g.
> /dev/disk/by-id/md-uuid-8fd0af3f:4fbb94ea:12cc2127:f9855db5
>
>
> NeilBrown
>
Would it be permissible to add a third case?
If an array has no entry in mdadm.conf AND its homehost does not
match, ask on the standard console what to do, with something like a
30 second timeout, and be noisy in the kernel log so the admin knows
why boot was slow.
Really there should probably be two questions: 1) Do you want to run
this array? 2) What name do you want? (The defaults would be yes and
the currently chosen alternate name pattern.)
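
To make the idea concrete, here is a rough sketch of the decision logic
I have in mind. It is Python purely for illustration and is not mdadm
code: the names decide, Decision and read_line_with_timeout are made up,
the log callable only stands in for the kernel log, and fallback_name
stands in for whatever alternate name would currently be chosen. The
console helper assumes a POSIX system where select() works on stdin.

```python
import select
import sys
from dataclasses import dataclass
from typing import Callable, Optional

PROMPT_TIMEOUT_SECONDS = 30  # "something like a 30 second timeout"


@dataclass
class Decision:
    assemble: bool  # should the array be started at all?
    name: str       # name to assemble it under (empty if not assembled)
    reason: str     # why this choice was made, for the log


def read_line_with_timeout(prompt: str, timeout: float) -> Optional[str]:
    """Read one console line; return None if nothing arrives in time."""
    sys.stdout.write(prompt)
    sys.stdout.flush()
    ready, _, _ = select.select([sys.stdin], [], [], timeout)
    if not ready:
        return None
    return sys.stdin.readline().strip()


def decide(
    in_conf: bool,
    homehost_matches: bool,
    preferred_name: str,
    fallback_name: str,
    ask: Callable[[str, float], Optional[str]] = read_line_with_timeout,
    log: Callable[[str], None] = print,
) -> Decision:
    # Case 1: the array is listed in mdadm.conf, so trust the config.
    if in_conf:
        return Decision(True, preferred_name, "listed in mdadm.conf")
    # Case 2: not listed, but the homehost matches this machine.
    if homehost_matches:
        return Decision(True, preferred_name, "homehost matches")

    # Case 3 (the proposal): neither holds, so ask, and say why boot is slow.
    log("md: array not in mdadm.conf and homehost does not match; "
        f"asking on console ({PROMPT_TIMEOUT_SECONDS}s timeout)")

    answer = ask("Run this array? [Y/n] ", PROMPT_TIMEOUT_SECONDS)
    if answer is None:
        # Nobody answered: take both defaults rather than wait a second time.
        return Decision(True, fallback_name, "prompt timed out, defaults used")
    if answer.lower() in ("n", "no"):
        return Decision(False, "", "declined on console")

    name = ask(f"Name for the array [{fallback_name}]: ", PROMPT_TIMEOUT_SECONDS)
    # An empty answer or a timeout on the second question keeps the default.
    return Decision(True, name or fallback_name, "confirmed on console")
```

In this sketch a timeout on the first question takes both defaults at
once, so an unattended boot is delayed by at most one timeout rather
than two.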