
Re: question about fstab in squeeze and uuid



Stephen Powell wrote:
On Sat, 13 Mar 2010 00:26:58 -0500 (EST), Paul E Condon wrote:
A bit worrisome to me. UUID must be persistent during normal life of a
device, so it can be used as an identifier.

It is important to distinguish between a device and a partition.
/dev/hda is a device.  /dev/hda1 is a partition.  Partitions
can be created, deleted, moved, resized, reformatted, etc. many
times during the life of its containing device.  The UUID of a
partition is assigned when the partition is formatted, either with
mkfs or mkswap.  It retains this value until it is formatted again,
at which time a new UUID is calculated.  I don't know what the
algorithm is for computing a UUID for a hard disk partition.
Of course, reformatting a partition destroys all data on it; so in
that sense it starts a new life with a new identity.

Yes. Actually, there's even more to consider. GPTs are the future, and will most probably replace PC/BIOS "DOS" partition tables as soon as it becomes common for the average system to use >2TiB drives (finally). As the name implies, the GUID Partition Table introduces GUIDs (global, universal - same thing) for *devices* and *partitions* (and partition types as well as many other necessary things, but that's unrelated). Currently, with PC/BIOS partition tables, there is no UUID for devices or partitions, not even labels.


On top of partitions, there is sometimes a logical volume manager, which will split the partition (or the whole device if not partitioned) at a higher level[1] and/or "aggregate" partitions/devices[2], effectively abstracting the physical boundaries. Physical volumes have another UUID assigned in their LVM superblock.

<OT>
[1] For extra features like flexible resizing by fragmentation, etc.
[2] To provide redundancy and/or extra performance, or simply to be able to move the physical blocks (often "extents") from a location to another (live) without affecting the logical volume as seen by the filesystem (makes the replacement of a device trivial).

LVMs do *not* manage physical volumes (whole devices or partitions), only logical volumes on top of them (they just *use* the PVs). Note that raid managers can be considered as LVMs; they provide special features like redundancy or performance at the expense of some flexibility (every physical volume has to have the same size). It's possible to partition raid volumes (rare) or stack up another (more flexible) logical manager on top of them (like LVM2).
</OT>

Every logical volume created on top of these physical volumes also has a UUID, also managed by the LVM. Sometimes there are even two layers involved, for example a LV on top of a raid array which itself serves as a base for more flexible LVs, as I just explained. Each layer has its own superblock containing, among tech-specific metadata, its own UUID.

On top of these logical volumes, we can finally create filesystems. Filesystems also have a UUID in their own superblock, and that's the one we're talking about right now.
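
To make "a UUID in their own superblock" concrete, here is a minimal sketch in Python for the ext2/3/4 case only. It assumes the usual on-disk layout (superblock 1024 bytes into the volume, magic 0xEF53 at offset 56, UUID at offset 104, label at offset 120). The function names and the fake in-memory image are made up for illustration; nothing here touches a real device.

```python
import struct
import uuid

EXT_SUPERBLOCK_OFFSET = 1024   # superblock starts 1024 bytes into the volume
EXT_MAGIC_OFFSET = 56          # s_magic, little-endian 16-bit value
EXT_MAGIC = 0xEF53
EXT_UUID_OFFSET = 104          # s_uuid, 16 raw bytes
EXT_LABEL_OFFSET = 120         # s_volume_name, 16 bytes, NUL-padded


def read_ext_identity(volume: bytes):
    """Return (uuid, label) read from the start of an ext2/3/4 volume image."""
    sb = volume[EXT_SUPERBLOCK_OFFSET:EXT_SUPERBLOCK_OFFSET + 1024]
    if len(sb) < EXT_LABEL_OFFSET + 16:
        raise ValueError("image too short to hold a superblock")
    (magic,) = struct.unpack_from("<H", sb, EXT_MAGIC_OFFSET)
    if magic != EXT_MAGIC:
        raise ValueError("not an ext2/3/4 superblock")
    fs_uuid = uuid.UUID(bytes=bytes(sb[EXT_UUID_OFFSET:EXT_UUID_OFFSET + 16]))
    raw_label = sb[EXT_LABEL_OFFSET:EXT_LABEL_OFFSET + 16]
    label = raw_label.split(b"\0", 1)[0].decode("ascii", "replace")
    return fs_uuid, label


def make_fake_volume(fs_uuid: uuid.UUID, label: str) -> bytes:
    """Hypothetical helper: a 2 KiB image with only the fields above filled in."""
    image = bytearray(2048)
    base = EXT_SUPERBLOCK_OFFSET
    struct.pack_into("<H", image, base + EXT_MAGIC_OFFSET, EXT_MAGIC)
    image[base + EXT_UUID_OFFSET:base + EXT_UUID_OFFSET + 16] = fs_uuid.bytes
    raw = label.encode("ascii")[:16]
    image[base + EXT_LABEL_OFFSET:base + EXT_LABEL_OFFSET + len(raw)] = raw
    return bytes(image)


example_uuid = uuid.UUID("12345678-1234-5678-1234-567812345678")  # made-up value
example_image = make_fake_volume(example_uuid, "root")
identity = read_ext_identity(example_image)
```

The point is only that the UUID lives inside the filesystem's own metadata, so it travels with the filesystem wherever the underlying volume goes.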

<OT>
This is *not* a mess, this is the way it should be; every layer is independent, although filesystems typically fill the entire underlying logical volume and are thus more tied to them (in terms of size only).
</OT>

So, filesystem UUIDs should NOT be persistent during the life of a device or partition or physical volume or logical volume; only during their own lifetime. As long as you don't wipe their superblock, nothing will happen to their UUIDs - you can shrink and grow them, whatever. If you really need to reformat them, you can still restore the UUID - there's no black magic involved, just numbers logically identifying an object.
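
As a small illustration of "just numbers", the sketch below generates a UUID, writes it down as text, and rebuilds the identical value from that text. It assumes a random (version 4) UUID, which is what Python's `uuid.uuid4()` produces; the filesystem tools may well use a different generator. Putting a saved value back onto a real filesystem is the job of the filesystem's own tools (for ext filesystems, `tune2fs -U`), which this sketch does not do.

```python
import uuid


def save_uuid(fs_uuid: uuid.UUID) -> str:
    """Write the UUID down in its usual 36-character text form."""
    return str(fs_uuid)


def restore_uuid(saved: str) -> uuid.UUID:
    """Rebuild the same 128-bit value from the text form."""
    return uuid.UUID(saved)


original = uuid.uuid4()            # stand-in for a freshly formatted filesystem
note = save_uuid(original)         # what you would keep before reformatting
restored = restore_uuid(note)      # what you would hand back to the tool

same_value = (restored == original)
size_in_bytes = len(restored.bytes)
```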

<OT>
Logical volume managers are currently very tied to the operating system; no real standard solution exists (except for raid containers), and I'm not sure if any is needed, although it would allow for pretty cool things. dm/md raid[1] and/or LVM2 are the preferred implementations for Linux. Many operating systems (Linux included, Windows excluded, of course) represent each layer as a virtual device, which makes things really flexible (you can stack things up any way you want, although many setups obviously won't make any sense).

To wrap it all up, a quite complete stack looks like this on Linux:

  filesystem (extfs, [very long list])
  logical volume (LVM2 LV)
  logical volume (Device Mapper [dm] RAID vol, Multi-Disk [md] RAID vol)
  physical volume (PC/BIOS "DOS" partition, GUID partition, ...)
  device (hard disk drive, solid-state drive)

For added confusion, add a loop device anywhere, maybe a little virtual filesystem on top and a slice of networking. Note that software like EVMS supposedly helps the management of these stacks, or at least part of them.

[1] One is for "fake-raid" controller management, the other is pure software.
</OT>


- The current situation:

As I already said, until we throw PC/BIOS partition tables away, there is no standard way to uniquely identify a device. We can only rely on either their path/location (port), serial number* (not always provided, also depends on the interface) or contents (partition(s) or a unique filesystem).

*not the same thing as vendor/model ID, which is not unique.

Since we currently can't identify partitions either, we can only rely on their order (partition number) or contents (filesystem).

Filesystems can be identified by their location (partition/device), UUID, label or contents.
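
Since the thread is about fstab, here is a rough Python sketch of how the first field of an fstab line maps onto those choices (UUID, label, or plain device path). The sample entries and the function names are invented for the example, and the parsing is deliberately simplistic (no handling of escaped spaces, PARTUUID and so on).

```python
SAMPLE_FSTAB = """\
# <file system> <mount point> <type> <options> <dump> <pass>
UUID=12345678-1234-5678-1234-567812345678 /     ext3 errors=remount-ro 0 1
LABEL=home                                /home ext3 defaults          0 2
/dev/hda5                                 none  swap sw                0 0
"""


def classify_spec(spec):
    """Return (kind, value) for the first fstab field."""
    for prefix, kind in (("UUID=", "uuid"), ("LABEL=", "label")):
        if spec.startswith(prefix):
            return kind, spec[len(prefix):].strip('"')
    return "path", spec


def parse_fstab(text):
    """Return a list of (kind, value, mount_point), skipping comments and blanks."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # malformed line, ignored in this sketch
        kind, value = classify_spec(fields[0])
        entries.append((kind, value, fields[1]))
    return entries


entries = parse_fstab(SAMPLE_FSTAB)
```

Only the third entry depends on where the partition happens to sit; the first two follow the filesystem itself.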

<debate>
Note that identifying by contents here means file analysis - yeah. Labels are useful but not always reliable:

* Suppose you rely only on the filesystem label to determine your root or boot filesystem: what happens if you boot with a removable device plugged in that contains a filesystem whose label conflicts with one on the device you actually want to boot from? Possible pwnage, I guess.

UUIDs for filesystems were introduced partly to answer this "problem". This is the only three-nines reliable way. Labels are just there for convenience (and they are very useful at that), and as the hardware layer is being more and more abstracted, identifying by path or basically any underlying layer is less and less feasible. That's just a fact: filesystems can be completely detached from the devices.
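
The label conflict can be sketched in a few lines of Python. The device names, labels and UUIDs below are all made up, and the lookup function is purely illustrative; it is not how the kernel or the boot scripts actually resolve anything.

```python
from collections import namedtuple

Filesystem = namedtuple("Filesystem", ["device", "label", "uuid"])

# Hypothetical scan result: the internal disk plus a USB stick that
# happens (or was made) to carry the same label as the real root.
DETECTED = [
    Filesystem("/dev/sda1", "root", "11111111-1111-4111-8111-111111111111"),
    Filesystem("/dev/sdb1", "root", "22222222-2222-4222-8222-222222222222"),
    Filesystem("/dev/sda2", "home", "33333333-3333-4333-8333-333333333333"),
]


def find_by(filesystems, field, value):
    """Return the devices whose given field matches the requested value."""
    return [fs.device for fs in filesystems if getattr(fs, field) == value]


by_label = find_by(DETECTED, "label", "root")
by_uuid = find_by(DETECTED, "uuid", "11111111-1111-4111-8111-111111111111")
```

Here the label matches two devices, so whoever picks "the" root has to guess, while the UUID matches exactly one.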

In the future, we should be able to reliably specify on which unique device/partition/LV the filesystem is supposed to be, for extra fun (and more).

In fact, the future could be now, I'm just talking about general trends; GPTs can be set up very cleanly even on non-EFI hardware - except on Windows* and for no apparent reason, of course (sigh).

<OT>
*Since GPTs are backward compatible, it's possible to set up a hybrid configuration and thus still be able to dual-boot. It's kludgy.
</OT>
</debate>


- The problem:

?


Sorry for the many off-topic blocks; I felt they could clarify my words by putting them in some context. Hopefully it's informative for someone. Please correct me if you don't like my understanding of these things; but really I see no problem.

-thib

