
Bug#768897: debian-installer: manual partitioning with LVM destroys all non-target LVM+LUKS+GPT volumes

Package: debian-installer
Version: (from Debian 7.6.0 amd64 DVD 1)
Severity: critical
Justification: causes serious data loss

As with some other d-i reports, the Version is set from the ISO image,
as I'm not sure how to get the d-i version proper from that.

My current best test case for this:

  (blkid lines are split for readability.)

  1. The starting conditions: I've pared them down to two small disks,
     tested in a QEMU+KVM virtual machine.  Both have GPT partition
     tables, per [[STARTING-TABLES]] below.  Take note of disk 2,
     partition 3, which is marked with an LVM type code (applied via
     setting 8e00 in gdisk) but in fact contains a LUKS volume which
     contains an LVM PV.

     (gdisk doesn't have an obvious type code for "Linux LUKS volume",
     and a straw poll of another Linux sysadmin says they do the same
     thing I do and use the underlying type.  Maybe "Linux reserved"
     would be more accurate?)

     host# losetup --show --find disk2
     host# kpartx -a /dev/loop0
     host# blkid /dev/mapper/loop0p3
     PARTLABEL="Linux LVM"

  2. Boot the Debian 7.6.0 amd64 installer:

     host% /sbin/blkid /dev/cdrom
     LABEL="Debian 7.6.0 amd64 1"
     host% qemu-system-x86_64 -enable-kvm \
         -cdrom /dev/cdrom -boot d \
         -hda disk1 -hdb disk2 -m 4096 -monitor stdio

  3. Choose all defaults (and meaningless usernames, etc.) up until
     the partitioning stage.  Then choose "Manual" partitioning.

     At this point it will be apparent to the particularly alert
     viewer that both sda3 and sdb3 are shown with a "K" and an "lvm"
     marker.  (I'm not sure what the "K" means; maybe it's meant to
     represent a flaming skull?)

     From the second console:
     virt# blkid /dev/sdb3 # (line split)

     ... so blkid from the d-i environment at this stage at least
     recognizes that there is a typed volume on that block device
     (this seems to be true earlier on as well).

  4a. Choose sda2, and set it to Ext4, mount point /boot, then "Done"
      to return to "Partition disks".
  4b. Choose sda3, and set it to "physical volume for encryption",
      "Erase data: no", then "Done" to return to "Partition disks".
  5. Choose "Configure encrypted volumes".  A dialog about which
     changes will be made to the disks appears.  The main listings are:

     | The partition tables of the following devices are changed:
     |   SCSI1 (0,0,0) (sda)
     | The following partitions are going to be formatted:
     |   partition #2 of SCSI1 (0,0,0) (sda) as ext4

     Choose "Yes" to write changes to disks.

  6. Choose "Create encrypted volumes", then select /dev/sda3 only.
     "Continue", then "Finish".  Enter an arbitrary passphrase.
     Some progress bars appear, then "Partition disks" again.

  7. Choose sda3_crypt part 1, and set it to "physical volume for LVM",
     then "Done" to return to "Partition disks".

  8. Choose "Configure the Logical Volume Manager".  The alert viewer
     will notice that there are more "Free Physical Volumes" than
     there should be.

     At this point sdb3 has _already_ been reinitialized as a new PV,
     destroying its LUKS header and rendering it unrecoverable except
     by restoring from backup:
     host# blkid /dev/mapper/loop0p3 # (line split)
     PARTLABEL="Linux LVM"
     host# cryptsetup luksOpen /dev/mapper/loop0p3 DITest_pv
     Device /dev/mapper/loop0p3 is not a valid LUKS device.
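
The before/after comparison in steps 1 and 8 can be scripted on the
host.  This is only a minimal sketch: the helper names
blkid_type_from_line and signature_changed are hypothetical, and it
assumes each argument is one complete, unsplit line of blkid output
captured by hand.

```shell
# Hypothetical helper: pull the TYPE value out of one line of blkid output.
# Prints nothing if the line carries no TYPE field.
blkid_type_from_line() {
    printf '%s\n' "$1" | sed -n 's/.* TYPE="\([^"]*\)".*/\1/p'
}

# Succeeds (exit status 0) if the TYPE differs between two blkid snapshots.
#   $1: blkid line captured before running the installer
#   $2: blkid line captured afterwards
signature_changed() {
    [ "$(blkid_type_from_line "$1")" != "$(blkid_type_from_line "$2")" ]
}
```

For a direct yes/no answer on a single device, "cryptsetup isLuks
/dev/mapper/loop0p3" reports through its exit status whether a LUKS
header is still present.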

Note especially that none of the warning screens that normally appear
to confirm which partitions will have their data destroyed refer to
sdb3 at all, and this occurs regardless of whether I create any VGs
incorporating sdb3 as a PV.  Continuing with the installation in this
vein, if I:

  - create a single VG using only sda3_crypt,

    + for which the selection dialog for PVs displays _all_ available
      block devices, not merely those marked for LVM use per se---so
      the idea that sdb3 is now a PV is not made obvious by that means
      (but this would be too late anyway)

  - then a single LV on it with an ext4 root partition

then a warning appears about overwriting data on the VG and the LV,
but makes no reference to the physical partitions.

If I go for "Configure the Logical Volume Manager" _first_, there is
some kind of warning about not being able to change the partition
tables of the disks on which PVs will be placed later, which I didn't
investigate further, but this doesn't appear in the above sequence.

The original configuration was somewhat more complicated than this
test case, which also made it harder to see the non-target disks as
most of them were off the screen.  I also used EFI boot into an
LXDE-variant expert install then; I don't think that matters here.

Outcome: all attached LVM+LUKS+GPT volumes were destroyed.  :-( :-(

Expected outcome: "Manual" partitioning mode should only ever
overwrite data on volumes specifically designated by the user.
Additionally, I would normally expect that:
  - Partitions with _existing_ LVM type codes but no recognizable PV
    header should not be presumed to be uninitialized PVs without
    asking the user.  (What if a future LVM release creates a new,
    incompatible PV type, even, and the user wants to incorporate
    the existing volume?)

    + ... and _certainly_ not if they have a header recognizable by
      blkid, which might apply more generally too.

  - The warning screen used for writing new partition tables and
    filesystems should also appear before physically initializing LVM
    PVs, LUKS, etc., as that would be the clearest for the user to
    know "which data might I be about to vaporize" and have the option
    to back out.
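
To make the first expectation concrete, here is a rough sketch of the
kind of check I have in mind.  It is not how d-i or partman actually
work; classify_by_blkid_type is a made-up name, and the input is
assumed to be whatever "blkid -o value -s TYPE <device>" printed
(possibly nothing).

```shell
# Hypothetical helper, not part of d-i or partman.
# Classify a partition by the TYPE string that blkid reports for it.
#   $1: output of `blkid -o value -s TYPE <device>` (may be empty)
classify_by_blkid_type() {
    case "$1" in
        "")          echo "no-signature" ;;  # nothing recognized: ask the user first
        LVM2_member) echo "existing-pv" ;;   # already a PV: reuse it, do not re-create
        *)           echo "foreign-data" ;;  # e.g. crypto_LUKS or ext4: never initialize silently
    esac
}
```

With something like this, sdb3 in the test case would come out as
"foreign-data" (blkid sees it as crypto_LUKS), and only a
"no-signature" partition would even be a candidate for a "shall I
initialize this as a PV?" prompt.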

I rechecked the Installation Guide and Release Notes and I didn't see
anything about this specifically, but I'd sure appreciate a pointer if
I just missed it somehow.

Unrelatedly, I was actually planning on unplugging all the non-target
disks first as a precautionary measure, but then I forgot to and
didn't think anything further of it until the cold chill of cryptsetup
failing when I tried to read anything from them.

Now I am sad and have filesystems to reconstruct.  I had backups of
the more important unreplaceable stuff, but some of the configuration
will be a major pain.  :-(

(One might say the real lesson is "never install with insufficient
sleep and insufficient tea", but anyway.)

I'll upload the test disk images shortly.

   ---> Drake Wilson

Additional data:

[[STARTING-TABLES]]


  host% /sbin/gdisk -l disk1
  Found valid GPT with protective MBR; using GPT.
  Disk disk1: 8388608 sectors, 4.0 GiB
  Logical sector size: 512 bytes
  Disk identifier (GUID): 958553DA-8C9D-45F8-81D0-F061581778D1
  Partition table holds up to 128 entries
  First usable sector is 34, last usable sector is 8388574
  Partitions will be aligned on 2048-sector boundaries
  Total free space is 2014 sectors (1007.0 KiB)
  Number  Start (sector)    End (sector)  Size       Code  Name
     1            2048          526335   256.0 MiB   EF00  EFI System
     2          526336         2099199   768.0 MiB   8300  Linux filesystem
     3         2099200         8388574   3.0 GiB     8E00  Linux LVM

  host% /sbin/gdisk -l disk2
  Found valid GPT with protective MBR; using GPT.
  Disk disk2: 8388608 sectors, 4.0 GiB
  Logical sector size: 512 bytes
  Disk identifier (GUID): 4B88191F-49C6-4DA5-B126-5C09CA409E8B
  Partition table holds up to 128 entries
  First usable sector is 34, last usable sector is 8388574
  Partitions will be aligned on 2048-sector boundaries
  Total free space is 2014 sectors (1007.0 KiB)
  Number  Start (sector)    End (sector)  Size       Code  Name
     3            2048         8388574   4.0 GiB     8E00  Linux LVM

-- System Information:
Debian Release: jessie/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 3.16-3-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
