
Bug#475479: If /boot is on RAID, /boot mountpoint record is lost when configuring LVM

reassign 475479 partman-base
severity 475479 important
tags 475479 confirmed
retitle 475479 If /boot is on RAID, /boot mountpoint record is lost when configuring 

On Friday 11 April 2008, Daniel Dickinson wrote:
> I realized I misstated the situation in my first report.  /boot is on a
> RAID1 device created at the same time as the RAID1 device for the LVM
> volume group.  When LVM is activated, partman loses
> the fact that a device is associated with /boot.

Reassigning to partman-base as the problem is not caused by partman-lvm, but 
rather by the way we "restart" partman after any high-level action like
configuring RAID, LVM, crypto etc.

Basically after such actions, we run all init.d scripts, including the 
30parted script. This script has 2 modes:
1) parted_server is not running
2) parted_server is running

The first mode is used when parted is first started (or restarted after 
exiting to the menu), and then 30parted will start the server (of course) 
and copy the state info in /var/lib/partman/devices (hereafter: DEVICES) to 
DEVICES.old, remove the existing DEVICES and *selectively* copy info from 
DEVICES.old back to DEVICES.
This selective copying does not include state info for RAID, dmraid and 
multipath devices, which results in mountpoints etc. being lost!
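To illustrate the effect of such a selective copy, here is a minimal
sketch. It is not the actual 30parted code: the function names, the
"=dev=md=0" style directory names and the exclusion pattern are assumptions
made for the example, and only RAID is excluded here (dmraid and multipath
would need analogous patterns).

```sh
#!/bin/sh
# Hypothetical sketch of a selective copy-back; NOT the real 30parted logic.
# Directory names like "=dev=md=0" are assumed for illustration only.

# is_excluded NAME: succeed if the device directory NAME would be skipped.
is_excluded() {
    case "$1" in
        =dev=md*) return 0 ;;   # RAID devices (assumed naming pattern)
        *)        return 1 ;;
    esac
}

# selective_restore OLD NEW: copy each device directory from OLD to NEW,
# skipping excluded ones. Whatever state an excluded directory held
# (for example a mountpoint record) is not carried over.
selective_restore() {
    old=$1
    new=$2
    mkdir -p "$new" || return 1
    for dir in "$old"/*; do
        [ -d "$dir" ] || continue
        name=${dir##*/}
        if is_excluded "$name"; then
            continue
        fi
        cp -R "$dir" "$new/$name"
    done
}
```

With a /boot mountpoint recorded under the RAID device's directory in
DEVICES.old, nothing for that device ends up in the new DEVICES, which is
the symptom reported here.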

The second mode does not do this cleanup and copying and you'd think that it 
is more suitable for "restarts" after the mentioned high-level actions, BUT 
this second mode is not what we actually do!
The restarts are done by calling restart_partman() from lib/base.sh, which 
has a comment that stop_parted_server() must be called before it, which 
effectively forces 30parted into the first mode.
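The interaction can be pictured with a toy model. All function bodies below
are stand-ins written for this example (only the names
stop_parted_server() and restart_partman() come from partman; the rest,
including the server_running flag, is hypothetical):

```sh
#!/bin/sh
# Toy model of the two 30parted modes; bodies are stand-ins, not real code.
server_running=no

start_parted_server() { server_running=yes; }
stop_parted_server()  { server_running=no; }

# Stand-in for the 30parted init.d script: the mode depends only on
# whether the server is running when the script is entered.
init_30parted() {
    if [ "$server_running" = no ]; then
        start_parted_server
        echo "mode 1: server started, DEVICES rebuilt selectively"
    else
        echo "mode 2: server already running, DEVICES kept"
    fi
}

# Stand-in for restart_partman(): re-runs the init.d scripts.
restart_partman() { init_30parted; }
```

In this model, calling stop_parted_server before restart_partman always
lands in mode 1, while calling restart_partman alone with the server still
running lands in mode 2.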

This is probably done to avoid bugs that could happen with a "stateful" 
restart. Bugs that may or may not have been solved by now...

In a simple test, partman seems to do the correct thing if 
stop_parted_server() is _not_ called before restart_partman(). I think this 
would be the correct solution, but will obviously need a lot more testing.

Removing the exception for RAID devices in 30parted for mode 1 did result in 
errors, basically because of the double use of /dev/mdX and /dev/md/X. One 
reason is that parted_devices returns the former notation while the rest of 
partman "prefers" the latter.
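
To make the naming mismatch concrete, here is a small sketch of mapping
one notation to the other. normalize_md is a hypothetical helper written
for this example; it is not part of partman and is not proposed as the
fix, it only shows that the two spellings refer to the same device.

```sh
#!/bin/sh
# Hypothetical helper, for illustration only.
# normalize_md PATH: map /dev/mdX to /dev/md/X; leave other paths unchanged.
normalize_md() {
    case "$1" in
        /dev/md[0-9]*) echo "/dev/md/${1#/dev/md}" ;;
        *)             echo "$1" ;;
    esac
}

normalize_md /dev/md0     # prints /dev/md/0
normalize_md /dev/md/0    # already in the preferred form, printed as-is
normalize_md /dev/sda1    # not an md device, printed as-is
```

Any code comparing device names from parted_devices against names stored
elsewhere would have to agree on one form before comparing.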

