Bug#558686: Partition manager fails to update kernel partition table
Hi Frans,
Thanks for your investigation into this problem. I am impressed!
On Mon, Nov 30, 2009 at 03:58:31AM +0100, Frans Pop wrote:
> On Sunday 29 November 2009, Torsten Landschoff wrote:
> What can be seen from your logs is that you're creating an LVM on RAID
> setup using manual partitioning. The error occurs during the *first* time
> partman tries to commit changes to disk.
Correct.
> I've just spent about 4 hours trying to reproduce the error in Virtualbox.
> AFAICT I've succeeded in reconstructing exactly what you did in partman.
> Here's what I did to reproduce your actions.
>
> <snip>
> Starting position:
> Disk a: msdos disklabel
> - primary ext4 partition 1
> - logical swap partition 5
> - free space at end of disk
> Disk b: no disklabel
Looks correct. I don't know exactly which setup I used for Ubuntu. I did the
partitioning manually since I did not want it to create a 1TB+ filesystem,
which I expected would take quite some time.
I think I had /dev/sda1 (20 GB ext4) and /dev/sda2 (16 GB swap).
> Start partman
> Choose: Guided LVM, but Go Back immediately
Right, I wanted to check whether it would suggest a RAID1 setup.
> Choose: Manual
> Select disk a and create new disklabel
> Select disk b and create new disklabel
> Create xGB primary partition on disk a (different size than existing
> partition 1)
> - use as RAID
> - delete partition
I first wanted to use the whole disk as LVM on RAID, but figured that having
a separate /boot partition would be a good idea.
> Create xGB primary partition on disk a
> - change mountpoint and select /boot
> - change type to ext2
> - mark bootable
> - done
> Select just created partition
> - use as RAID
> - done
... to have another boot partition on /dev/sdb.
> Create xGB primary partition on disk b
> - change mountpoint, but Go Back immediately
> - use as RAID
> - done
> Create yGB primary partition on disk a (leave some free space)
> - use as RAID
> - done
> Create yGB primary partition on disk b
> - use as RAID
> - done
> Choose: Configure RAID
> - Accept to commit changes
> => for me: success; for you: error message
> </snip>
Quite close to my setup.
> I *can* reproduce the error by manually activating swap on /dev/hda5 from a
> debug shell just before starting partman, except that it complains about
> /dev/hda1 instead of /dev/sda2.
Does the installer by default use any swap partition it finds? I did not enable
swap (hardly needed with that much RAM), but wasn't sure whether d-i might
auto-configure it when it finds a swap partition.
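If this happens again, a quick way to answer the question from the debug shell
would be to look at /proc/swaps. A minimal sketch of that check follows,
assuming Python is available on the machine doing the check (I don't think the
installer environment ships it, so this is more for a rescue system). The
function name is my own invention, not anything from d-i.

```python
import os


def active_swap_devices(swaps_text):
    """Return the names listed in /proc/swaps-style text.

    The first line is the column header; every following non-empty
    line starts with the device or file name used as swap.
    """
    lines = swaps_text.splitlines()
    return [line.split()[0] for line in lines[1:] if line.strip()]


if __name__ == "__main__" and os.path.exists("/proc/swaps"):
    with open("/proc/swaps") as f:
        print(active_swap_devices(f.read()))
```

An empty list would mean that no swap was active at that point.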
> If I then switch to a debug shell and do 'fdisk -l /dev/hda', I see that -
> despite the error message - the partition table *has* been changed, and if
> I check 'free' I see that swap is disabled. So AFAICT your action to write
> the partition table again from fdisk was probably redundant.
I don't think so. I checked in the shell whether /proc/partitions matched my
expectations (I did not know about fdisk -l); it showed the new layout for
/dev/sdb but not for /dev/sda.
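To make the check I did by eye reproducible, here is a rough sketch that pulls
the kernel's view of one disk out of /proc/partitions, so it can be compared
with what fdisk reports from the disk itself. The function name is
hypothetical and I am assuming the usual four-column layout (major, minor,
#blocks, name) and sdX-style device names.

```python
import os
import re


def kernel_partitions(proc_text, disk):
    """Map partition name -> size in 1 KiB blocks for one disk.

    proc_text is the content of /proc/partitions; disk is a name
    such as "sda". Only entries named <disk><number> are returned.
    """
    parts = {}
    for line in proc_text.splitlines():
        fields = line.split()
        # Skip the header line and the blank line that follows it.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        _major, _minor, blocks, name = fields
        if re.fullmatch(re.escape(disk) + r"\d+", name):
            parts[name] = int(blocks)
    return parts


if __name__ == "__main__" and os.path.exists("/proc/partitions"):
    with open("/proc/partitions") as f:
        print(kernel_partitions(f.read(), "sda"))
```

If the output still shows the old partitions after partman has written a new
table, the kernel has not re-read it, which is what I believe I saw for
/dev/sda.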
> Questions:
> - did you do anything special or manually in the early part of the
> installation (before the start of partitioning)?
Nothing special. I think I dropped out of the standard sequence because I
tried to get English texts with a German keyboard. I know this is bad for
the localization, but I often find myself translating back to English to be
able to understand German messages.
> - did you do anything special or manually during partitioning before
> the error occurred?
No.
> - does my reconstruction above match what you did, or was there anything
> different?
AFAIR, your reconstruction matches my steps, apart from partition sizes.
> Please think carefully: this is a subtle issue, details are essential.
>
> As already requested, please send the syslog of the installation! You can
> find it under /var/log/installer on the installed system.
I sent it to the installation report to have it all in one place.
> Some wild theories:
> 1) this is a libparted bug; somehow it manages to confuse itself about the
> state of the disk (busy or not busy)
So far (1) is my guess. I think you showed that it is not (3). I would think
that (2) also does not apply, since fdisk immediately got the tables reloaded.
Perhaps I should try a few more times to see if this is reliable. I don't
see how (4) could apply - the ext4 on /dev/sda1 was not mounted and what else
should keep the disk busy? An MD or LVM device, sure, but nothing like that
was configured.
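For a retry, the usual suspects for a busy partition could be collected in one
place. The sketch below is only my idea of such a check, not something partman
or libparted does: it looks for the device in /proc/swaps and /proc/mounts and
reports any holders (MD or device-mapper devices) found under sysfs. The
function name and the sysfs path for sda1 are assumptions on my part.

```python
import os


def busy_reasons(dev, swaps_text, mounts_text, holders):
    """List reasons why the kernel might consider dev busy.

    dev is a path such as "/dev/sda1"; swaps_text and mounts_text are
    the contents of /proc/swaps and /proc/mounts; holders is a list of
    device names (e.g. "md0", "dm-0") that sit on top of dev.
    """
    reasons = []
    # /proc/swaps: the first line is a header, the first column the device.
    for line in swaps_text.splitlines()[1:]:
        if line.split()[:1] == [dev]:
            reasons.append("active swap")
    # /proc/mounts: first column is the device, second the mount point.
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == dev:
            reasons.append("mounted on " + fields[1])
    for holder in holders:
        reasons.append("held by " + holder)
    return reasons


if __name__ == "__main__" and os.path.exists("/proc/swaps"):
    holders_dir = "/sys/block/sda/sda1/holders"  # assumed sysfs layout
    holders = os.listdir(holders_dir) if os.path.isdir(holders_dir) else []
    with open("/proc/swaps") as s, open("/proc/mounts") as m:
        print(busy_reasons("/dev/sda1", s.read(), m.read(), holders))
```

An empty list for every partition on /dev/sda, together with the error from
partman, would point further towards (1).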
Greetings, Torsten