Re: mounting LVM partitions fails after etch upgrade

On 5/6/07, Douglas Allan Tutty <dtutty@porchlight.ca> wrote:

On Sun, May 06, 2007 at 03:25:02PM +0200, David Fuchs wrote:
> I have just upgraded my sarge system to etch, following exactly the upgrade
> instructions at http://www.us.debian.org/releases/etch/i386/release-notes/ .
>
> now my system does not boot correctly anymore... I'm using RAID1 with two
> disks, / is on md0 and all other mounts (/home/, /var, /usr etc) are on md1
> using LVM.
>
> the first problem is that during boot, only md0 gets started. I can get
> around this by specifying break=mount on the kernel boot line and manually
> starting md1, but where need I change what so that md1 gets started at this
> point as well?
>
> after manually starting md1 and continuing to boot, I get errors like
>
> Inode 184326 has illegal block(s)
> /var: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY (i.e. without the -a or -o
> options)
>
> ... same for all other partitions on that volume group
>
> fsck died with exit status 4
> A log is being saved in /var/log/fsck/checkfs if that location is
> writable.(it is not)
>
> at this point I get dropped to a maintenance shell. when I select to
> continue the boot process:

What happens if instead of forcing a boot you do what it says: run fsck
without the -a or -o options?

>
> EXT3-fs warning: mounting fs with errors. running e2fsck is recommended
> EXT3 FS on dm-4, internal journal
> EXT3-FS: mounted filesystem with ordered data mode.
> ... same for all mounts (same for dm-3, dm-2, dm-1, dm-0)
>
> EXT3-fs error (device dm-1) in ext3_reserve_inode_write: Journal has aborted
> EXT3-fs error (device dm-1) in ext3_orphan)write: Journal has aborted
> EXT3-fs error (device dm-1) in ext3_orphan_del: Journal has aborted
> EXT3-fs error (device dm-1) in ext3_truncate_write: Journal has aborted
> ext3_abort called.
> EXT3-fs error (device dm-1): ext3_journal)_start_sb: Detected aborte
> djournal
> Remounting filesystem read-only
>
> and finally I get tons of these:
>
> dm-0: rw-9, want=6447188432, limit=10485760
> attempt to access beyond end of device
>
> the system then stops for a long time (~5 minutes) at "starting systlog
> service" but eventually the login prompt comes up, and I can log in, see all
> my data, and even (to my surprise) write to the partitions on md1...
>
...which probably corrupts the fs even more.

> what the hell is going on here? thanks a lot in advance for any help!
>
What is going on is that you started with a simple booting error that
has propogated into filesystem errors.  Those errors are compounded by
forcing a mount of a filesystem with errors .  Remember that the system
that starts  LVM and raid itself exists on the disks....

What you need is a shell with the root fs either totally unmounted or
mounted ro.  Does booting single-user work?  What about telling the
kernel init=/bin/sh? From there, you can check the status of the mds
with:

        #/sbin/mdadm -D /dev/md0
        #/sbin/mdadm -D /dev/md1
        ...

check the status of the logical volumes:
        #/sbin/lvdisplay [lvname]

and then check the filesystems with:

        #/sbin/e2fsck -f -c -c  /dev/...

Only once you get the filesystems fully functional should you attempt to
boot further.

Doug.

--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org