
Re: FILESYSTEM CORRUPTION



Vincent Renardias <vincent@waw.com> writes:

> (speaking as a 'mount' maintainer)
> 
> I agree crashed disks aren't fun at all, however from this email and from
> your previous bug report, I fail to see where 'mount' is involved in this
> unfortunate process: 

Thanks for your reply, Vincent....  It would have been appreciated had 
you replied last month however :-)

> 1/ the kernel still doesn't support forced umounts, so consequently
> (& unfortunately) doesn't umount; although umount has preliminary
> '--force' support (just try umount -f /something), it won't work until
> the kernel-side is ready.
> When you have runaway or zombie processes with open file descriptors,
> it's the kernel that prevents the unmounting.

In this case, it is a kernel bug (and a serious one).  However, note
that processes with open FDs are not the only problem, although they
can be one.  I have experienced cases where things like ls hang
waiting for an NFS request that cannot be fulfilled.  kill -9 will not
kill them, and they will prevent the filesystem from being umounted
later.  (Another kernel bug.)  In my original bug report, I complained
that if umount tries to umount an NFS-mounted partition and cannot for
whatever reason (server crashed, unreachable, etc.), it will hang and
never umount the local drives.  Therefore, I believe it would be
prudent, as a temporary workaround for the kernel bug, to umount all
local drives before umounting network drives.  It is generally not a
big deal if a network drive doesn't get umounted anyway.
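
As a rough illustration of the ordering proposed above (local
filesystems first, network filesystems last), here is a minimal Python
sketch.  It is not how mount or the shutdown scripts actually work.
The function name umount_order, the sets of filesystem types treated
as network or pseudo filesystems, and the /proc/mounts-style input are
all assumptions made for the example.

```python
# Filesystem types treated as "network" here.  This set is an assumption
# for illustration, not something taken from mount's source.
NETWORK_FSTYPES = {"nfs", "nfs4", "smbfs", "cifs", "ncpfs"}

# Pseudo-filesystems skipped in this sketch (also an assumption).
SKIP_FSTYPES = {"proc", "sysfs", "devpts", "tmpfs"}


def umount_order(mounts_text):
    """Return mount points in a suggested umount order.

    mounts_text uses the /proc/mounts layout:
        device mount_point fstype options dump pass
    Local filesystems come first, network filesystems last, so a hung
    network umount cannot block the local ones.
    """
    local, network = [], []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        mount_point, fstype = fields[1], fields[2]
        # The root is normally remounted read-only rather than umounted.
        if mount_point == "/" or fstype in SKIP_FSTYPES:
            continue
        if fstype in NETWORK_FSTYPES:
            network.append(mount_point)
        else:
            local.append(mount_point)
    # Reverse each group so later (possibly nested) mounts go first.
    return local[::-1] + network[::-1]
```

Given entries for /usr (ext2), /mnt/nfs (nfs) and /usr/local (ext2) in
that order, this yields /usr/local, /usr, /mnt/nfs.  The sketch does
not handle a local filesystem mounted underneath a network mount
point; that case would need extra care.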

> 2/ when rebooting with an unclean filesystem, the '/' is mounted r/o
> so e2fsck can be loaded to check all the filesystems BEFORE mount
> attempts to mount them r/w.

Yes.  Not quite sure what this has to do with the bug report, but oh
well :-)

My root FS was hosed so badly that the kernel couldn't find init and
panicked.

> The problem as you say involves PCMCIA (which fails to unload), you (for
> turning off the machine) and the kernel (for panicking), but why would
> mount be involved?

There is also a separate PCMCIA issue.  Unfortunately, I did not
realize that the corruption had occurred until almost 24 hours later (I
thought nothing of it at the time), so I do not have details on the
PCMCIA problem, and I feel leery of reporting a bug against PCMCIA
without information to back it up.  I hope, however, that the PCMCIA
maintainer will see these messages and at least be on the lookout for
future occurrences of these problems.

I shall also make sure to get in-depth information when/if this occurs 
again (as far as PCMCIA is concerned).  You are correct in saying that 
this latest problem is not mount's fault (I believe).  The end result
is the same as the mount/kernel issue -- filesystems do not get
umounted and corruption can result.

I think we have a problem with ordering...

PCMCIA services are turned off before the network is turned off,
apparently.  This causes difficulty, for instance, if the laptop is
acting as an NFS server, as mine does.  (Great way to carry data
around!)  If I forget to umount it from the NFS client (my desktop
machine), I can have two problems:

 * When I shut down my desktop, it will hang trying to umount.

 * When I shut down my laptop, it will hang trying to turn off PCMCIA.

grr.

Deadlock. :-(
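
To make the ordering point concrete, here is a small Python sketch
that derives a shutdown order from "needs" relationships, so that
anything depending on PCMCIA is stopped before PCMCIA itself.  The
service names and the dependency table are hypothetical (they assume
the network card is a PCMCIA card), and this is not how the actual
init scripts decide their order.  It needs Python 3.9 or later for
graphlib.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical services; each key maps to the services it needs running.
NEEDS = {
    "network": {"pcmcia"},       # assumes the network card is PCMCIA
    "nfs-server": {"network"},   # NFS exports need the network up
}


def stop_order(needs):
    """Return services in shutdown order: dependents before what they need."""
    # static_order() lists each service after everything it needs,
    # which is a valid start order; shutting down is the reverse.
    start_order = list(TopologicalSorter(needs).static_order())
    return start_order[::-1]


print(stop_order(NEEDS))
```

With the table above this prints ['nfs-server', 'network', 'pcmcia'],
i.e. the network goes down before PCMCIA rather than after it.  It
does nothing about the client side: a desktop that still has the NFS
export mounted can hang on umount regardless.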

Regards,
John "Install it again" Goerzen

-- 
John Goerzen             | Developing for Debian GNU/Linux (www.debian.org)
Custom Programming       | Debian GNU/Linux is a free replacement for
jgoerzen@complete.org    | DOS/Windows -- check it out at www.debian.org.
-------------------------+----------------------------------------------
uuencode - < /vmlinux | mail -s "Windows NT security fix" bgates@microsoft.com

