
Re: assurance that / is read-only before fsck



Ian Jackson writes:
> Robert Sanders writes:
> > Here's what I have in my rc.S now:
> > ...
> >         mount -n -o remount,ro /
> >
> > This obviously isn't perfect, but it at least tries to force the root
> > directory to be read-only, and it's much less error prone than any
> > other method I've seen.  At one point I had the script checking root's
> > writability by echoing to a reserved file in /etc, but that's ugly,
> > and the above solution will correct an improperly installed kernel.
> 
> I don't think this is safe.

Well it might not be, but it SHOULD be.

> I was under the impression that if you mount a fs read-write, then
> read-only, and then read-write again, the kernel can buffer stuff
> across the remount(s) in such a way that the filesystem can get
> corrupted if you modify the fs while it is read-only.

What can happen is that blocks that were dirty before the remount to RO
won't get synced at the remount, so you've got dirty blocks hanging around.
That isn't really a problem, however; when *fsck tries to get a block
to read from or write to, it'll get the dirty block which specifies the real
state of the disk anyway.  After doing what it must to that block, the block
will be relegated to the buffer cache again until it's written out by
update(), bdflush(), or fsync().  There's no danger there, just business as
usual.

The danger comes in when the filesystem uses buffered devices and *fsck
uses raw devices.  The filesystem could keep dirty buffers in the
buffer cache, but since raw devices bypass the buffer cache, *fsck will
see an out-of-date copy when reading from disk.  *fsck will then modify
the block, write it out, and wind up accomplishing nothing, because when
the kernel flushes the corresponding buffer, it'll overwrite the block 
*fsck modified.  Linux doesn't support raw devices, and if it did, the
safest course would be for *fsck to fsync() the buffered version of the
device before checking the filesystem.  Better yet, do_remount() could be
modified to fsync() the device being remounted.
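
To make the lost-update sequence concrete, here is a toy model of it in
shell.  It only shuffles strings around; the names (simulate_raw_fsck,
disk, cache) are made up for illustration and nothing here touches a real
device or reflects actual kernel code.

```shell
#!/bin/sh
# Toy model (an assumption-laden sketch, not kernel behavior):
# one on-disk block and one dirty buffer for that same block.
simulate_raw_fsck() {
    flush_first=$1      # "yes" = fsync the buffered device before fsck

    disk="old"          # stale contents actually on the disk
    cache="new"         # dirty buffer holding the real, newer state
    dirty=1

    if [ "$flush_first" = yes ]; then
        disk=$cache     # flushing writes the dirty buffer out
        dirty=0
    fi

    # fsck goes through the raw device: it sees the disk, not the cache,
    # then writes its repaired version straight back to the disk.
    seen=$disk
    disk="fixed($seen)"

    # Later the kernel flushes whatever is still dirty, overwriting
    # the block fsck just repaired.
    if [ "$dirty" = 1 ]; then
        disk=$cache
        dirty=0
    fi

    echo "$disk"
}
```

Calling simulate_raw_fsck no leaves the block as "new", so fsck's repair is
gone; simulate_raw_fsck yes leaves "fixed(new)", which is the case the
fsync() is meant to guarantee.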

> I'd like to see someone knowledgeable about such things comment.

Was that a slight on my knowledge? :-)

> If this is true then perhaps attempting a test write to a file in /tmp
> would be a better idea.
 
Well, the thing is that a corrupted filesystem wouldn't necessarily benefit
from allocation activity.  That's why I recommended having a pre-allocated
file in /etc which gets 'touch'ed at startup.  If the 'touch' fails, you
know the fs is RO.  If it succeeds, the fs is RW, but there's not much harm
done, as no allocation takes place.  Note that a seriously confused
filesystem might still be damaged.
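
A rough sketch of that check, for what it's worth.  The probe path
/etc/.rw-probe and the function name root_is_writable are placeholders of
mine, not anything that exists in a stock rc.S; the file has to be created
ahead of time so the test itself never allocates anything.

```shell
#!/bin/sh
# Hypothetical probe file (an assumption).  It must already exist so that
# touching it only updates timestamps and allocates no new blocks.
PROBE_FILE="${PROBE_FILE:-/etc/.rw-probe}"

# Exit status:
#   0  the touch succeeded, so the filesystem is writable
#   1  the touch failed, most likely because the filesystem is read-only
#   2  the probe file is missing, so nothing was attempted
root_is_writable() {
    probe="${1:-$PROBE_FILE}"

    # Refuse to probe if the file is absent; creating it would allocate.
    [ -f "$probe" ] || return 2

    # -c tells touch never to create the file.
    if touch -c "$probe" 2>/dev/null; then
        return 0
    else
        return 1
    fi
}
```

In an rc script one would presumably call root_is_writable early and go on
to fsck only when it returns 1.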

If you're still worried about ghost buffers hanging around, my remount
suggestion could be changed to  "umount /" which is translated in-kernel to
this:

      fsync the root device
      remount the root filesystem read-only

which, raw devices or no, is safe, although less obvious.  It doesn't matter
to me, but something should be done.
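
For comparison, a userspace approximation of those two steps might look
like the sketch below.  It is only an approximation: sync flushes every
dirty buffer rather than fsyncing just the root device, and the run helper
and DRY_RUN switch are my own additions so the sequence can be inspected
without actually remounting anything.

```shell
#!/bin/sh
# Helper (hypothetical): print the command when DRY_RUN=1, else run it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Flush dirty buffers first, then remount root read-only.
# The mount invocation is the one from the rc.S quoted above.
remount_root_ro() {
    run sync && run mount -n -o remount,ro /
}
```

With DRY_RUN=1 set, remount_root_ro just prints the two commands in order;
without it, it needs to run as root during startup.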



