Re: Filesystem completely hosed

Grant Bowman <grantbow@grantbow.com> writes:

> Until we have live machines used for production server use or full-time
> developer machines used regularly (with high disk usage), the community
> may be unable to definitively track down these kinds of sporadic
> problems.

And as far as I understand, the general advice for tracking down this
kind of problem is:

1. Try to reproduce it on a different filesystem than /. If nothing
   else, that should make provoking the bug less painful.

2. Get a dump of the corrupted filesystem. Examine it, or make it
   available so some other knowledgeable file system hacker can examine
   it.

3. Ideally, the filesystem should be fairly small, and you should have
   a dump of the filesystem before and after the corruption.

For example, create a reasonably large file (or partition) and use
mkfs to create an ext2 filesystem. Then iterate:

  * Make a copy of the file.

  * fsck -n the file (a read-only check that changes nothing).

  * If fsck complains, the file and the previous copy may provide some
    debugging clues. Stop.

  * Start an ext2 translator on the file.

  * Perform one operation that stresses the file system (install some
    package, compile something, etc.; whatever has provoked corruption
    in the past).

  * Sync the filesystem and remove the ext2 translator.

  * Start over.
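
The loop above could be automated along these lines. This is only a
sketch: the function name and the four callables (check, attach, stress,
detach) are hypothetical names of mine, and what each one runs is left
to the caller, since the right commands depend on the system.

```python
import shutil
import subprocess
from pathlib import Path


def run_ok(*argv):
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(argv).returncode == 0


def find_corrupting_round(image, check, attach, stress, detach, max_rounds=100):
    """Repeat copy / check / stress until `check` reports a problem.

    All four callables are supplied by the caller (hypothetical names):
      check(image)  -> True if the filesystem image looks clean
      attach(image) -> start a translator on the image
      stress()      -> perform one operation that stresses the filesystem
      detach()      -> sync and remove the translator

    Returns (round, previous_copy, current_copy) for the first round whose
    check fails, or None if every round passes.
    """
    image = Path(image)
    previous = None
    for n in range(max_rounds):
        # Copy first, so the state at the start of this round is preserved.
        current = image.with_name(f"{image.name}.{n}")
        shutil.copyfile(image, current)

        if not check(image):
            # `previous` is the last copy that passed the check (None in
            # round 0); `current` is a copy of the corrupted image.
            return n, previous, current

        attach(image)
        try:
            stress()
        finally:
            detach()

        # Keep only the most recent clean copy to save disk space.
        if previous is not None:
            previous.unlink()
        previous = current
    return None
```

On a Hurd system, check might be something like
lambda img: run_ok("e2fsck", "-n", str(img)), and attach might wrap
"settrans -a mnt /hurd/ext2fs test.img". Both invocations are my
assumptions, not tested recipes, so check them against the local fsck
and settrans documentation first.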

