
Re: reiserfs/md1/failure/threads



On Tue, Jul 18, 2006 at 12:01:31PM +0100, Jo Shields wrote:
> Mickael Marchand wrote:
> >check your memory (yes, it's going to take long, but that's almost
> >always the cause of reiserfs failures)
> >I am stressing reiserfs hard on various amd64/em64t boxes, no problems
> >so far.
> >every box I found corrupting filesystems had either:
> >1 - bad hard drives that a low-level scan confirmed
> >or 2 - bad memory that a really long memtest could detect
> >
> >Cheers,
> >Mik
> >  
> 
> I'll add to this - I've seen corruption with all filesystems on my 
> office desktop (which has screwed memory, but they refuse to fix it). 
> EXT3 gave up on journalling & just started writing junk, costing me my 
> /home.

Ext2/ext3 does complain about errors, but you normally don't see the
complaints because they are buried in the system log files. It's a good
idea to mount partitions with the "errors=remount-ro" option: if
anything goes wrong, the kernel remounts the partition read-only, and a
reboot plus fsck will save your data.
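
To see which ext2/ext3 partitions are missing that option, something
like the following could help. It is only a minimal sketch: the function
name is made up for illustration, and it assumes the usual
whitespace-separated "device mountpoint type options" layout that
/etc/fstab uses. An error policy stored in the superblock itself would
not show up here, so treat a hit as a hint to go and check, not as
proof.

```python
def mounts_missing_remount_ro(mount_table):
    """Return mount points of ext2/ext3 entries lacking errors=remount-ro.

    mount_table is the text of an fstab-style file (hypothetical helper,
    not part of any existing tool).
    """
    missing = []
    for line in mount_table.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete entry
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type not in ("ext2", "ext3"):
            continue  # the option discussed here is an ext2/ext3 one
        if "errors=remount-ro" not in options.split(","):
            missing.append(mount_point)
    return missing


# Made-up sample entries, purely for illustration.
sample = """\
# <device>  <mountpoint>  <type>    <options>                   <dump> <pass>
/dev/sda1   /             ext3      defaults,errors=remount-ro  0      1
/dev/sda2   /home         ext3      defaults                    0      2
/dev/sda3   /srv          reiserfs  defaults                    0      2
"""
print(mounts_missing_remount_ro(sample))
```

Feeding it the contents of your real /etc/fstab instead of the sample
lists the mount points worth a second look.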

> Reiser is holding up better, but reiserfsck segfaults when it 
> sees /home

That means that the filesystem has errors. Reiserfsck is able to detect
them, but because nobody has seen those errors before, it segfaults on
them. That also means that the reiserfs filesystem driver in the kernel
will happily screw up the filesystem further without notice.
Back up your data *NOW* before it's too late.

Reiserfs is much more vulnerable to disk errors because it lacks
redundancy:
- There is only one superblock. If you lose it (bad block, for
  example) you *could* repair it with reiserfsck, but for that you need
  to know the hash type, which depends on the filesystem version, and
  the only place where that is documented is the superblock. If you
  guess it wrong, your data will be lost.
- File connectivity is represented by a btree (b+tree, IIRC). If you
  lose some of the nodes high up in the tree, you can recover files,
  but where they belong in the tree is anybody's guess.
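
The second point can be illustrated with a toy model. This is just a
sketch with invented node names, not the reiserfs on-disk format: the
tree is a dict mapping each node to its list of children (None marks a
leaf holding file data), and losing an internal node is simulated by
deleting its entry. The leaves below it still exist on "disk", but
nothing reachable from the root points to them any more.

```python
def reachable_files(nodes, root):
    """Return the set of leaves reachable from root via child pointers."""
    found, stack = set(), [root]
    while stack:
        name = stack.pop()
        if name not in nodes:
            continue  # block unreadable: every pointer stored in it is gone
        kids = nodes[name]
        if kids is None:
            found.add(name)  # a leaf, i.e. file data
        else:
            stack.extend(kids)
    return found


# Toy tree; all names are made up for illustration.
tree = {
    "root": ["internal_a", "internal_b"],
    "internal_a": ["file1", "file2"],
    "internal_b": ["file3"],
    "file1": None,
    "file2": None,
    "file3": None,
}

# Simulate a bad block that takes out one node high up in the tree.
damaged = {name: kids for name, kids in tree.items() if name != "internal_a"}

# A scan of the whole "disk" still finds every surviving leaf...
surviving = {name for name, kids in damaged.items() if kids is None}
# ...but only some of them can still be reached from the root.
orphans = surviving - reachable_files(damaged, "root")
print(sorted(orphans))
```

The orphaned leaves are intact, but the information about where they
were attached lived only in the lost node.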

Another interesting way to screw up reiserfs is to have an image of
another reiserfs on a reiserfs partition. Reiserfsck will happily link
the contents of that image into the containing partition damaging the
partition beyond repair.

Due to its traditional Unix disk layout, ext[23] doesn't have the
problems reiserfs has. And ext2/ext3 is the only filesystem that has a
regression test suite for its fsck, so errors in earlier e2fsck
revisions will not pop up again in later revisions.

Sure, reiser4 will solve some of the problems I mentioned above, but it
doesn't look like it will go into the kernel real soon now. See
http://wiki.kernelnewbies.org/WhyReiser4IsNotIn , or tune in to the
linux-kernel mailing list for the current reiser4 flamewar.


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands


