
Re: File system corruption in ext4?



Hi Jesper,

RAID 1 is mirroring. I suppose one reason for the failure might be a timing 
problem. I do not know for sure whether your system has a real RAID controller 
or whether the RAID is done in software.
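
One way to tell the two apart: Linux software RAID (md) arrays are listed in 
/proc/mdstat, while an array behind a hardware controller normally does not 
show up there. A minimal sketch, assuming a POSIX shell and awk; the function 
name list_md_arrays and its optional file argument are made up for this 
example:

```shell
# Print each md array and its RAID level, one per line (e.g. "md0 raid1").
# The optional argument is only there so a saved copy of mdstat can be parsed;
# by default the live /proc/mdstat is read.
list_md_arrays() {
    awk '$1 ~ /^md/ && $2 == ":" {
        lvl = "unknown"
        # The level is the field that looks like raidN or linear.
        for (i = 3; i <= NF; i++)
            if ($i ~ /^(raid[0-9]+|linear)$/) lvl = $i
        print $1, lvl
    }' "${1:-/proc/mdstat}"
}
```

If this prints nothing, no md array is active, and the mirroring is probably 
done by a controller or by the firmware.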

A real controller should not produce write errors, although under heavy 
load it might happen. I have never used RAID 1 myself, as I am a fancy guy and 
no friend of RAID 1. The trouble is that when a bad write lands on one drive, 
it lands on the other, too.

My fancy solution was to use one drive and mirror it to a second one every 30 
minutes using rsync. IMO this gives me several advantages:

1. If the hard drive is defective, I can boot from the other one.

2. If the software is defective, I have 30 minutes to discover the failure 
(every good logging system should raise an alarm in time).

3. I have a running backup available.

4. I can replace the defective hard drive while the system is running.

5. After the replacement, I can examine what happened (hardware failure, 
malware, whatever).
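
Such a 30-minute mirror could be sketched roughly as below. This is only an 
illustration, not my exact setup: the function name mirror_once, the DRY_RUN 
switch, the paths and the cron entry are all assumptions, and the rsync 
options should be adapted to the filesystem in use.

```shell
# Mirror the tree SRC onto DST (both paths are placeholders).
# -a archive mode, -H hard links, -A ACLs, -X extended attributes,
# -x do not cross filesystem boundaries,
# --delete remove files from the mirror that no longer exist in SRC.
# With DRY_RUN=1 the command is only printed, not executed.
mirror_once() {
    src="${1%/}/"
    dst="${2%/}/"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "rsync -aHAXx --delete $src $dst"
        return 0
    fi
    rsync -aHAXx --delete "$src" "$dst"
}

# Hypothetical /etc/cron.d entry running a script that calls mirror_once
# every 30 minutes:
# */30 * * * * root /usr/local/sbin/mirror.sh
```

Note that with --delete a damaged or deleted file reaches the mirror at the 
next run, which is why the 30 minutes in point 2 matter.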

Many people will now laugh at me, but this has worked best for me. I reached 
an uptime of more than 700 days, though that may be due less to my work than 
to the work of all the Debian developers!

As I said before, I am not very experienced with RAID 1; other people might 
know much more.

Personally I believe RAID is mostly used with Windows, as Windows does not 
have nice tools like rsync or syslog and all the things that make Linux 
and Debian so great.

Happy Easter!

Best

Hans 

> Sorry - I should have left more of the previous mails quoted.  I have
> previously tested the RAID1 consistency (ok), fixed the file system
> (found 3 files with incorrect block count), and now also tested the
> RAM. And since it seems unlikely that it is a bug in ext4 (in Debian
> Bullseye), I don't quite understand how such an inconsistency can occur.
> Thanks for your response, Jesper
> 
> > If so, I suggest to boot a live system like Knoppix or similar, then run
> > your test by using
> > 
> > e2fsck -y /dev/sda1
> > 
> > or wherever your filesystem resides.
> > 
> > Please pay attention: If you have encrypted filesystems, then first open
> > the encryption, do NOT mount the filesystem and then check it, for
> > example:
> > 
> > cryptsetup luksOpen /dev/sda1 data1
> > 
> > then enter the password and now you can run
> > 
> > e2fsck -y /dev/mapper/data1
> > 
> > Note: the word "data1" is only an example, you can name it, whatever you
> > want like "space", "soap", "bullet", "henry" or whatever.
> > 
> > Hope this helps.
> > 
> > Best
> > 
> > Hans
> > 
> >> [Sorry - I accidentally sent this too quickly in an incomplete state.
> >> Second try here:]
> >> 
> >>> On Wed, Mar 20, 2024, 11:28 AM Jesper Dybdal
> >>> 
> >>> <jd-debian-user@dybdal.dk>  wrote:
> >>>      I think I'll let memtest86+ run overnight one of the coming nights.
> >>>      
> >>>      Unless it is simply a RAM error, then it is a bit scary...
> >> 
> >> I've now let memtest86+ run for 9 hours, during which it did 14 passes
> >> of all its tests.  It found nothing wrong.
> >> 
> >> On 2024-03-20 22:58, Nicholas Geovanis wrote:
> >>> I have seen that a couple times, unlikely but possible. Maybe review
> >>> your RAM configuration too, ensure that the sticks are on the same
> >>> supported refresh rate and distributed across the slots in an approved
> >>> way.
> >> 
> >> There is only one RAM stick (of 16 GB), so there should be no problems
> >> of that kind.
> >> 
> >> I'm afraid I won't find an explanation of that file system corruption :-(
> >> 
> >> Thanks to Franco and Nicholas for your responses,
> >> Jesper




