On Thu, Aug 20, 2020 at 01:34:58PM -0700, David Christensen wrote:
> On 2020-08-20 08:32, rhkramer@gmail.com wrote:
> > On Thursday, August 20, 2020 03:43:55 AM tomas@tuxteam.de wrote:
> > > Contrary to the other (very valid) points, my backups are always on
> > > a LUKS drive, no partition table. Rationale is, should I lose it, the
> > > less visible information the better. Best if it looks like a broken
> > > USB stick. No partition table looks (nearly) broken :-)
>
> I always use a partition table, to reduce the chance of confusing
> myself. ;-)
>
> > I have two questions:
> >
> > * I suppose that means you create the LUKS drive on, e.g., /dev/sdc
> >   rather than, for example, /dev/sdc<n>? (I suppose that should be
> >   easy to do.)

Exactly.

> > * But, I'm wondering, how much bit rot would it take to make the
> >   entire backup unusable, and what kind of precautions do you take
> >   (or could be taken) to avoid that?

I have no current strategy for silent [1] bit rot. For file system
consistency, I run an fsck from time to time, after opening the LUKS
container and before mounting (we are talking about roughly 60..70 GB;
were we talking about 100..1000 times as much, active bit-rot
mitigation might sound more compelling).

> I have been pondering bit-rot mitigation on non-checksumming
> filesystems.

The big ones have that; and for really huge amounts of data (hundreds
of TB or so, where some corners of your data might rest unseen for
years), it does make sense. In my case, I consider the backup just as
something which I expect to "fail early" and "fail loudly". In the
"normal" case it is perfectly disposable :-)

> Some people have mentioned md RAID. tomas has mentioned LUKS. I
> believe both of them add checksums to the contained contents. So,
> bit-rot within a container should be caught by the container driver.
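For the record, the whole-device LUKS setup and the fsck-before-mount
routine discussed above look roughly like this (this is a sketch; the
device name /dev/sdX, the mapping name "backup", and the mount point
/mnt/backup are placeholders, not anything from the original mails):

```shell
# Create LUKS directly on the whole device -- no partition table.
# CAUTION: this destroys everything on /dev/sdX (placeholder name).
cryptsetup luksFormat /dev/sdX
cryptsetup open /dev/sdX backup      # maps it as /dev/mapper/backup
mkfs.ext4 /dev/mapper/backup         # plain ext4 inside the container
cryptsetup close backup

# Later, the occasional consistency check before mounting:
cryptsetup open /dev/sdX backup
fsck.ext4 -f /dev/mapper/backup      # -f forces a full check even if clean
mount /dev/mapper/backup /mnt/backup
# ... run the backup ...
umount /mnt/backup
cryptsetup close backup
```

Without a partition table the device carries nothing but the LUKS
header and ciphertext, which is what gives the "looks like a broken
stick" effect mentioned above.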
Don't know about that, to be honest: I count on the ext4 beneath the
LUKS to catch any nasties (and to issue an early warning when the USB
stick starts degrading -- I'm still a bit queasy about how cheap a
128GB USB stick can be).

> In the case of md RAID, the driver should respond by fetching the
> data from another drive and then dealing with the bad block(s); the
> application should not see any error (?). I assume LVM RAID would
> respond like md RAID (?).

Yes. That's why I reserve RAID for the "high availability" case: you
want to keep running after a failure, and your customer doesn't notice
(it would make sense to think about whether this is the best level at
which to introduce redundancy, but I digress).

> In the case of LUKS, the driver has no redundant data (?) and will
> have no choice but to report an error to the application (?). I
> would guess LVM non-RAID would behave similarly (?).

Exactly. For the backup scenario, the whole backup /is/ the redundant
data. If the probability of failure of your main system in some given
time interval T is, say, 10^-7, and that of your backup's in the same
interval is, say, 10^-5 (cheaper hardware, and that), you're looking
at a catastrophe with a probability of 10^-12. If you want to better
that, use two separate backup media; then you are at 10^-17 [2].

Cheers

[1] silent meaning some bit flips in file content, without the file
    system noticing. Which on ext4 is quite possible, and which
    btrfs, e.g., can (reasonably) guard against.

[2] This is, of course, "economist maths", the kind which led to the
    2008-2009 crash: it assumes all those bad events are independent.
    If my house burns down, my computer is in there, and my only
    backup on a stick is in my pocket...

- t

> For all three -- md, LUKS, LVM -- I don't know what happens for bit
> rot outside the container (e.g. in the container metadata).
>
> David
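P.S.: to illustrate footnote [1], on a checksumming filesystem such as
btrfs the "silent" case stops being silent: every data block carries a
checksum, and a scrub re-reads and verifies all of them. A sketch
(the mount point /mnt/backup is a placeholder):

```shell
# Kick off a verification pass over all data and metadata blocks;
# mismatches are reported (and, with redundant profiles, repaired).
btrfs scrub start /mnt/backup

# Check progress and the error counters afterwards.
btrfs scrub status /mnt/backup
```

ext4 has no equivalent for file *content*; fsck checks metadata
consistency only, which is why bit flips inside files can go unnoticed
there.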