
Re: HDD long-term data storage with ensured integrity



On 5/3/24 04:26, Marc SCHAEFER wrote:
On Mon, Apr 08, 2024 at 10:04:01PM +0200, Marc SCHAEFER wrote:
For off-site long-term offline archiving, no, I am not using RAID.

Now, as I had to think a bit about ONLINE integrity, I found this
comparison:

https://github.com/t13a/dm-integrity-benchmarks

Contenders are btrfs, zfs, and notably ext4+dm-integrity+dm-raid

I tend to have a bias favoring UNIX layered solutions over
"all-in-one" solutions, and it seems that, performance-wise,
the layered approach is also quite good.

I wrote this script to convince myself of auto-correction
of the ext4+dm-integrity+dm-raid layered approach.


Thank you for devising a benchmark and posting some data.  :-)


FreeBSD also offers a layered solution.  From the top down:

* UFS2 file system, which supports snapshots (requires partitions with soft updates enabled).

* gpart(8) for partitions (volumes).

* graid(8) for redundancy and self-healing.

* geli(8) providers with continuous integrity checking.


AFAICT the FreeBSD stack is mature and production quality, which I find very appealing. But its feature set is not as sophisticated as that of ZFS, which leaves me wanting. Notably, I have not found a way to replicate UFS snapshots directly. The best I can dream up is synchronizing a snapshot to a backup UFS2 file system and then taking a snapshot with the same name on the backup.


I am coming to the conclusion that the long-term survivability of data requires several components: a good live file system, good backups, good archives, continuous internal integrity checking with self-healing, periodic external integrity checking (e.g. mtree(1)) with some form of recovery (e.g. manual), etc. If I get the other pieces right, I could go with OpenZFS for the live and backup systems, and worry less about data corruption bugs.
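
For the periodic external integrity check, something in the spirit of mtree(1) can be sketched in a few lines of Python. This is only a minimal sketch under my own assumptions, not a replacement for mtree: it records a SHA-256 digest per regular file and later reports files that went missing, changed, or appeared. The function names (sha256_of, build_manifest, compare_manifests) are made up for illustration, and the sketch ignores symlinks, ownership, modes, and timestamps.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file under root (by relative path) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Hypothetical policy: skip symlinks and anything not a plain file.
            if os.path.isfile(full) and not os.path.islink(full):
                manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def compare_manifests(old, new):
    """Return (missing, changed, added) relative paths, each sorted."""
    missing = sorted(set(old) - set(new))
    added = sorted(set(new) - set(old))
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return missing, changed, added
```

Saving the manifest (e.g. as JSON) somewhere other than the disk being checked, then rebuilding and comparing it on a schedule, gives the "external" part of the check. Recovery of anything reported as missing or changed is still manual, from backup.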


David

