
Re: deduplicating file systems: VDO with Debian?



On 07.11.2022 02:57, hw wrote:
> Hi,
>
> Is there no VDO in Debian, and what would be good to use for deduplication with
> Debian?  Why isn't VDO in the standard kernel? Or is it?

I used VDO on Debian some time ago and don't remember any big problems. AFAIR I compiled it myself; there were no prebuilt packages.

I switched to btrfs for other reasons, not for performance. The VDO layer does eat performance, but compared to plain ext4 even btrfs is slow.

> I'm not looking for deduplication that happens some time after files have
> already been written like btrfs would allow: There is no point in
> deduplicating backups after they're done because I don't need to save disk
> space for them when I can fit them in the first place.

That's only one point, and I don't think it's a really valid one, as you typically do not run into space problems with one single action (YMMV). Running multiple sessions with out-of-band deduplication between them works for me.

In-band deduplication (that's the one you want) has some drawbacks, too: high resource usage. You need plenty of RAM (up to several gigabytes per terabyte of storage), and write completion is delayed (-> slow direct I/O).
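
To give a feeling for where "several gigabytes per terabyte" can come from, here is a rough back-of-the-envelope sketch. It assumes a naive index that keeps one entry in RAM for every block on the device. The 4 KiB block size and the 16 bytes per entry are my assumptions for illustration only, not figures taken from VDO or any other real implementation, which may index far more economically.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3


def index_ram_bytes(storage_bytes, block_size=4096, entry_bytes=16):
    """Estimate RAM for a naive in-memory dedup index.

    Assumes one index entry per block of storage. Both block_size and
    entry_bytes are illustrative assumptions, not values from a real tool.
    """
    blocks = storage_bytes // block_size
    return blocks * entry_bytes


# 1 TiB / 4 KiB = 2**28 blocks; at 16 bytes each that is 4 GiB of index.
print(index_ram_bytes(TIB) / GIB)  # prints 4.0
```

Larger blocks or smaller entries shrink the index accordingly, at the cost of coarser matching or more hash collisions to verify.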

There are several implementations of out-of-band deduplication. File-based dedup on a per-directory basis can be very fast and resource-economical, for example via rdfind or jdupes. Block-based dedup, such as bees for btrfs (that's the one I use), is closer to in-band deduplication, including the high RAM usage. Bees can be switched off and on at any time (for example on a small home system that runs more demanding tasks from time to time), and switching it on again resumes from the last state: it starts at the last transaction ID it processed, since btrfs keeps track of its transactions.
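
To illustrate why file-based dedup can be so cheap, here is a minimal sketch of the general idea: group files by size first, and hash only those that share a size with another file. This is not the actual algorithm of rdfind or jdupes; the function names `file_digest` and `find_duplicates` are made up for this example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents hash identically."""
    # Pass 1: bucket regular files by size; this needs only metadata.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share their size with another file.
    groups = []
    for size, paths in by_size.items():
        if size == 0 or len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_duplicates("/some/backup/dir")` returns groups of identical files. What to do with them (delete, hard-link, reflink) is a separate decision that the real tools let you choose.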

regards
hede

