
Re: deduplicating file systems: VDO with Debian?



On Mon, 2022-11-07 at 09:30 -0500, Dan Ritter wrote:
> didier gaumet wrote: 
> > 
> > I may be mistaken, but I think there is a confusion here about a
> > deduplication at filesystem level and at backup tool level.
> > 
> > At (linux) filesystem level, I think in-line deduplication is only provided
> > by ZFS (and perhaps, out-of-tree, BTRFS)
> 
> ZFS deduplication is a special beast that usually does not make
> people happy. It is an enterprise feature that really only works
> for special cases, and requires a lot of RAM - 1GB per 1TB of
> storage - to work. Worst of all, it cannot be gracefully turned
> off.

Only 1GB/1TB?  The FreeBSD handbook says 5-6GB of RAM per 1TB of deduplicated
data.  I could live with 1:1, and I wouldn't need to turn it off.  The idea, in
this case, is to keep two generations of backups of the "same" data without
needing the disk space for both of them.
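
To put the two ratios side by side, here is a rough sketch of the arithmetic.
The 8TB pool size is a made-up example, and the result is only a rule-of-thumb
estimate, not a measurement of any real dedup table.

```python
def dedup_ram_gb(pool_tb, gb_per_tb):
    """Rule-of-thumb RAM estimate: pool size (TB) times a GB-per-TB ratio."""
    return pool_tb * gb_per_tb


POOL_TB = 8  # hypothetical pool size, not taken from this thread

# Ratios mentioned in this thread: 1 GB/TB, and the handbook's 5 to 6 GB/TB.
for label, ratio in [("1GB/TB", 1), ("5GB/TB", 5), ("6GB/TB", 6)]:
    print(f"{label}: about {dedup_ram_gb(POOL_TB, ratio)}GB of RAM")
```

With the handbook's figures, the same pool needs five to six times the RAM that
the 1GB/1TB rule suggests.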

> As you say, deduplication in backup systems is quite common, and works
> pretty well. There's also an on-disk non-filesystem utility, rdfind,
> which is packaged in Debian. It can discover identical files and make
> them hardlinks.

Well, if I had the disk space to hold two full copies of the data so that I
could deduplicate them afterwards, I wouldn't need to deduplicate anything.

And how would pretending to have two backups, when deduplication has left only
one actual copy, be better than having just one backup to begin with?
(Yeah, I hadn't thought of that before ...)
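
For the hardlink case at least, the concern is easy to demonstrate.  The sketch
below uses made-up file names in a temporary directory and assumes a filesystem
that supports hardlinks.  It covers hardlinks only; block-level deduplication
works differently, so this is not a statement about ZFS or VDO.

```python
import os
import tempfile


def hardlink_shares_data():
    """Hardlink one file to a second name, then overwrite it via the first name."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "backup1.dat")  # hypothetical file names
        b = os.path.join(d, "backup2.dat")
        with open(a, "w") as f:
            f.write("original")
        os.link(a, b)  # second name for the same inode, no data is copied
        same_inode = os.stat(a).st_ino == os.stat(b).st_ino
        link_count = os.stat(a).st_nlink
        with open(a, "w") as f:  # truncates and rewrites in place, same inode
            f.write("damaged")
        with open(b) as f:
            content_b = f.read()
        return same_inode, link_count, content_b


print(hardlink_shares_data())
```

Both names point at the same inode, so damage done through one name shows up
under the other as well.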

Maybe use a snapshot to create the 2nd backup?  Or what?

