
Re: deduplicating file systems: VDO with Debian?



On Mon, 2022-11-07 at 11:32 +0100, didier gaumet wrote:
> On 07/11/2022 at 10:30, hw wrote:
> 
> Hello,
> 
> Disclaimer: I am really almost ignorant about deduplication
> 
> > On Mon, 2022-11-07 at 09:14 +0100, Anders Andersson wrote:
> > > On Mon, Nov 7, 2022 at 3:04 AM hw <hw@adminart.net> wrote:
> [...]
> > > You could always buy a Red Hat Enterprise Linux license, sign up for a
> > > support contract, and ask if they could start supporting other operating
> > > systems? ("Each branch on this project is intended to work with a specific
> > > release of Enterprise Linux").
> > 
> > Huh?  What would that accomplish?
> 
> I think what Anders is trying to point out is that each VDO release is 
> specific to Red Hat, and further, is specific to a particular Red Hat 
> release. By definition that would complicate potential VDO integration 
> in Debian.

At least in theory, it should be in CentOS, but if it's so specific, who knows
if it causes compatibility issues ...

> [...]
> > Are you saying that deduplication is not
> > possible with Debian?
> 
> I may be mistaken, but I think there is a confusion here about a 
> deduplication at filesystem level and at backup tool level.
> 
> At the (Linux) filesystem level, I think in-line deduplication is only 
> provided by ZFS (and perhaps, out-of-tree, Btrfs)

That's what it seems like, except for VDO.  Unfortunately, ZFS with
deduplication enabled is said to need 5 to 6 GB of RAM for each 1 TB of data,
and that would require upgrading my server.
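
As a rough illustration of what that rule of thumb would mean in practice, here
is a small Python sketch.  The 5 and 6 GB per TB figures are only the hearsay
numbers mentioned above, not measurements, and the function name is made up for
this example.

```python
def estimate_dedup_ram_gb(data_tb, gb_per_tb_low=5.0, gb_per_tb_high=6.0):
    """Return a (low, high) RAM estimate in GB for data_tb terabytes of data.

    The defaults are the rule-of-thumb figures quoted in this thread
    (an assumption, not a measured value).
    """
    if data_tb < 0:
        raise ValueError("data_tb must not be negative")
    return (data_tb * gb_per_tb_low, data_tb * gb_per_tb_high)


# Example: a hypothetical server holding 8 TB of data.
low, high = estimate_dedup_ram_gb(8)
print(f"8 TB of data: roughly {low:.0f} to {high:.0f} GB of RAM")
```

If the rule of thumb holds, even a modest pool would ask for more RAM than many
small servers have.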

> I do not know your use case precisely, but if it is to prevent 
> duplication during backup, just use a deduplicating backup tool. It does 
> just that: it avoids duplicating backup objects before duplication can occur.
> Searching for deduplicating software packaged in Debian ('apt search 
> dedup' in a terminal) and picking out the backup tools would give you clues.

Actually, that's a good idea I hadn't thought of.  But thinking about it, is it
really a good idea?

When I want to have 2 (or more) generations of backups, do I actually want
deduplication?  It leaves me with only one actual copy of the data which seems
to defeat the idea of having multiple generations of backups at least to some
extent.
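
To make that concern concrete, here is a toy Python sketch of how a chunk-based
deduplicating store might keep two backup generations.  It is only a model
under my own assumptions, not how any particular backup tool works: the class
name DedupStore, the 4-byte chunk size, and the generation names are all made
up for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical and tiny on purpose, to keep the example readable


class DedupStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self):
        self.chunks = {}       # SHA-256 hex digest -> chunk bytes
        self.generations = {}  # generation name -> ordered list of digests

    def add_generation(self, name, data):
        digests = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if this content has not been seen before.
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.generations[name] = digests

    def restore(self, name):
        return b"".join(self.chunks[d] for d in self.generations[name])

    def logical_size(self):
        # Bytes the generations would occupy without deduplication.
        return sum(len(self.chunks[d])
                   for digests in self.generations.values() for d in digests)

    def physical_size(self):
        # Bytes actually stored.
        return sum(len(chunk) for chunk in self.chunks.values())


store = DedupStore()
store.add_generation("monday", b"AAAABBBBCCCC")
store.add_generation("tuesday", b"AAAABBBBDDDD")
logical, physical = store.logical_size(), store.physical_size()

# Simulate damage to one chunk that both generations share.
shared = store.generations["monday"][0]
store.chunks[shared] = b"XXXX"
monday_ok = store.restore("monday") == b"AAAABBBBCCCC"
tuesday_ok = store.restore("tuesday") == b"AAAABBBBDDDD"
print(logical, physical, monday_ok, tuesday_ok)
```

In this model the two generations add up to 24 logical bytes but only 16 are
stored, and damaging the one shared chunk breaks both generations at once,
which is exactly the worry.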

The question then is whether it makes a difference.  It also raises the
question of whether I need (or want) multiple generations of backups,
especially when I end up with only one copy anyway.  Hmm ...

