
Re: defining deduplication (was: Re: deduplicating file systems: VDO with Debian?)



On Mon, 2022-11-07 at 13:57 -0500, rhkramer@gmail.com wrote:
> 
> 
> I didn't (and don't) know much about deduplication (beyond what you might 
> deduce from the name), so I googled and found this article, which was helpful to
> me:
> 
>    * [[https://www.linkedin.com/pulse/lets-know-vdo-virtual-data-optimizer-ganesh-gaikwad][Lets know about VDO (virtual data optimizer)]]

That's a good pointer, but I still wonder how VDO actually works.  For example,
if I have a volume with 5TB of data on it and, a week later or whenever, I write
a 500kB file that is identical to another file somewhere within the 5TB of data
already on the volume, how does VDO figure out that both files are identical?
ZFS does it by keeping lots of data in memory so it can look it up right away,
but VDO?  Will it write the new file at first, check it later in the background,
and re-use the space later, or will it delay the write to check it first?  Or
does it do something else?
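
To make the question concrete, here is a minimal sketch of the "check first"
variant, i.e. looking up each block in an in-memory index before storing it.
This is only an illustration of the general technique and makes no claim about
how VDO (or ZFS) is actually implemented.  The 4 KiB block size, the use of
SHA-256, and the names DedupStore, write_block and write_file are all
assumptions made up for this example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


class DedupStore:
    """Toy in-memory block store with inline deduplication (hypothetical)."""

    def __init__(self):
        self.index = {}     # block digest -> physical block number
        self.blocks = []    # stands in for the physical storage
        self.refcount = []  # number of logical references per physical block

    def write_block(self, data):
        """Store one block and return its physical block number."""
        digest = hashlib.sha256(data).digest()
        pbn = self.index.get(digest)
        # Compare the bytes as well, so a hash collision cannot alias data.
        if pbn is not None and self.blocks[pbn] == data:
            self.refcount[pbn] += 1  # duplicate: reuse the existing block
            return pbn
        self.blocks.append(data)     # new content: store it and index it
        self.refcount.append(1)
        pbn = len(self.blocks) - 1
        self.index[digest] = pbn
        return pbn

    def write_file(self, data):
        """Split data into blocks; return the list of physical block numbers."""
        return [self.write_block(data[i:i + BLOCK_SIZE])
                for i in range(0, len(data), BLOCK_SIZE)]


# Write the same 500kB payload twice; the second write adds no new blocks.
payload = b"".join(i.to_bytes(4, "big") * 1024 for i in range(123))[:500_000]
store = DedupStore()
first = store.write_file(payload)
stored_after_first = len(store.blocks)
second = store.write_file(payload)
stored_after_second = len(store.blocks)
```

Note that this works per block rather than per file, and that the index has to
be consulted on every write, which is where the memory cost comes from.  The
"write now, check later" variant would store the block unconditionally and run
the same lookup from a background job instead.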

