
dpkg and hardlinks



Hi guys,

I'm currently thinking about deduplication[1] on my Debian systems.
As you probably know, the idea behind deduplication is to replace files
that have the exact same content with hardlinks to a single copy, which
is sometimes a good idea, at least to reclaim needlessly used disk
space.
The problem is that files are sometimes identical at a given time but
are meant to evolve separately. So whether deduplication is safe for
your data basically depends on how that data is actually used (don't
try this at home without knowing what you are doing!).
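
To make the idea concrete, here is a minimal sketch of what I mean by
deduplication, written in Python. The function names (file_digest,
dedup_tree) are my own, not from any existing tool, and the sketch makes
simplifying assumptions: everything lives on one filesystem, and
ownership, permissions and timestamps are ignored when deciding that two
files are "the same".

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedup_tree(root):
    """Replace identical regular files under root with hardlinks.

    Returns the number of files that were turned into hardlinks.
    Assumption: root is on a single filesystem, and file metadata
    (owner, mode, mtime) does not matter.
    """
    first_seen = {}  # digest -> path of the first file with that content
    replaced = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            original = first_seen.setdefault(file_digest(path), path)
            if original == path or os.path.samefile(original, path):
                continue  # first occurrence, or already hardlinked
            # Create the link under a temporary name, then rename it over
            # the duplicate so the path never disappears in between.
            tmp = path + ".dedup-tmp"
            os.link(original, tmp)
            os.replace(tmp, path)
            replaced += 1
    return replaced
```

After such a run, every group of identical files shares one inode, which
is exactly why the files must never be meant to change independently.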

For files from packages, though, deduplication might be a good idea, as
dpkg is supposedly the only program that ever modifies those files
(under /usr, for example).
However, I don't know how dpkg treats hardlinks. Does it "break" the
hardlink before replacing a file, or does it write over the file
regardless of whether it is a hardlink?
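
To illustrate the two behaviours I am asking about, here is a small
Python sketch. It does not use dpkg at all and I have not checked which
of the two dpkg corresponds to; the function names are hypothetical and
only there to show the difference on a hardlinked pair of files.

```python
import os
import tempfile


def replace_in_place(path, data):
    """Overwrite the existing inode: every hardlink sees the new content."""
    with open(path, "wb") as f:
        f.write(data)


def replace_by_rename(path, data):
    """Write a new file and rename it over path: the hardlink is broken."""
    tmp = path + ".new"
    with open(tmp, "wb") as f:
        f.write(data)
    os.replace(tmp, path)


def read(path):
    with open(path, "rb") as f:
        return f.read()


def demo():
    """Return (b after in-place, still linked?, b after rename, still linked?)."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a")
        b = os.path.join(d, "b")
        with open(a, "wb") as f:
            f.write(b"version 1")
        os.link(a, b)  # a and b are now two names for one inode

        replace_in_place(a, b"version 2")
        in_place = (read(b), os.path.samefile(a, b))

        replace_by_rename(a, b"version 3")
        renamed = (read(b), os.path.samefile(a, b))
        return in_place + renamed
```

With the in-place variant, updating one file silently changes every file
deduplicated against it. With the rename variant, the other names keep
the old content and the disk space gained by deduplication is lost again
on the next upgrade.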

Some packages are particularly affected by duplicated data (for example,
packages that ship .ppd files).


[1] http://en.wikipedia.org/wiki/Data_deduplication

