
Re: Bug#509685: ITP: hardlink -- Hardlink multiple copies of the same file



On Thu, Jan 15, 2009 at 15:10, Andrew Vaughan <ajv-lists@netspace.net.au> wrote:
> On Fri, 26 Dec 2008, John Goerzen wrote:
>> Julian Andres Klode wrote:
>> >  Hardlink is a tool which detects multiple copies of the same file and
>> > replaces them with hardlinks.
>> >  .
>> >  The idea has been taken from http://code.google.com/p/hardlinkpy/, but
>> > the code has been written from scratch and licensed under the MIT
>> > license.
>>
>> Do we really need another tool like this?
>>
>> We already have these packages:
>>
>>   fdupes
>>   perforate
>>
> Hi John
>
> I think that's a little harsh.  There are lots of apps in Debian that
> provide similar functionality to other apps in Debian.  I do agree that iff
> hardlink is only duplicating functionality available in finddup, then there
> is no point in maintaining both.

Are you mapping "finddup" (and the other spellings you use for it in
this email) to "fdupes"? If so, read on; if not, ignore the rest.

> Finddup assumes that the file list will fit in memory.  This is a
> showstopper for me.  Attempting to run finddup on my home server over a
> partial backup set of a single day (1,898,219 files) resulted in
> unacceptable memory usage (739MB after 4 hours on a machine with 512MB
> physical ram.  This resulted in swap usage of over 600MB, and a 30 sec ssh
> login time).

Do you have a better algorithm that achieves the same task without
holding the whole file list in memory? If yes, write a patch and make
fdupes a better tool. Only complaining is not an option.
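
To illustrate one direction such a patch could take, here is a minimal
sketch that keeps the file list in an on-disk SQLite table instead of in
RAM and only hashes files whose sizes collide. This is an assumption-laden
illustration, not how fdupes or hardlink work; the function names
(index_tree, file_digest, duplicate_groups) and the table layout are made
up for the example.

```python
import hashlib
import os
import sqlite3
import stat


def index_tree(db, root):
    """Record (size, path) for every regular file under root in a table."""
    db.execute("CREATE TABLE IF NOT EXISTS files (size INTEGER, path BLOB)")
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                # Store the path as bytes so non-UTF-8 names survive.
                db.execute("INSERT INTO files VALUES (?, ?)",
                           (st.st_size, os.fsencode(path)))
    db.execute("CREATE INDEX IF NOT EXISTS by_size ON files (size)")
    db.commit()


def file_digest(path, chunk=1 << 20):
    """SHA-256 of a file, read in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def duplicate_groups(db):
    """Yield lists of paths sharing size and digest, one size at a time."""
    # Only the distinct colliding sizes are loaded here, not the paths.
    sizes = db.execute(
        "SELECT size FROM files GROUP BY size HAVING COUNT(*) > 1 ORDER BY size"
    ).fetchall()
    for (size,) in sizes:
        by_digest = {}
        rows = db.execute(
            "SELECT path FROM files WHERE size = ? ORDER BY path", (size,))
        for (raw,) in rows:
            path = os.fsdecode(raw)
            by_digest.setdefault(file_digest(path), []).append(path)
        for paths in by_digest.values():
            if len(paths) > 1:
                yield paths
```

With a database opened on disk (for example sqlite3.connect on a scratch
file), peak memory is bounded by the largest group of same-sized files
rather than by the total number of files. A real tool would still want a
byte-by-byte comparison before linking anything.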

> Findup lacks an option to require matching timestamps before hardlinking.
> This discards info that can be useful in a backup, and results in rsync
> thinking that the files have changed, and retransmitting them anyway.

Patches are welcome.
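
For what such a check might look like, here is a hedged sketch of a
"link only if the modification times match" step. It is not an existing
fdupes option; replace_with_hardlink and its require_same_mtime parameter
are hypothetical names, and the caller is assumed to have already
verified that the two files have identical content.

```python
import os


def replace_with_hardlink(keep, dup, require_same_mtime=True):
    """Replace dup with a hard link to keep; return True if it was linked."""
    st_keep, st_dup = os.lstat(keep), os.lstat(dup)
    if st_keep.st_dev != st_dup.st_dev:
        return False  # hard links cannot cross filesystems
    if st_keep.st_ino == st_dup.st_ino:
        return False  # already the same inode, nothing to do
    if require_same_mtime and st_keep.st_mtime_ns != st_dup.st_mtime_ns:
        return False  # keep both copies so the timestamp is not lost
    # Link under a temporary name first, then rename over the duplicate,
    # so dup is never missing if the process is interrupted.
    tmp = dup + ".hardlink-tmp"
    os.link(keep, tmp)
    os.replace(tmp, dup)
    return True
```

Skipping files whose mtimes differ keeps each backup's timestamps intact,
which is the property rsync relies on when deciding what to retransmit.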

> Finddup's syntax for specifying directories to link is clumsy when what I
> really want to link is /srv/dirvish/*/2009.01.1*/tree.

If you know how to make it nicer, write a patch and send it.

Regards,
-- 
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi

