
Re: Bug#662080: ITP: hadori -- Hardlinks identical files

On Sunday, 4 March 2012 at 00:31 +0100, Timo Weingärtner wrote:

>  Advantages over other hardlinking tools:
>   * predictability: arguments are scanned in order, each first version is kept
>   * much lower CPU and memory consumption
>   * hashing option: speedup on many equal-sized, mostly identical files
> The initial comparison was with hardlink, which got OOM killed with a hundred 
> backups of my home directory. Last night I compared it to duff and rdfind 
> which would have happily linked files with different st_mtime and st_mode.

There is also fdupes, which does mostly the same thing. I do not know
how it compares to your program.

I, for one, would like a program that, starting from some paths on the
same hard drive, would find all identical files (ignoring mtime and
mode; this is for backups and I do not care about them), hardlink them
(keeping the mtime and mode of whichever copy comes first), and *store
the mapping [filename (or inode), size, mtime] => hash*, so that files
not modified since the last run are not hashed again. This would
drastically reduce the time spent on offline deduplication of my backup
volumes (and people who modify files without updating their mtimes
should be thrown into a large lake of boiling vinegar). Options to treat
mtime and mode as discriminating would be a plus. One day I'll write
this myself, if it is not reinventing something.
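
To make the idea concrete, here is a rough Python sketch of what I have
in mind. It is not how hadori or any existing tool works. The names
(dedup, cached_hash), the JSON cache file and the choice of SHA-256 are
all assumptions made for illustration. It assumes every path lives on
one filesystem (os.link fails otherwise) and it trusts the hash without
a final byte-by-byte comparison, which a real tool would probably add.

```python
import hashlib
import json
import os


def file_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def cached_hash(path, cache):
    """Hash a file only if its (inode, size, mtime) key is not cached yet."""
    st = os.stat(path)
    # Hypothetical key layout; assumes a single filesystem, so no device id.
    key = f"{st.st_ino}:{st.st_size}:{st.st_mtime_ns}"
    if key not in cache:
        cache[key] = file_hash(path)
    return cache[key]


def dedup(roots, cache_path):
    """Hardlink identical files under roots; return how many links were made."""
    try:
        with open(cache_path) as f:
            cache = json.load(f)
    except FileNotFoundError:
        cache = {}
    first_seen = {}  # (size, hash) -> first path found, whose mtime/mode win
    linked = 0
    for root in roots:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # deterministic traversal order
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                digest = cached_hash(path, cache)
                size = os.path.getsize(path)
                keeper = first_seen.setdefault((size, digest), path)
                if keeper == path or os.path.samefile(keeper, path):
                    continue  # first copy, or already hardlinked
                # Link under a temporary name, then rename over the duplicate,
                # so the path never disappears if we are interrupted.
                tmp = path + ".dedup-tmp"
                os.link(keeper, tmp)
                os.replace(tmp, path)
                linked += 1
    with open(cache_path, "w") as f:
        json.dump(cache, f)
    return linked
```

Something like dedup(["/backups/2012-03-01", "/backups/2012-03-02"],
"/var/cache/dedup.json") (made-up paths) would then rehash only files
whose inode, size or mtime changed since the previous run. Stale cache
entries are never pruned here, and a reused inode number with the same
size and mtime would produce a false cache hit, so treat it as a
starting point only.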

If anyone happens to see such a beast, please tell me :)
Jean-Christophe Dubacq <jcdubacq1@free.fr>
