On 03/13/2017 02:01 AM, Jonathan Dowland wrote:
> On Fri, Mar 10, 2017 at 10:00:45PM -0800, David Christensen wrote:
>>> I'd always put a step 0) in there: is imaging what you want to do?
>>> Consider a file-level backup with rsync (etc etc, as discussed
>>> elsewhere in this thread)
>>
>> I do imaging for system disks. I do backups and archives for data.
>
> So having evangelised file-level copies a few times in this thread, I
> found myself wondering if I would have been better off with imaging
> this very weekend. Copying a 2.1T filesystem from an internal SATA2
> disk to an external one (my regular backup drive to my once-a-month,
> lives off-site one) via USB3 took nearly 48 hours via "rsync -a",

2.1 TB / 48 hr / 3600 s/hr = 12.2 MB/s

I was also disappointed by the transfer rate of external USB drives on Debian. Firewire is better. eSATA is best.
I now use 3 TB Seagate ST3000DM001 desktop drives in StarTech DRW115SATBK mobile docks connected to motherboard and/or HBA SATA ports. With LUKS and a Pentium D 945 (no AES-NI), I see 40 MB/s. With LUKS and a Core i7-2600S (AES-NI), I see 220 MB/s.
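
The gap between those two LUKS figures is consistent with the CPU's hardware AES support. As a rough sketch (assuming an x86 Linux system, where /proc/cpuinfo has a "flags" line; the function name has_aes_flag is made up for illustration), the flag can be checked like this:

```shell
#!/bin/sh
# has_aes_flag [FILE]: succeed if the cpuinfo-style FILE (default
# /proc/cpuinfo) lists the "aes" CPU flag on its first "flags" line.
# Assumption: x86 Linux layout; other architectures label this differently.
has_aes_flag() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" 2>/dev/null | grep -qw aes
}

if has_aes_flag; then
    echo "aes flag present: AES-NI should be usable"
else
    echo "aes flag not found: expect software AES speeds"
fi

# With cryptsetup installed, this reports in-memory cipher throughput,
# which is an upper bound on what LUKS can do on this CPU:
#   cryptsetup benchmark
```

The benchmark numbers exclude disk I/O, so real transfers will be slower than what it prints.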
> and the destination ended up bigger, possibly because one or more of
> the backups on the source had been using some kind of hardlink de-dupe
> (I've ranted about hardlink trees being a problem in various backup
> topics on -user, too...) and I didn't think to supply -S to rsync.

-S is for sparse files.

Doing a quick test, it appears that 'rsync -a' copies hard-linked files as if each were a different file:

2017-03-13 20:33:46 dpchrist@jesse ~/sandbox/rsync $ cat hard-link
#!/bin/sh
# Test 'rsync -a' and hard links
# $Id: hard-link,v 1.2 2017/03/14 03:33:15 dpchrist Exp $
# by David Paul Christensen dpchrist@holgerdanske.com
# Public Domain
rm -rf hard-link-1
rm -rf hard-link-2
mkdir hard-link-1
mkdir hard-link-2
echo "hello, world!" > hard-link-1/hello.txt
ln hard-link-1/hello.txt hard-link-1/link-1.txt
ln hard-link-1/hello.txt hard-link-1/link-2.txt
ln hard-link-1/hello.txt hard-link-1/link-3.txt
ln hard-link-1/hello.txt hard-link-1/link-4.txt
ls -li hard-link-1/*
du -b hard-link-1/*
rsync -a hard-link-1/ hard-link-2
ls -li hard-link-2/*
du -b hard-link-2/*

2017-03-13 20:34:18 dpchrist@jesse ~/sandbox/rsync $ sh hard-link
271759 -rw-r--r-- 5 dpchrist dpchrist 14 Mar 13 20:34 hard-link-1/hello.txt
271759 -rw-r--r-- 5 dpchrist dpchrist 14 Mar 13 20:34 hard-link-1/link-1.txt
271759 -rw-r--r-- 5 dpchrist dpchrist 14 Mar 13 20:34 hard-link-1/link-2.txt
271759 -rw-r--r-- 5 dpchrist dpchrist 14 Mar 13 20:34 hard-link-1/link-3.txt
271759 -rw-r--r-- 5 dpchrist dpchrist 14 Mar 13 20:34 hard-link-1/link-4.txt
14 hard-link-1/hello.txt
271760 -rw-r--r-- 1 dpchrist dpchrist 14 Mar 13 20:34 hard-link-2/hello.txt
271761 -rw-r--r-- 1 dpchrist dpchrist 14 Mar 13 20:34 hard-link-2/link-1.txt
271762 -rw-r--r-- 1 dpchrist dpchrist 14 Mar 13 20:34 hard-link-2/link-2.txt
271763 -rw-r--r-- 1 dpchrist dpchrist 14 Mar 13 20:34 hard-link-2/link-3.txt
271764 -rw-r--r-- 1 dpchrist dpchrist 14 Mar 13 20:34 hard-link-2/link-4.txt
14 hard-link-2/hello.txt
14 hard-link-2/link-1.txt
14 hard-link-2/link-2.txt
14 hard-link-2/link-3.txt
14 hard-link-2/link-4.txt

Is anyone aware of a utility that can walk a file system and replace identical files with hard links?
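
For what it's worth, rsync's -H (--hard-links) option is not implied by -a, so running the test with 'rsync -aH' should keep the links. For a tree that was already copied without it, the idea of walking the tree and re-linking identical files can be sketched as below. This is only an illustration, not a substitute for a dedicated tool: the function name dedupe_links is made up, and it assumes GNU coreutils, a single filesystem, and file names without newlines, backslashes, or trailing spaces.

```shell
#!/bin/sh
# dedupe_links DIR: hard-link regular files under DIR whose content is identical.
# Caveat: linked files share one inode, so the ownership, mode and mtime
# of every replaced copy are lost. Empty files are skipped.
dedupe_links() {
    find "$1" -type f -size +0 -exec sha256sum {} + | sort |
    while read -r sum path; do
        # After sorting, files with equal checksums are adjacent.
        if [ "$sum" = "$prev_sum" ] && cmp -s -- "$keep" "$path"; then
            # Identical content: link unless the two names already share an inode
            [ "$keep" -ef "$path" ] || ln -f -- "$keep" "$path"
        else
            prev_sum=$sum
            keep=$path
        fi
    done
}

# Example, using the destination directory from the test above:
#   dedupe_links hard-link-2 && ls -li hard-link-2/*
```

Existing utilities such as rdfind (with "-makehardlinks true") or jdupes (with -L) do the same job with more care about edge cases.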
> The real test will be how long an incremental catch-up will take in
> the future.

For new large files, the size of the files divided by 12.2 MB/s. For everything else, longer.
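
The arithmetic can be sketched with two small helpers (the names rate_mb_s and eta_hours are made up here, and MB means 10^6 bytes). It only covers the large-file case; runs dominated by many small or unchanged files depend on per-file overhead, not on this rate.

```shell
#!/bin/sh
# rate_mb_s BYTES SECONDS: average transfer rate in MB/s (MB = 10^6 bytes)
rate_mb_s() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1e6 }'
}

# eta_hours BYTES RATE_MB_S: hours needed to copy BYTES at RATE_MB_S
eta_hours() {
    awk -v b="$1" -v r="$2" 'BEGIN { printf "%.1f\n", b / (r * 1e6) / 3600 }'
}

rate_mb_s 2100000000000 172800   # 2.1 TB in 48 hr -> 12.2
eta_hours 100000000000 12.2      # 100 GB of new large files -> 2.3
```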
David