On 9/2/23 12:15, Michel Verdier wrote:
On 2023-09-02, Stefan Monnier wrote:

I switched to Bup a few years ago and saw a significant reduction in the size of my backups that is partly due to the deduplication *between* machines (I backup several Debian machines to the same backup repository) as well as because the deduplication occurs even when I move files around (most obvious when I move directories filled with large files like videos or music).

I setup deduplication between hosts with rsnapshot as you do. But it was a small gain in my case as the larger part was users data, logs and the like. So always different between hosts. I gain only on system files. Mainly /etc as I don't backup binaries and libs. I almost never move large directories. But if needed it's easy to move it also in rsnapshot directories.
I have a SOHO LAN:

* My primary workstation is Debian Xfce on a 60GB 2.5" SATA SSD with 1G boot, 1G swap, and 12G root partitions. It has one user (myself) with minimal home data (e-mail and CVS working directories). I back up boot and root.
* I keep the vast majority of my data on a FreeBSD server with Samba and the CVS repository (via SSH) on a ZFS stripe of two mirrors containing two 3TB 3.5" SATA HDDs each (i.e. 6TB RAID10). I back up the Samba data.
* I run rsync(1) and homebrew shell/Perl scripts on the server to back up the various LAN sources to a backup destination file system tree on the server. I have enabled ZFS compression on the pool and enabled deduplication on the backup tree.
I ran some statistics for the daily driver backups in March. The results were 4.9 GB backup size, 258 backups, 1.2 TB apparent total backup storage, and 29.0 GB actual total backup storage. So, a savings of about 42:1:
https://www.mail-archive.com/debian-user@lists.debian.org/msg789807.html

Today, I collected some statistics for the backups of my data on the file server:

2023-09-02 14:10:30 toor@f3 ~ # du -hsx /jail/samba/var/local/samba/dpchrist
693G    /jail/samba/var/local/samba/dpchrist

2023-09-02 14:11:09 toor@f3 ~ # ls /jail/samba/var/local/samba/dpchrist/.zfs/snapshot | wc -l
98

2023-09-02 14:13:50 toor@f3 ~ # du -hs /jail/samba/var/local/samba/dpchrist/.zfs/snapshot
67T     /jail/samba/var/local/samba/dpchrist/.zfs/snapshot

2023-09-02 14:19:24 toor@f3 ~ # zfs get compression,compressratio,dedup,used,usedbydataset,usedbysnapshots p3/ds2/samba/dpchrist | sort
NAME                   PROPERTY         VALUE  SOURCE
p3/ds2/samba/dpchrist  compression      lz4    inherited from p3
p3/ds2/samba/dpchrist  compressratio    1.02x  -
p3/ds2/samba/dpchrist  dedup            off    default
p3/ds2/samba/dpchrist  used             777G   -
p3/ds2/samba/dpchrist  usedbydataset    693G   -
p3/ds2/samba/dpchrist  usedbysnapshots  84.2G  -

So, 693 GB backup size, 98 backups, 67 TB apparent total backup storage, and 777 GB actual total backup storage. So, a savings of about 88:1.
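
For anyone who wants to compare their own numbers, the two savings ratios above are simply apparent storage divided by actual storage. Here is a minimal Python sketch of that arithmetic. The helper names (to_bytes, savings_ratio) are made up for illustration, and it assumes the size suffixes are powers of 1024, which is what du -h and zfs get print by default on FreeBSD; the units of the March daily driver figures are assumed to be the same.

```python
# Rough check of the savings ratios quoted in this post.
# Assumption: K/M/G/T suffixes are powers of 1024 (binary units).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(size: str) -> float:
    """Convert a human-readable size such as '693G' or '84.2G' to bytes."""
    return float(size[:-1]) * UNITS[size[-1].upper()]


def savings_ratio(apparent: str, actual: str) -> float:
    """Apparent total backup storage divided by actual storage used."""
    return to_bytes(apparent) / to_bytes(actual)


# File server data: 67T apparent across snapshots, 777G actually used.
print(f"file server:  {savings_ratio('67T', '777G'):.1f}:1")    # 88.3:1
# Daily driver backups (March figures): 1.2 TB apparent, 29.0 GB actual.
print(f"daily driver: {savings_ratio('1.2T', '29.0G'):.1f}:1")  # 42.4:1
```

With decimal (powers of 1000) units the results would come out slightly lower, about 86:1 and 41:1, so the choice of unit matters a little when comparing figures between tools.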
What statistics are other readers seeing for similar use-cases and their backup solutions?
David