
Re: Backup systems



On 9/2/23 12:15, Michel Verdier wrote:
On 2023-09-02, Stefan Monnier wrote:

I switched to Bup a few years ago and saw a significant reduction in the
size of my backups that is partly due to the deduplication *between*
machines (I back up several Debian machines to the same backup
repository) as well as because the deduplication occurs even when I move
files around (most obvious when I move directories filled with large
files like videos or music).

I set up deduplication between hosts with rsnapshot, as you do. But it was
a small gain in my case, as the larger part was user data, logs and the
like, which always differ between hosts. I gain only on system
files, mainly /etc, as I don't back up binaries and libs.
I almost never move large directories. But if needed, it's easy to move
them in the rsnapshot directories as well.


I have a SOHO LAN:

* My primary workstation is Debian Xfce on a 60GB 2.5" SATA SSD with 1G boot, 1G swap, and 12G root partitions. It has one user (myself) with minimal home data (e-mail and CVS working directories). I back up boot and root.

* I keep the vast majority of my data on a FreeBSD server with Samba and the CVS repository (via SSH) on a ZFS stripe of two mirrors containing two 3TB 3.5" SATA HDDs each (i.e. 6TB RAID10). I back up the Samba data.

* I run rsync(1) and homebrew shell/Perl scripts on the server to back up the various LAN sources to a backup destination file system tree on the server. I have enabled ZFS compression on the pool and enabled deduplication on the backup tree.
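
As a rough illustration of the approach in the last bullet, a minimal sketch might look like the following. The source host, dataset name, and mount point are made-up placeholders, and taking a ZFS snapshot after each rsync run is an assumption; the actual homebrew scripts are not shown in this message. Commands are only printed unless RUN=1 is set.

```shell
# Hypothetical placeholders, not the real hosts, datasets, or paths.
SRC="user@workstation:/"
DEST_DS="pool/backup/workstation"
DEST_DIR="/backup/workstation"

# Print each command; execute it only when RUN=1, so the sketch is safe to try.
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# backup_once STAMP: mirror the source, then snapshot the destination dataset.
backup_once() {
    stamp=$1    # e.g. "$(date +%Y%m%d-%H%M%S)"
    # -a preserves metadata, -x stays on one file system,
    # --delete removes files that have vanished from the source.
    run rsync -a -x --delete "$SRC" "$DEST_DIR/"
    # Assumption: keep each run as a snapshot; unchanged blocks are shared
    # between the snapshot and the live dataset.
    run zfs snapshot "$DEST_DS@$stamp"
}
```

Calling `backup_once 20230902` prints the rsync and zfs commands it would run; compression and dedup would be set once on the destination with `zfs set`, not per run.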


I ran some statistics for the daily driver backups in March. The results were 4.9 GB backup size, 258 backups, 1.2 TB apparent total backup storage, and 29.0 GB actual total backup storage. That is a savings of about 42:1 (apparent divided by actual):

https://www.mail-archive.com/debian-user@lists.debian.org/msg789807.html


Today, I collected some statistics for the backups of my data on the file server:

2023-09-02 14:10:30 toor@f3 ~
# du -hsx /jail/samba/var/local/samba/dpchrist
693G	/jail/samba/var/local/samba/dpchrist

2023-09-02 14:11:09 toor@f3 ~
# ls /jail/samba/var/local/samba/dpchrist/.zfs/snapshot | wc -l
      98

2023-09-02 14:13:50 toor@f3 ~
# du -hs /jail/samba/var/local/samba/dpchrist/.zfs/snapshot
 67T	/jail/samba/var/local/samba/dpchrist/.zfs/snapshot

2023-09-02 14:19:24 toor@f3 ~
# zfs get compression,compressratio,dedup,used,usedbydataset,usedbysnapshots p3/ds2/samba/dpchrist | sort
NAME                   PROPERTY         VALUE          SOURCE
p3/ds2/samba/dpchrist  compression      lz4            inherited from p3
p3/ds2/samba/dpchrist  compressratio    1.02x          -
p3/ds2/samba/dpchrist  dedup            off            default
p3/ds2/samba/dpchrist  used             777G           -
p3/ds2/samba/dpchrist  usedbydataset    693G           -
p3/ds2/samba/dpchrist  usedbysnapshots  84.2G          -


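In summary: 693 GB backup size, 98 backups, 67 TB apparent total backup storage, and 777 GB actual total backup storage (693 GB in the live dataset plus 84.2 GB held by snapshots), for a savings of about 88:1. With dedup off and a compression ratio of only 1.02x on this dataset, nearly all of that comes from the snapshots sharing unchanged blocks.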
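
For anyone who wants to check the arithmetic or compare their own numbers, the two ratios can be reproduced with a small helper. This is only a sketch: the function name `ratio` is made up here, and it assumes the T and G figures are binary units (1T = 1024G), which is how `du -h` and `zfs get` report sizes.

```shell
# ratio APPARENT ACTUAL: print APPARENT/ACTUAL to one decimal place.
# Both arguments must be in the same unit (G here).
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# March workstation backups: 1.2T apparent (1.2 * 1024 = 1228.8G), 29.0G actual.
ratio 1228.8 29.0    # prints 42.4

# File server snapshots: 67T apparent (67 * 1024 = 68608G), 777G actual.
ratio 68608 777      # prints 88.3
```

Exact byte counts, if wanted, can be read with `zfs get -Hp used` on the dataset instead of the rounded human-readable values.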


What statistics are other readers seeing for similar use-cases and their backup solutions?


David

