Re: backup of backup or alternating backups?

To: debian-user@lists.debian.org
Subject: Re: backup of backup or alternating backups?
From: Andy Smith <andy@strugglers.net>
Date: Tue, 8 Oct 2024 22:13:49 +0000
Message-id: <ZwWunVcpkuUEvnpC@mail.bitfolk.com>
In-reply-to: <20241007195255.51b1c54f@tsalmoth>
References: <d61ec86d8128f14c2e6533166b0e22d5dd42a8ec.camel@gmail.com> <D4JX2V6U5BB1.XAGF3R80L0CD@debian.org> <9acf284d3c83ab8c1c92646015b74c4a99372a40.camel@gmail.com> <D4OY1EQA9YXD.PBXD06GI3WA5@debian.org> <87o73wux8e.fsf@free.fr> <D4PTYBOOZO3R.3O5TTWEGV1ULQ@debian.org> <20241007195255.51b1c54f@tsalmoth>

Hi,

On Mon, Oct 07, 2024 at 07:52:55PM -0600, Charles Curley wrote:
> I've used rsnapshot for several years now with no such issue. My
> rsnapshot repository resides on ext4, on its own LVM logical volume, on
> top of an encrypted RAID 5 array on four four terabyte spinning rust
> drives.
> 
> /crc/rsnapshot root@hawk:~# df -i /crc/rsnapshot/
> Filesystem                           Inodes IUsed IFree IUse% Mounted on
> /dev/mapper/hawk--vg--raid-rsnapshot    16M  3.2M   13M   21%

This really isn't that much data and you have four drives to spread
random reads across, so I'm not surprised that you don't really feel it
yet.

When you have hundreds of millions of files in rsnapshot it really
starts to hurt because every backup run involves:

- Deleting the oldest tree of files;
- Walking the entire tree of the most recent backup once to cp -al it;
  and
- Walking it all again when rsync compares the new data to your previous
  iteration.
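
As a rough sketch of those three steps (this is not rsnapshot's actual
code; the function name snapshot_run, its argument order and the daily.N
directory names are all made up for illustration, and it assumes GNU cp
and rsync are installed), one run looks something like this:

```shell
# Hypothetical sketch of one rsnapshot-style run. Not rsnapshot's own code.
# Usage: snapshot_run SRC ROOT KEEP
#   SRC  = directory to back up
#   ROOT = backup store holding daily.0 (newest) .. daily.(KEEP-1) (oldest)
snapshot_run() {
    src=$1 root=$2 keep=$3

    # 1. Delete the oldest tree: one unlink per file and directory.
    rm -rf "$root/daily.$((keep - 1))"

    # Shift the older trees up by one. Each is a single rename, so cheap.
    i=$((keep - 2))
    while [ "$i" -ge 1 ]; do
        if [ -d "$root/daily.$i" ]; then
            mv "$root/daily.$i" "$root/daily.$((i + 1))"
        fi
        i=$((i - 1))
    done

    # 2. Hard-link copy of the newest tree: walks every entry once.
    if [ -d "$root/daily.0" ]; then
        cp -al "$root/daily.0" "$root/daily.1"
    fi

    # 3. rsync walks the tree again, comparing SRC against daily.0.
    #    Changed files are replaced, so the old version survives in daily.1.
    mkdir -p "$root/daily.0"
    rsync -a --delete "$src/" "$root/daily.0/"
}
```

Steps 1 to 3 each touch every entry in a tree regardless of how little
has changed in SRC, which is where the time goes.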

Worse, it's all small, largely random IO, which is the worst case for
spinning media. It easily gets to the point where the copy and compare
steps take much longer than the actual data transfer.

Other backup solutions get better performance by using some sort of
index, manifest or other database rather than walking every inode in
the filesystem, but they are then more complicated.

The rsnapshot store I have here is really quite slow with only two
7200rpm HDDs. It spends way longer walking its data store than actually
backing up any data. I could definitely make it speedier by switching to
something else, but I like rsnapshot for this particular case.

$ sudo find /data/backup/rsnapshot -print0 | grep -zc '.'
202326554

(This is a btrfs filesystem, which doesn't report an inode count with
df -i.)

That said, what probably matters most is how many files are in the most
recent backup iteration alone, rather than in the entire rsnapshot
store. For me that is approx 5.8 million.
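
To get that per-iteration figure, the same find pipeline can be pointed
at a single snapshot directory. A small sketch, where count_entries and
the daily.0 name are my assumptions (the real directory name depends on
the retain levels in your rsnapshot.conf), and GNU grep is assumed for
the -z option:

```shell
# count_entries DIR: count every file, directory and symlink under DIR
# (DIR itself included). NUL separators keep odd filenames from
# being miscounted.
count_entries() {
    find "$1" -print0 | grep -zc '.'
}

# Hypothetical newest iteration; adjust to your own layout.
newest=/data/backup/rsnapshot/daily.0
if [ -d "$newest" ]; then
    count_entries "$newest"
fi
```

Run it as root if the store contains directories your own user can't
read, otherwise find will undercount.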

Thanks,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting
