
Re: Backup Times on a Linux desktop



On 04/11/19 15:41, deloptes wrote:
Not sure if that's true. For example, you make daily, weekly and monthly backups
(classical). Let's focus on the daily part. On day 3 a file is broken.
You have to recover from day 2. The file is not broken on day 2, correct?!

If I'm not wrong, deduplication "is a technique for eliminating duplicate copies of repeating data".

I'm not a borg expert, but it performs deduplication on data chunks.

Suppose that you back up 2000 files in a day and, inside this backup, one chunk is deduplicated and referenced by 300 files. If that deduplicated chunk is broken, I think you will lose it in all 300 files that reference it. This is not good for me.

If your main dataset has a broken file, no problem: you can recover it from backups.

If your saved deduped chunk is broken, all files that reference it could be broken. I also think that the same chunk will be reused by successive backups (again because of deduplication), so this single chunk could be used from backup1 to backupN.
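
To make the concern concrete, here is a minimal sketch of a deduplicating store. It is not how borg is implemented: the `ChunkStore` class, the `affected_files` helper and the tiny fixed-size chunks are all assumptions made up for illustration. It only shows that one damaged stored chunk touches every file, in every backup, that references it.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, for illustration only


class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept exactly once."""

    def __init__(self):
        self.chunks = {}  # chunk id (SHA-256 hex digest) -> chunk bytes

    def put(self, data):
        """Split data into chunks, store the new ones, return the chunk ids."""
        ids = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            cid = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(cid, chunk)  # deduplication happens here
            ids.append(cid)
        return ids

    def get(self, ids):
        """Rebuild a file from its list of chunk ids."""
        return b"".join(self.chunks[cid] for cid in ids)


def affected_files(backups, bad_id):
    """Return (backup, filename) pairs whose chunk list references bad_id."""
    return sorted(
        (backup, name)
        for backup, files in backups.items()
        for name, ids in files.items()
        if bad_id in ids
    )


store = ChunkStore()
day1 = {"a.txt": b"HEADaaaa", "b.txt": b"HEADbbbb"}
day2 = {"a.txt": b"HEADaaaa", "b.txt": b"HEADcccc"}
backups = {
    "backup1": {name: store.put(data) for name, data in day1.items()},
    "backup2": {name: store.put(data) for name, data in day2.items()},
}

# Corrupt the single stored copy of the shared b"HEAD" chunk.
shared_id = hashlib.sha256(b"HEAD").hexdigest()
store.chunks[shared_id] = b"XXXX"

damaged = affected_files(backups, shared_id)
print(len(store.chunks), "unique chunks stored")
print(len(damaged), "files damaged:", damaged)
```

Four files were backed up over two days, only four unique chunks were stored, and breaking one of them damages all four files in both backups.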

It also has an integrity check, but I don't know whether it checks this. I also read that an integrity check on a big dataset could take too much time.
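
As a rough idea of why such a check is slow, here is a sketch that assumes chunks are stored under the SHA-256 of their content. The `verify` function is hypothetical and I have not verified that borg works this way. The point is only that every stored chunk has to be read and hashed again, so the time grows with the size of the whole repository.

```python
import hashlib


def verify(chunks):
    """Return the ids of chunks whose content no longer matches their id."""
    return [
        cid
        for cid, data in chunks.items()
        if hashlib.sha256(data).hexdigest() != cid  # re-read and re-hash
    ]


good = b"some chunk data"
other = b"another chunk"
chunks = {
    hashlib.sha256(good).hexdigest(): good,
    hashlib.sha256(other).hexdigest(): other,
}

clean_result = verify(chunks)  # nothing is damaged yet

# Simulate a flipped byte in one stored chunk.
bad_id = hashlib.sha256(other).hexdigest()
chunks[bad_id] = b"anothex chunk"

corrupted = verify(chunks)
print("corrupted chunk ids:", corrupted)
```

A check like this can tell you that a chunk is broken, but it cannot repair it unless another good copy exists somewhere.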

In my mind a backup is a copy of a file taken in one time window; if needed, another copy from another time window can be picked, but it should not be a reference to a previous copy. Today there are still people who make backups on tape (expensive) for reliability. I run backups on disks. Disks are cheap, so compression (which costs time in backup and restore) and deduplication (which adds complexity) are not needed for me, and they don't really matter for my free disk space because I can add a disk.

Rsnapshot uses hardlinks, which is similar: an unchanged file is stored once and every snapshot links to that same copy.
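
The hardlink case can be tried without rsnapshot at all. This is only a sketch with made-up directory and file names in a temporary directory, and it assumes a filesystem that supports hardlinks: two "snapshots" share one file, and damage written in place through one name shows up under the other.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Two snapshot directories (names are illustrative).
    snap0 = os.path.join(root, "daily.0")
    snap1 = os.path.join(root, "daily.1")
    os.mkdir(snap0)
    os.mkdir(snap1)

    older = os.path.join(snap1, "report.txt")
    newer = os.path.join(snap0, "report.txt")
    with open(older, "w") as f:
        f.write("good data")

    # The file did not change, so the newer snapshot hardlinks to it.
    os.link(older, newer)

    same_inode = os.stat(older).st_ino == os.stat(newer).st_ino
    link_count = os.stat(older).st_nlink

    # Damage the data in place through the newer snapshot's name.
    with open(newer, "r+") as f:
        f.write("BAD!")

    # Read it back through the older snapshot's name.
    with open(older) as f:
        older_content = f.read()

print("same inode:", same_inode, "links:", link_count)
print("older snapshot now contains:", repr(older_content))
```

Both names point at the same data on disk, so there is still only one real copy of the file no matter how many snapshots list it.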

All these solutions are valid if they fit your needs. You must decide how important the data inside your backups is and whether losing a deduped chunk could damage your backup dataset across the timeline.

Ah, if you have multiple servers to back up, I prefer bacula because it can pull data from the hosts and can back up multiple servers from the same point (maybe using a separate bacula-sd daemon with dedicated storage for each client).

