
Re: Backup Times on a Linux desktop



On Mon, 4 Nov 2019 06:01:54 -1000
Joel Roth <joelz@pobox.com> wrote:

> These days I use rsync with the --link-dest option to make
> complete Time-Machine(tm) style backups using hardlinks to
> avoid file duplication in the common case.  In this
> scenario, the top-level directory is typically named based
> on date and time, e.g. back-2019.11.04-05:32:06.

Take a look at rsnapshot. You have pretty well described it.


> 
> I usually make backups while the system is running, although
> I'm not sure it's considered kosher. It takes around 10% of
> CPU on my i5 system.

It's kosher except in a few places where referential integrity is an
issue. The classic case is a database whose data extends across
multiple files, which describes almost all of them.

Here that means keeping the data consistent across files. Suppose you
send an INSERT statement to a SQL database, and it affects multiple
files. The database writes to the first file. Then your backup comes
along and grabs the files for backup. Then your database writes the
other files. Your backups are broken, and you won't know it until you
restore and test.

There are work-arounds. Shut the database down during backups, or make
it read-only during backups. Or tell it to accept writes from clients
but not actually write them out to the files until the backup is over.

Obviously this requires some sort of co-ordination between the backup
software and the software maintaining the files.
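
To make that co-ordination concrete, here is a minimal sketch in Python.
It assumes a home-grown application on Linux, not a real database
server: both the writer and the backup take the same advisory lock
(fcntl.flock), so the backup can never copy the files between the first
and second write. The file names (table.dat, index.dat, .lock) and the
functions are hypothetical, made up for illustration.

```python
import fcntl
import shutil
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until no one else holds it
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def write_record(data_dir, key, value):
    """Hypothetical application write that touches two files under the lock."""
    data_dir = Path(data_dir)
    with exclusive_lock(data_dir / ".lock"):
        with open(data_dir / "table.dat", "a") as f:
            f.write(f"{key}={value}\n")
        with open(data_dir / "index.dat", "a") as f:
            f.write(f"{key}\n")


def backup(data_dir, dest_dir):
    """Copy the data directory while holding the same lock as the writer."""
    data_dir = Path(data_dir)
    with exclusive_lock(data_dir / ".lock"):
        # dest_dir must not exist yet; copytree creates it.
        shutil.copytree(data_dir, dest_dir)
```

Advisory locks only work if every program that touches the files agrees
to take them, which is exactly the co-ordination problem described
above. Real database servers ship their own dump or hot-backup tools for
the same purpose.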

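Or use SQLite, which keeps each database in a single file. I believe
that largely avoids this issue, though a copy taken in the middle of a
write can still miss the journal file that goes with it, so SQLite's own
backup facility is the safer route.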
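
For SQLite specifically, one way to get a consistent copy while the
database is in use is its online backup API, which Python (3.7 and
later) exposes as sqlite3.Connection.backup. A minimal sketch; the
function name and paths are hypothetical:

```python
import sqlite3


def snapshot_sqlite(src_path, dest_path):
    """Copy a SQLite database to dest_path using the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Copies the whole database as a consistent snapshot, even if
        # other connections are writing to the source in the meantime.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

The file at dest_path can then be picked up by rsync or rsnapshot like
any other file.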


-- 
Does anybody read signatures any more?

https://charlescurley.com
https://charlescurley.com/blog/

