
Re: Home made backup system



On Wed, 18 Dec 2019 12:02:56 -0500
rhkramer@gmail.com wrote:

> Aside / Admission: I don't backup all that I should and as often as I should, 
> so I'm looking for ways to improve.  One thought I have is to write my own 
> backup "system" and use it, and I've thought about that a little, and provide 
> some of my thoughts below.
> 
> A purpose of sending this to the mailing-list is to find out if there already 
> exists a solution (or parts of a solution) close to what I'm thinking about 
> (no sense re-inventing the wheel), or if someone thinks I've overlooked 
> something or making a big mistake.

There are certainly tools that do at least most of what you want. For
example, I use rsnapshot, basically a front-end to rsync that is
designed to harness rsync's power to streamline the taking of
incremental backups.

...

>    * the backups should be in formats such that I can access them by a variety 
> of other tools (as appropriate) if I need to -- if I backup an entire 
> directory or partition, I should be able to easily access and restore any 
> particular file from within that backup, and do so even if encrypted (i.e., 
> encryption would be done by "standard programs" (a bad example might be 
> ccrypt) that I could use "outside" of the backup system.

rsnapshot uses rsync + hardlinks to recreate the portions of
the filesystem that you want to back up (source) to wherever you tell it
to (target). That recreated filesystem can be accessed in any way that
the original filesystem can - no special tools are required for access
or recovery.

>    * the bash subroutine (command) that I write should basically do the 
> following:
> 
>       * check that the specified target exists (for things like removable 
> drives or NAS type things) and has (sufficient) space (not sure I can tell that 

rsnapshot does have a check for target availability. I don't think it
can check for sufficient space before initiating a backup - as you note,
it's a tricky thing to do - but it does have a 'du' option to report on
the target's current level of usage.
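
A pre-flight check of the kind asked about could be sketched roughly as
follows. This is not something rsnapshot provides: the function name
'target_ready' and the idea of passing a minimum free-space threshold in
KiB are assumptions made for illustration, and free space measured before
the run only approximates what the backup will actually need.

```shell
#!/bin/sh

# Hypothetical helper (not part of rsnapshot): succeed only if the target
# directory $1 exists and has at least $2 KiB available.
# For removable drives, 'mountpoint -q "$1"' (util-linux) could be added
# to confirm that something is actually mounted there.
target_ready() {
	target=$1
	min_kib=$2

	if [ ! -d "$target" ]
	then
		echo "target '$target' does not exist" >&2
		return 1
	fi

	# 'df -Pk' prints one POSIX-format line per filesystem in 1024-byte
	# units; field 4 of the second line is the available space.
	avail_kib=$(df -Pk "$target" | awk 'NR == 2 { print $4 }')

	if [ "$avail_kib" -lt "$min_kib" ]
	then
		echo "target '$target' has only $avail_kib KiB free" >&2
		return 1
	fi
}
```

A wrapper script could then call something like
'target_ready /mnt/backup 1048576 || exit 1' before invoking the backup,
where the path and the threshold are again placeholders.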

> until after backup is attempted) (or an encrypted drive that is not mounted / 
> unencrypted, i.e., available to write to)

>       * if the right conditions don't exist (above) tell me (I'm thinking of 
> an email as email is something that always gets my attention, maybe not 
> immediately, but soon enough)

rsnapshot will fail with an error code if something is wrong - assuming
you run it from cron, cron will email the error message.

>       * if the right conditions do exist, invoke the commands to backup the 
> files
> 
>       * if the backup is unsuccessful for any reason, notify me (email again)

As above.

>       * optionally notify me that the backup was successful (at least to the 
> extent of writing something)

By default rsnapshot prints nothing to stdout upon success (although
it does have a 'verbose' option), but it does log a 'success' message to
syslog, which you can keep an eye on with a log analyzer (something
like logwatch). Alternatively, I just reconfigured my rsnapshot
deployment to run rsnapshot through the wrapper below, which results in
a notification for success but not for failure. (rsnapshot pulls
backups from the source, and in my case the laptop it's backing up is
often not present, so I would otherwise be flooded with unnecessary
failure notices.)

*****

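#!/bin/sh

# usage 'rsnapshot-script x', where 'x' is a backup interval defined in the
# rsnapshot configuration file

# Only proceed if the source host ('lila') accepts connections on the SSH
# port; if it is absent, exit quietly so that cron has nothing to mail.
if nc -z lila 22 2>/dev/null
then
	echo "Running 'rsnapshot $1' ..."
	# Quote the interval so it is passed to rsnapshot as a single argument.
	if rsnapshot "$1"
	then echo Success
	fi
fi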

*****

>       * optionally actually do something to confirm that the backup is readable 
> / usable (need to think about what that could be -- maybe write it (to /tmp or 
> to a ramdrive), do something like a checksum (e.g., sha-256 or whatever makes 
> sense) on it and the original file, and confirm they match

rsnapshot has a hook system that allows you to add commands to be run
by it.
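
The checksum comparison described in the quoted text could be sketched as
a small shell function like the one below, which a hook script might call
for a sample of files. The name 'same_checksum' is made up for
illustration, 'sha256sum' is assumed to be available (it is part of GNU
coreutils), and which rsnapshot hook to attach it to (presumably the
'cmd_postexec' setting) is an assumption to be checked against the
rsnapshot documentation.

```shell
#!/bin/sh

# Hypothetical helper: succeed only if the source file $1 and its backup
# copy $2 are both readable and have the same SHA-256 checksum.
same_checksum() {
	# Reading via stdin keeps the file name out of sha256sum's output,
	# so the two results can be compared as plain strings.
	src_sum=$(sha256sum < "$1") || return 1
	dst_sum=$(sha256sum < "$2") || return 1
	[ "$src_sum" = "$dst_sum" ]
}
```

Since rsnapshot's target is an ordinary directory tree, the second
argument can simply be the path of the file inside the most recent
snapshot directory.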

>       * ???
> 
> All of the commands invoked by the script should be parameters so that the 
> commands can be easily changed in the future (e.g., cp / tar / rsync, sha-256 
> or whatever, ccrypt or whatever, etc.) 

rsnapshot has configuration options 'cmd_cp', 'cmd_rm', 'cmd_rsync',
'cmd_ssh', 'cmd_logger', 'cmd_du' to do exactly that.

> Then the master script (actually probably scripts, e.g. one or more each for 
> hourly, daily, weekly, ... backups) would be invoked by cron (or maybe include 
> the at command? --my computers run 24/7 unless they crash, but for others, at 
> or something similar might be a better choice) would invoke that subroutine / 
> command for each file, directory, or partition to be backed up, specifying the 
> commands to use, what files to backup, where to back them up, encrypted or not, 
> compressed or not, tarred or not, etc.

rsnapshot does all this, via coordination with its configuration file
and cron.

> In other words, instead of a configuration file, the system would just use bash 
> scripts with the appropriate commands, and invoked at the appropriate time by 
> cron (or with all backup commands in one script with backup times specified 
> with at or similar).

I much prefer a configuration file with declarative syntax, like the
one rsnapshot uses, to hardcoding stuff into scripts, but I'm no expert,
and you are certainly entitled to your own preferences.

Celejar

