
Re: Home made backup system



On 2019-12-18 09:02, rhkramer@gmail.com wrote:
Aside / admission: I don't back up everything I should, or as often as I should,
so I'm looking for ways to improve.  One thought I have is to write my own
backup "system" and use it; I've thought about that a little, and I provide
some of my thoughts below.

A purpose of sending this to the mailing list is to find out whether there already
exists a solution (or parts of a solution) close to what I'm thinking about
(no sense re-inventing the wheel), or whether someone thinks I've overlooked
something or am making a big mistake.

Part of the reason for doing my own is that I don't want to be trapped into
using a system that might disappear or change and leave me with a problem.  (I
subscribe to a mailing list for one particular backup system, and I wrote to
that list with my concerns and a little of my thinking about my own system.  At
the time, I was hoping for a "universal" configuration file -- a file that
would specify what, where, when, and how each file, directory, or partition to
be backed up would be treated -- one that could be read and acted upon by a
great variety of backup programs, and maybe all future ones.)

The only response I got (IIRC) was that since their program was open source,
it would never go away.  (Yet, if I'm not mixing up backup programs, they were
transitioning from Python 2 to Python 3 as the underlying language --
I'm not sure Python 2 will ever completely go away or become non-functional,
but it reinforces my belief / fear that any complex backup program, even
open source, could someday become unusable.)

So, here are my thoughts:

After I thought about (hoped for) a universal config file for backup programs,
and since it seems no such thing exists (not surprising), I thought I'd try
to create my own.  This morning, as I thought about it a little more (despite
a headache and a non-working car that I should be working on), it struck me that
the simplest thing for me to do is write a bash script and a bash subroutine,
something along these lines:

    * the backups should be in formats such that I can access them with a variety
of other tools (as appropriate) if I need to -- if I back up an entire
directory or partition, I should be able to easily access and restore any
particular file from within that backup, and do so even if it is encrypted (i.e.,
encryption would be done by "standard programs" (a bad example might be
ccrypt) that I could use outside of the backup system)

    * the bash subroutine (command) that I write should basically do the
following:

       * check that the specified target exists (for things like removable
drives or NAS devices) and has sufficient space (I'm not sure I can tell that
until after the backup is attempted), or, for an encrypted drive, that it is
mounted / decrypted, i.e., available to write to

       * if the right conditions (above) don't exist, tell me (I'm thinking of
email, since email is something that always gets my attention -- maybe not
immediately, but soon enough)

       * if the right conditions do exist, invoke the commands to back up the
files

       * if the backup is unsuccessful for any reason, notify me (email again)

       * optionally notify me that the backup was successful (at least to the
extent of writing something)

       * optionally, actually do something to confirm that the backup is readable
/ usable (I need to think about what that could be -- maybe restore it (to /tmp or
to a ramdisk), compute something like a checksum (e.g., SHA-256 or whatever makes
sense) on it and the original file, and confirm they match)

       * ???

All of the commands invoked by the script should be parameters so that the
commands can be easily changed in the future (e.g., cp / tar / rsync, SHA-256
or whatever, ccrypt or whatever, etc.).
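
Something like the following rough, untested sketch (the mail address, file
layout, and tool choices are just placeholders):

    # do_backup SOURCE TARGET_DIR [ARCHIVE_CMD]
    # every external tool is a parameter / variable so it can be swapped later
    do_backup () {
        local src="$1" dest="$2" archive="${3:-tar czf}"
        local out="$dest/$(basename "$src")-$(date +%Y%m%d-%H%M%S).tar.gz"

        # check that the target exists and is writable
        # (a real version should also check free space, e.g. with df)
        if [ ! -d "$dest" ] || [ ! -w "$dest" ]; then
            echo "backup target $dest missing or not writable" \
                | mail -s "backup FAILED: $src" me@example.com
            return 1
        fi

        # run the (parameterized) backup command, capturing errors
        if ! $archive "$out" "$src" 2>"$out.err"; then
            mail -s "backup FAILED: $src" me@example.com < "$out.err"
            return 1
        fi

        # optionally confirm that the result is at least readable;
        # an encryption step (ccrypt or whatever) could also go here
        if ! tar tzf "$out" >/dev/null 2>&1; then
            echo "$out is not readable" \
                | mail -s "backup verify FAILED: $src" me@example.com
            return 1
        fi

        # optional success notice
        echo "$src backed up to $out" | mail -s "backup OK: $src" me@example.com
    }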

Then the master script (actually, probably scripts -- e.g., one or more each for
hourly, daily, weekly, ... backups), invoked by cron (or maybe using the at
command? -- my computers run 24/7 unless they crash, but for others, at
or something similar might be a better choice), would invoke that subroutine /
command for each file, directory, or partition to be backed up, specifying the
commands to use, what files to back up, where to back them up, encrypted or not,
compressed or not, tarred or not, etc.

In other words, instead of a configuration file, the system would just use bash
scripts with the appropriate commands, invoked at the appropriate times by
cron (or with all backup commands in one script and backup times specified
with at or similar).
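
The master scripts would then be little more than lists of calls to that
subroutine, with crontab entries to run them -- roughly (paths, sources, and
times made up purely for illustration):

    #!/bin/bash
    # daily-backups.sh -- one line per thing to back up
    . /usr/local/lib/do_backup.sh        # the subroutine sketched above

    do_backup /home/rhk/documents /mnt/nas/daily
    do_backup /etc                /mnt/nas/daily
    do_backup /var/www            /mnt/usb/daily

and the crontab would contain entries along the lines of:

    0 *  * * *   /usr/local/bin/hourly-backups.sh
    30 1 * * *   /usr/local/bin/daily-backups.sh
    0 3  * * 0   /usr/local/bin/weekly-backups.sh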

Aside: even if Amanda (for example) will always exist, I don't really want to
learn it, or any other program, that might cease to be
maintained in the future.

I wrote and use a homebrew backup and archive solution that started with a Perl script to invoke rsync (backup) and tar/ gzip (archive) over ssh from a central server according to configurable job files. My thinking was:

1. Use lowest-common-denominator tools for backups and archives -- e.g. tar, gzip, rsync:

a. This allows use with the widest range of platforms -- my clients include GNU/Linux, FreeBSD, Windows/Cygwin, and macOS.

b. Backup and archive contents are self-describing (live files and tar/ gzip archives).

c. The tools, and the backup and archive files, should be supported indefinitely.

2. Use ssh on the server to pull content from the clients, and lock down sshd on the server, so that if a client is compromised, the backups and archives are not readily accessible.
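
Conceptually, each pull job boils down to something like the following (host names, users, and paths are invented here for illustration); the lock-down on the server side is just the usual sshd_config tightening (key-only logins, a short AllowUsers list, and so on):

    # on the backup server: pull a client's /home into the backup tree
    rsync -a --delete \
          --backup --backup-dir=/backup/clients/alice/deleted \
          -e ssh \
          backup@alice.example.com:/home/ \
          /backup/clients/alice/home/

    # archives are similar, but with tar/gzip run over ssh
    ssh backup@alice.example.com 'tar czf - /home' \
        > /backup/archives/alice/home-$(date +%Y%m%d).tar.gz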


The script works, and automates what I was doing manually. It gives me ease of use and consistency.


But there are drawbacks:

1. Automating tar, gzip, and rsync was the tip of the iceberg. Additional automation was needed -- run all the daily backups, run all the daily archives, run all the weekly archives, move all archives for a given month into a month-year-stamped directory, tar and ccencrypt that directory, burn the encrypted tarballs to optical media, replicate the backup/ archive disk to other disk(s) for near-site and off-site rotation, etc. My homebrew solution has grown into a suite of scripts and a repertoire of administration tasks.

2. The backups are only one and a fraction levels deep (one complete backup and a sparse tree of deleted files). This is the result of simplistic use of rsync's --delete and --backup-dir options. Using time-stamped --backup-dir directories would provide multiple levels of deletions, but recovery would be messier (likely requiring yet another script). I have yet to investigate rsync's --link-dest option (I believe rsnapshot [1] uses this); see the rough sketch after this list.

3. No backup/ archive metadata database of who did what, from where, to where, when, how (command, arguments, output, errors), and why (cron, manual). But, the tar/ gzip/ rsync script captures stdout and stderr, and writes a job-date-time-stamped log file on each job run.

4. While all the code is in a version control system, I am loath to touch it. The code is old, my programming style has evolved considerably, and there is no functional specification, no design specification, no test suite, no issue tracking, etc. Everything was, and must be, designed, constructed, tested, and documented by hand.
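
Regarding item 2: the --link-dest approach, as I understand it, gives each run its own time-stamped directory, with unchanged files hard-linked back to the previous run, so every snapshot looks complete while only changed files consume additional space. An untested sketch, with host names and paths invented:

    today=$(date +%Y%m%d)
    rsync -a --delete \
          --link-dest=/backup/alice/latest \
          backup@alice.example.com:/home/ \
          /backup/alice/$today/
    ln -sfn /backup/alice/$today /backup/alice/latest

Recovery is then just copying files out of the relevant dated directory.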


I have been wanting to migrate to a better backup/ archive solution for years.


Recently, I migrated my SOHO file server to FreeBSD and ZFS. I then implemented zfs-auto-snapshot. This was a huge improvement. I should be able to do the same for /home on Debian (or perhaps there is a btrfs equivalent?). macOS -- perhaps. Windows -- I doubt it.
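
In essence, those snapshot tools automate commands like the following (pool and subvolume names invented; for btrfs, /home would need to be a subvolume):

    # ZFS (what zfs-auto-snapshot does on a schedule, more or less):
    zfs snapshot tank/home@auto-$(date +%Y-%m-%d-%H%M)

    # a rough btrfs counterpart, run from cron:
    btrfs subvolume snapshot -r /home /home/.snapshots/$(date +%Y-%m-%d-%H%M)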


"Backup & Recovery" by Preston is a worthwhile read [2]. It gets you thinking about the many and inter-related issues.


Understand that there is no single solution for disaster preparedness/ disaster recovery. Each tool, strategy, etc., has its strengths and weaknesses. Each administrator must tailor an overall plan that balances risk, effort, and investment. I like having diversity -- multiple platforms, multiple technologies, multiple tools, multiple geographic locations, etc. -- and I like having redundancy -- multiple computers, multiple disks, multiple media, etc.


David

p.s. Keeping working files in a version control system can provide certain backup and sneakernet features.
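
For example, a bare repository on another machine, pushed to as part of the daily routine (names invented):

    # one-time setup on the backup host
    ssh backup@server.example.com 'git init --bare /backup/repos/notes.git'

    # in the working copy
    git remote add backup backup@server.example.com:/backup/repos/notes.git
    git push backup main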

References:

[1] https://www.tecmint.com/rsnapshot-a-file-system-backup-utility-for-linux/

[2] http://shop.oreilly.com/product/9780596102463.do

