
Re: incremental backups howto?



On Fri, Dec 24, 2004 at 11:40:10PM -0500, Adam Aube wrote:
> Joao Clemente wrote:
> 
> > In the latest thread about "Synchronize two servers" it was
> > talked about incremental backups. Well, can you quick-start me
> > in this topic?
> 
> An incremental backup is done by backing up all files that have changed
> since the last full or incremental backup. How this file list is tracked
> depends on the backup program used (some might use a filesystem flag,
> others might use modification timestamps).
> 
> The downside of incremental backups is that, to do a full restore, you need
> the last full backup and ALL the incremental backups since the last full.
> 
> A better alternative is a differential backup, which is all files that have
> changed since the last full backup. This is much easier to restore, because
> all you need is the last full backup and the last differential backup.
> 
> > What do you say?
> 
> Another interesting approach is that taken by tools such as dirvish or
> rsnapshot. Both of these tools use rsync to capture snapshots of a
> filesystem (either local or remote) to disk. Within the backup archives,
> files that have not changed between snapshots are hard linked.
> 
> This gives the completeness and ease of restoration of full backups without
> requiring nearly as much space to store data. To restore, just copy back
> the desired snapshot.
> 
> This is all for general filesystem backup. For databases, check the
> documentation to see what the recommended backup method is.
> 
> Adam
> 
I use something like this for my own backups. I have a large number of
files on a server which I keep backed up on another machine (the backups
have saved my ass more than once). Recent versions of rsync can automatically
hard-link unchanged files to the previous snapshot (via --link-dest), which
saves a lot of space.
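For anyone who hasn't seen --link-dest in action, here is a minimal
self-contained sketch of the behaviour (the temp paths are throwaway ones of
my own invention, not part of my setup): the second rsync run hard-links any
file that is unchanged relative to the previous snapshot.

```shell
# Sketch: unchanged files in the new snapshot become hard links into
# the old one, so the second "full" copy costs almost nothing.
tmp=$(mktemp -d)
mkdir "$tmp/src"
echo "unchanged" > "$tmp/src/file.txt"
# First snapshot: a normal full copy.
rsync -a "$tmp/src/" "$tmp/snap1/"
# Second snapshot: identical files are hard-linked against snap1.
rsync -a --link-dest="$tmp/snap1" "$tmp/src/" "$tmp/snap2/"
# Both snapshots now name the same inode.
ls -i "$tmp/snap1/file.txt" "$tmp/snap2/file.txt"
rm -rf "$tmp"
```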

What I have done is set up a special backup account on one machine and a
passwordless ssh key that gives that machine access to the server. Obviously
there are security issues with that, but I'm taking a calculated risk, since
I want the backups to run from cron. There are ways to restrict the backup
user's key to specific commands, and Google can help with that.
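One common way to do that (my own addition, not something from the original
thread; the key and paths below are hypothetical) is to force a command on
the key in the server's authorized_keys file. The rrsync script shipped in
rsync's support/ directory restricts a key to rsync transfers of a single
directory:

```
# ~/.ssh/authorized_keys on the server (hypothetical key and paths):
# the key may only run a read-only rsync rooted at /home/user, with
# the usual ssh extras disabled.
command="/usr/local/bin/rrsync -ro /home/user",no-port-forwarding,no-agent-forwarding,no-pty,no-X11-forwarding ssh-rsa AAAA...backup-key... backup@backuphost
```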

I wrote a fairly simple bash script that backs up my home folder on the
server to a folder named after the server name and the backup date. The
script runs from cron every day and keeps one week's worth of daily folders.
It creates weekly backups as well, keeping a certain number of those, and
monthly folders likewise. Since it uses hard links, the whole archive takes
only about 10% more space than any single snapshot, but lets me step back a
number of days to fix something that got broken (last time it was a heavily
customized php file that I foolishly overwrote).

The script follows. I hope someone finds it useful.

#!/bin/sh

# Incremental backup script, based on rsync, that syncs files on the server
# to this machine. One sync is made every night; incrementals are handled
# with hard links to unchanged files.
#
# Once a week the newest daily snapshot becomes the newest weekly snapshot,
# and once a month the newest weekly snapshot becomes the newest monthly
# snapshot, and we will hold three months of backups. Because of the hardlinks
# we should be able to keep increments without wasting much more space than
# a simple full backup would already take.

# Start by setting variables: current month, dead month, current week,
# dead week, current day, and dead day
# (%G is the ISO week-based year, so it only pairs with %V; the month- and
# day-based names use the ordinary calendar year %Y, which can differ from
# %G around New Year.)
MONTH=server.monthly.`date +%Y-%m`
WEEK=server.weekly.`date +%G-%V`
DAY=server.daily.`date +%Y-%m-%d`
YESTERDAY=server.daily.`date -d -1day +%Y-%m-%d`
DEADMONTH=server.monthly.`date -d -3month +%Y-%m`
DEADWEEK=server.weekly.`date -d -4week +%G-%V`
DEADDAY=server.daily.`date -d -7day +%Y-%m-%d`

# Rotate the daily backup files. Start by tossing the latest dead file,
# and then create the latest backup with rsync, using hard links.
# This happens every day
if [ -d /home/backup/$DEADDAY ]; then
   rm -rf /home/backup/$DEADDAY
fi
rsync -plrtvz --delete --rsh='ssh -c blowfish' --ignore-errors --stats \
   --progress --link-dest=/home/backup/$YESTERDAY \
   user@server.com:/home/user/ /home/backup/$DAY/

# Check if it is Saturday. If so, rotate the weekly backups. Start by tossing
# the latest dead file, and then copy the latest daily snapshot to the
# weekly snapshot file
if [ `date +%u` = 6 ]; then
   if [ -d /home/backup/$DEADWEEK ]; then
      rm -rf /home/backup/$DEADWEEK
   fi
   cp -al /home/backup/$DAY /home/backup/$WEEK
fi

# Check if it is the first of the month. If so, rotate the monthly backups.
# Start by tossing the latest dead file, and then copy the latest daily
# Snapshot to the monthly snapshot file.
if [ `date +%d` = 1 ]; then
   if [ -d /home/backup/$DEADMONTH ]; then
      rm -rf /home/backup/$DEADMONTH
   fi
   cp -al /home/backup/$DAY /home/backup/$MONTH
fi
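A note on the cp -al rotations above (this little demo is my own
illustration, with throwaway temp paths): cp -al copies only the directory
entries, not the file data, so promoting a daily snapshot to a weekly or
monthly one is nearly free.

```shell
# Sketch: cp -al duplicates the tree but hard-links every file, so the
# "copy" shares inodes with the original.
tmp=$(mktemp -d)
mkdir "$tmp/daily"
echo "data" > "$tmp/daily/file.txt"
cp -al "$tmp/daily" "$tmp/weekly"
# Both names now refer to one inode; the link count is 2.
stat -c %h "$tmp/weekly/file.txt"
rm -rf "$tmp"
```

This is also why the script lets rsync build each new daily snapshot instead
of editing files in place: rsync writes changed files to a new inode and
renames it over the old name, so the hard links in older snapshots keep
their old contents.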
