Re: simple backup script
I've read the other posts, but am starting back with your original to
keep your full questions.
Marcus Schopen wrote:
> Hi,
>
> I'm looking for a simple backup script, which uses e.g. dd and
> additionally does some error handling and mail notification. I use
> amanda for my daily and weekly backups, but to feel more secure, I
> installed a second hard drive in my server today. Now I'm looking for
> a nice and secure script which does a full backup of the first hard
> disk each day.
I guess if your 2nd hard disk is the same size or larger than the first
and you want an exact copy of everything then dd might be the way to do
it. Then if hda dies you could plug in hdb in its place and be going as
of yesterday. I don't know what your goals are here. If it was just to
save yourself from a disk crash, then the RAID options suggested seemed
like a good idea. If it is to provide yet another backup because you
don't trust your amanda backups or just because you want to script
something, then I'd suggest a script using tar. You didn't mention dump,
and that's probably a good thing:
http://lwn.net/2001/0503/a/lt-dump.php3
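If you do go the dd route, the basic invocation is a one-liner. A sketch, assuming hda is the source disk and hdb is the same-size-or-larger target; double-check the device names before running, since dd will happily overwrite the wrong disk:

```shell
# Copy the whole first disk onto the second, 64 KB at a time.
# conv=noerror,sync keeps going past read errors and pads bad
# blocks with zeros so the copy stays aligned.
dd if=/dev/hda of=/dev/hdb bs=64k conv=noerror,sync
```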
> Secondly I'd like to do a full backup of a remote server, but I'm not
> sure what's the best way to do e.g. a dd over the local network. Maybe
> the above script could be used for both.
>
> Thanks,
> Marcus
Ok, are you backing up the remote server to this new hard disk, or are
you backing up the server with the new hard disk to a remote location?
I'm going to run with this idea: you've got a new hard disk that you'd
like to use for a second backup system in your network, duplicating the
effort of Amanda. (Although, if Amanda has the storage capacity to
back up everything, couldn't you just back up its backups?) You would
like to see a script that does this.
I write my own backup scripts using Perl to call GNU tar. There's a lot
you can do with GNU tar, and even their documentation recommends using a
script to do your backup and restore operations rather than calling it
by hand. You can get GNU tar to do full (of course), incremental, and
differential backups (others too, but they seem to fit within those
three to me).
Let's say you're just backing up /home (expand this to whatever you
want, but if you really just want a full copy of your hard disk, maybe
just use dd) and you want to keep all your changes so you can recover it
as it was two weeks or two months ago (assuming you have been running
the backup process for that long).
First you need to make a 'full dump' backup of /home. Run tar with the
--listed-incremental option, but the snapshot file you specify shouldn't
exist yet. This snapshot file will provide the baseline for future backups.
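The baseline run might look like this (the archive and snapshot names here are just placeholders, not anything from your setup):

```shell
# Baseline ("level 0") dump. home.full.ssf must not exist yet:
# tar creates it and records file state there, so later runs
# can archive only what changed since this one.
tar -cjf home-full.tbz -g home.full.ssf -C / home
```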
Next you decide if you are going to be creating incremental or
differential backups off of this base backup. Incremental means less
archive space used, but to restore from scratch you need every
incremental stop point. For example, if you started in January and are
doing monthly incrementals, then you would need to extract January (the
full), Feb, Mar, Apr, May, Jun, Jul, and August to get to September,
then the latest daily incremental off of the August dump. Doing
differential backups means more storage space, but you need only three
extractions (under the method I will propose) to fully recover.
Here is the backup 'game plan' I am proposing for archiving disk A to
disk B which is much much larger than the data you are archiving from
disk A.
Disclaimers: My 'game plan' does not address off-site storage, which is
very important. It is as 'use at your own risk' as I can make it, no
warranty. It doesn't ask you questions like "What are you backing up to
protect yourself against, anyway?" It doesn't address file security and
protection. It's just an answer to your question, except that it might
not be as simple as you were asking.
I am not addressing the remote backup, but there ought to be a way to do
this with scp/ssh and add it to the script, or to a separate script. I'm
interested in hearing how others set that up because I may do it some day.
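For what it's worth, one approach that ought to work (I haven't tested it; remotehost and the target path are placeholders) is to have tar write the archive to stdout and pipe it over ssh instead of to the local disk:

```shell
# Stream a bzip2'd tar of /home to a file on the remote machine.
# Swap the sides to pull a remote tree down to the local disk.
tar -cjf - -C /home . | ssh remotehost "cat > /archive/home-backup.tbz"
```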
Before you look at my script below, I encourage you to check out Adump
or Flexbackup:
Adump - http://www.davidb.org/adump/
Flexbackup - http://flexbackup.sourceforge.net/
I do a full dump at least once a year. You can do it more often if you
like/have space, but you'll have to move the old full dump archive's
snapshot file out of the way so my script will think one hasn't been
done. I do monthly differential backups off of the "yearly" full backup.
I do daily differential backups off of the latest monthly backup and
archive Friday's daily differential as the 'weekly' backup, keeping
four of these on a rotating basis.
This setup allows me to:
- Restore from "bare metal" back to the state of the system at 4:00am
  (or whenever your cron.daily runs) for whatever I am archiving, with
  at most three unarchive operations.
- Restore or recover files from yesterday, or as they sat for the last
  four "Fridays" (Saturday morning at 4:00am).
- Restore or recover files as they sat at the end of each month, for as
  long as I have been making monthly differential backups.
- And of course, restore or recover files as they sat the last time I
  made a full backup.
I run this script out of /etc/cron.daily. You need to change
$arpath = "/archive" to point to your backup location (the mount point
of the second hard disk, in your case).
---------------------- backup.cron -----------------------------------
#!/usr/bin/perl
# /etc/cron.daily/backup.cron
use Cwd 'chdir';
#backup runs 4am so we call the run 'yesterday'
my $date;
$date = `date --date yesterday +'%b-%d-%w-%Y'`;
$date =~ s/\r?\n$//;
my ($month,$day,$weekday,$year)=split('-',$date);
my $hostname = `hostname`;
chomp($hostname);
$arpath = "/archive";
chdir($arpath) or die "Unable to chdir to $arpath\n";
`echo $date > lastbackup.txt`;
# Array to call the archive made the first day of a new month
# the cumulative archive for the previous month
# In other words, July's archive is made the first of August
my %archive_months = (
    'Jan'=>'Dec', 'Feb'=>'Jan', 'Mar'=>'Feb', 'Apr'=>'Mar',
    'May'=>'Apr', 'Jun'=>'May', 'Jul'=>'Jun', 'Aug'=>'Jul',
    'Sep'=>'Aug', 'Oct'=>'Sep', 'Nov'=>'Oct', 'Dec'=>'Nov',
);
my $current_month = $month;
my $current_year = $year;
# Shift month because on July 2nd we make the June archive.
$month = $archive_months{$current_month};
# Month has been shifted and date is always yesterday,
# so $month will be 'Dec' when the archive is run the morning
# of 2 Jan <year>. But the Dec archive is for <year - 1>, so
# we need to decrement year.
if($month eq 'Dec') { $year--; }
# $base is the archive base, used for
# tar -cf <tarname> -C <base> <folder>
# this avoids the 'warning, removing leading '/'
# from archive members and shortens the path
# depth when restoring into a local directory to
# recover a file
$base = "/home";
# @list is a list of files and directories in $base
# (ie all the user directories in /home)
@list = <$base/*>;
# $skip is a pipe-delimited list of files or directories in $base
# that you don't want to archive. It can be empty ""
$skip = "public|projects";
#archive /home except the above skip list
archive_data($base,\@list,$skip);
# If you want to back up more than just /home
# (like /etc or /var/spool/mail or /usr/local)
# Then just repeat this code:
# $base = "<archive base>";
# @list = <$base/*>;
# $skip = "this|that|theother";
# archive_data($base,\@list,$skip);
# sub archive_data takes an archive base, a reference to a
# list of files or folders in that path to archive, excluding
# the $skip pipe-separated list and non-directory entries.
# In other words, don't try to archive just /etc/hosts. This
# is coded to archive directories of files, not individual
# files.
sub archive_data {
my ($base,$list,$skip) = @_;
foreach $ar ( @{$list} )
{
if($skip ne '') {
next if ( $ar =~ /($skip)/ );
}
# archives directories of files, not just files
next unless ( -d $ar );
$ar =~ s#$base/##;
# We have changed months, time to create the backup for the end of
# the previous month
if( not -e "${ar}.${month}${year}.ssf" )
{
my $volume_label = '';
my $full_backup_missing = 1;
# All monthly differentials are done off of $ar.${year}-full.ssf,
# if it doesn't exist, we need to note that we are doing a full
# dump into the tar volume label. Also since we won't have a full
# to diff off of, we are making a full dump and should create the
# $ar.${year}-full.ssf file by duplicating our month's file.
# This will happen when Jan archives run or whenever the
# $ar.${year}-full.ssf is missing
if(not -e "${ar}.${year}-full.ssf") {
$volume_label = "${ar}.${month}${year}.tbz full created: $day $current_month $current_year";
} else {
`cp "${ar}.${year}-full.ssf" "${ar}.${month}${year}.ssf"`;
$volume_label = "${ar}.${month}${year}.tbz diff created: $day $current_month $current_year";
$full_backup_missing = 0;
}
# We stash our new *.${month}${year}.tbz files into a subdirectory
# for that month, making it easier to remove them for long-term
# storage. We also put a copy of our snapshot file there.
if(not -e "${month}${year}") {
mkdir("${month}${year}") or warn "Unable to mkdir ${month}${year}";
}
if(-d "${month}${year}") {
`tar -cjf ${month}${year}/${ar}.${month}${year}.tbz -g ${ar}.${month}${year}.ssf -V '$volume_label' -C $base $ar`;
`cp ${ar}.${month}${year}.ssf ${month}${year}/`;
} else {
warn "File where directory should be ${month}${year}";
`tar -cjf ${ar}.${month}${year}.tbz -g ${ar}.${month}${year}.ssf -V '$volume_label' -C $base $ar`;
}
# Since we didn't have a full backup snapshot, we must have
# just done a full backup
# so make a copy of the month's snapshot as the year's full
if($full_backup_missing) {
`cp ${ar}.${month}${year}.ssf ${ar}.${year}-full.ssf`;
}
if($base eq '/home' and $ar eq 'data')
{
#Special case for the /home/data archive, clean out old files
`find /home/data/mail/ -ctime +1 -exec rm {} \\;`;
`find /home/data/log/ -ctime +1 -exec rm {} \\;`;
`find /home/data/ -ctime +1 -path "/home/data/source_*_*/delivery/*" -exec rm {} \\;`;
}
} else {
#We have our end of the previous month's snapshot file, so we can do
#daily and weekly differentials of everything that has changed since
#the end of that month. Copy the snapshot from the end of the
#prev. month.
`cp ${ar}.${month}${year}.ssf $ar.daily.ssf`;
`tar -cjf $ar.daily.tbz -g $ar.daily.ssf -V '$ar.daily.tbz diff created: $day $current_month $current_year' -C $base $ar`;
if($weekday == 5) {
#Friday backups are renamed (moved) to .weekly. to give a weekly
#differential backup restore point
my $i;
for($i=3;$i>0;$i--) {
#rotate up to 4 weekly historical files
my $j = $i+1;
if(-e "$ar.weekly.$i.tbz") {
`mv $ar.weekly.$i.tbz $ar.weekly.$j.tbz`;
}
if(-e "$ar.weekly.$i.ssf") {
`mv $ar.weekly.$i.ssf $ar.weekly.$j.ssf`;
}
}
`mv $ar.daily.ssf $ar.weekly.1.ssf`;
`mv $ar.daily.tbz $ar.weekly.1.tbz`;
}
}
}
}
`df -h | mail root`;
---------------- end - backup.cron -----------------------------------
Over two weeks, the script will create files in /archive like this:
username.Sep2003.ssf
username.2003-full.ssf
username.daily.tbz
username.daily.ssf
username.weekly.1.tbz
username.weekly.1.ssf
username.weekly.2.tbz
username.weekly.2.ssf
Sep2003/
Sep2003/username.Sep2003.tbz
Sep2003/username.Sep2003.ssf
When restoring or recovering a file, you will need to start with the
full backup and then the monthly and then the daily all using the
--listed-incremental (-g) option with the snapshot file that matches
each archive. When backing these up to removable media (or network
copying) for offsite storage, remember to copy the snapshot files (.ssf)
as well. They are used when extracting to re-delete deleted files that
the previous unarchive may have restored.
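A restore-from-scratch might look like this, oldest to newest. The archive and snapshot names are placeholders following the script's naming scheme (the full dump is the first monthly archive, whichever month the script first ran):

```shell
# Apply full, then latest monthly, then latest daily, in that
# order, each with its matching snapshot file so deletions get
# replayed. Names below are placeholders.
cd /home
tar -xjf /archive/Jan2003/username.Jan2003.tbz -g /archive/Jan2003/username.Jan2003.ssf
tar -xjf /archive/Sep2003/username.Sep2003.tbz -g /archive/Sep2003/username.Sep2003.ssf
tar -xjf /archive/username.daily.tbz -g /archive/username.daily.ssf
```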
The script isn't perfect, but it works for me. Since it's run from cron,
I get an email every day listing the files that it backed up, or the
errors it encountered while trying to back stuff up. The last line sends
me another email reporting the disk usage of the filesystem(s).
Jacob