
Re: simple backup script



I've read the other posts, but am starting back with your original to keep your full questions.

Marcus Schopen wrote:

Hi,

I'm looking for a simple backup script, which uses e.g. dd and additionally does some error handling and mail notification. I use amanda for my daily and weekly backups, but to feel more secure, I installed a second harddrive in my server today. Now I'm looking for a nice and secure script, which does a full backup of the first harddisk each day.

I guess if your 2nd hard disk is the same size or larger than the first and you want an exact copy of everything, then dd might be the way to do it (sketch below). Then if hda dies you could plug in hdb in its place and be up and running as of yesterday. I don't know what your goals are here. If it was just to save yourself from a disk crash, then the RAID options suggested seemed like a good idea. If it is to provide yet another backup because you don't trust your Amanda backups, or just because you want to script something, then I'd suggest a script using tar. You didn't mention dump, and that's probably a good thing:

http://lwn.net/2001/0503/a/lt-dump.php3
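For the dd route, a bare-bones sketch would be something like the following. I'm assuming the disks really are /dev/hda and /dev/hdb and that hdb is at least as big; triple-check the device names, because dd will cheerfully overwrite the wrong disk, and ideally do it with the source filesystems unmounted or read-only:

  # copy the first disk onto the second, block for block,
  # carrying on past read errors
  dd if=/dev/hda of=/dev/hdb bs=64k conv=noerror,sync

  # simple mail notification with the exit status
  echo "dd of hda to hdb finished, exit status $?" | mail -s "disk copy" root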



Secondly I'd like to do a full backup of a remote server, but I'm not sure what's the best way to do e.g. a dd over the local network. Maybe the above script could be used for both.

Thanks,
Marcus



Ok, are you backing up the remote server to this new hard disk or are you backing up the server with the new hard disk to a remote location?

I'm going to run with this idea: You've got a new hard disk that you'd like to use for a second backup system on your network, duplicating the effort of Amanda. (Although, if Amanda has the storage capacity to back up everything, couldn't you just back up its backups?) You would like to see a script that does this.

I write my own backup scripts using Perl to call GNU tar. There's a lot you can do with GNU tar, and even its documentation recommends using a script to do your backup and restore operations rather than calling it by hand. You can get GNU tar to do full (of course), incremental, and differential backups (others too, but they seem to fit within the previous three to me).

Let's say you're just backing up /home (expand this to whatever you want, but if you really just want a full copy of your hard disk, maybe just use dd) and you want to keep all your changes so you can recover it as it was two weeks or two months ago (assuming you have been running the backup process for that long).

First you need to make a 'full dump' backup of /home. Run tar with the --listed-incremental (-g) option, but the snapshot file you specify shouldn't exist yet. That snapshot file will provide the baseline for future backups.
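Roughly like this, as a sketch (the archive and snapshot file names are made up for the example):

  # home-full.ssf doesn't exist yet, so tar creates it and makes a
  # full (level 0) dump of /home
  tar -cjf home-full.tbz -g home-full.ssf -C / home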

Next you decide whether you are going to create incremental or differential backups off of this base backup. Incremental means less archive space used, but to restore from scratch you need every incremental stop point. For example, if you started in January and are doing monthly incrementals, then you would need to extract January (the full), Feb, Mar, Apr, May, Jun, Jul, and August to get to September, then the latest daily incremental off of the August dump. Doing differential backups means more storage space, but you need only three extractions (under the method I will propose) to fully recover.
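The only real difference is which snapshot file you hand to tar. A rough sketch, continuing with the made-up file names from above:

  # incremental: keep one working snapshot that tar updates on every run,
  # so each archive only holds changes since the previous archive
  cp home-full.ssf home-incr.ssf        # once, right after the full dump
  tar -cjf home-incr-$(date +%F).tbz -g home-incr.ssf -C / home

  # differential: start from a fresh copy of the full dump's snapshot each
  # time, so each archive holds everything changed since the full dump
  cp home-full.ssf home-diff.ssf
  tar -cjf home-diff-$(date +%F).tbz -g home-diff.ssf -C / home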

Here is the backup 'game plan' I am proposing for archiving disk A to disk B which is much much larger than the data you are archiving from disk A.

Disclaimers: My 'game plan' does not address off-site storage, which is very important. It is strictly 'use at your own risk', no warranty. It doesn't ask you questions like "What are you backing up to protect yourself against anyway?" It doesn't address file security and protection. It's just an answer to your question, except that it might not be as simple as you were asking for.

I am not addressing the remote backup, but there ought to be a way to do this with scp/ssh and add it to the script, or to a separate script. I'm interested in hearing how others set that up, because I may do it some day.
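Off the top of my head, and completely untested (the host name and remote path are made up), something along these lines ought to work:

  # push a dated archive of /home straight to the backup host over ssh
  tar -cjf - -C / home | ssh backuphost "cat > /archive/remote/home-$(date +%F).tbz"

  # or simply copy the finished archives (and snapshot files) across afterwards
  scp /archive/*.tbz /archive/*.ssf backuphost:/archive/remote/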

Before you look at my script below, I encourage you to check out Adump or Flexbackup:

Adump - http://www.davidb.org/adump/
Flexbackup - http://flexbackup.sourceforge.net/

I do a full dump at least once a year. You can do it more often if you like/have space, but you'll have to move the old full dump archive's snapshot file out of the way so my script will think one hasn't been done. I do monthly differential backups off of the "yearly" full backup. I do daily differential backups off of the latest monthly backup, and archive Friday's daily differential as the 'weekly' backup, keeping four of these on a rotating basis.

This setup allows me to:

- Restore from "bare metal" back to the state of the system at 4:00am (or whenever your cron.daily runs) for whatever I am archiving, with at most three unarchive operations.
- Restore or recover files from yesterday, or as they sat on any of the last four "Fridays" (Saturday morning at 4:00am).
- Restore or recover files as they sat at the end of each month, for as long as I have been making monthly differential backups.
- And of course, restore or recover files as they sat the last time I made a full backup.

I run this script out of /etc/cron.daily. You need to change $arpath = "/archive" to point to your backup location (the mount point of the second hard disk, in your case).
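Installing it is just a matter of dropping it into cron.daily and making it executable (a sketch; adjust the path if your distribution does things differently):

  cp backup.cron /etc/cron.daily/backup.cron
  chmod 755 /etc/cron.daily/backup.cron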

---------------------- backup.cron -----------------------------------
#!/usr/bin/perl
# /etc/cron.daily/backup.cron

use Cwd 'chdir';
#backup runs 4am so we call the run 'yesterday'
my $date;
$date = `date --date yesterday +'%b-%d-%w-%Y'`;
$date =~ s/\r?\n$//;
my ($month,$day,$weekday,$year)=split('-',$date);

my $hostname = `hostname`;
chomp($hostname);

$arpath = "/archive";
chdir($arpath) or die "Unable to chdir to $arpath\n";

`echo $date > lastbackup.txt`;

# Hash used to name the archive made the first day of a new month
# as the cumulative archive for the previous month.
# In other words, July's archive is made the first of August.
my %archive_months = ( 'Jan'=>'Dec','Feb'=>'Jan','Mar'=>'Feb','Apr'=>'Mar',
                       'May'=>'Apr','Jun'=>'May','Jul'=>'Jun','Aug'=>'Jul',
                       'Sep'=>'Aug','Oct'=>'Sep','Nov'=>'Oct','Dec'=>'Nov');
my $current_month = $month;
my $current_year = $year;

# Shift month because on July 2nd we make the June archive.
$month = $archive_months{$current_month};

# Month has been shifted and date is always yesterday,
# so $month will be 'Dec' when the archive is run the morning
# of 2 Jan <year>. But the Dec archive is for <year - 1>, so
# we need to decrement year.
if($month eq 'Dec') { $year--; }

# $base is the archive base, used for
# tar -cf <tarname> -C <base> <folder>
# this avoids the 'warning, removing leading '/'
# from archive members and shortens the path
# depth when restoring into a local directory to
# recover a file
$base = "/home";

# @list is a list of files and directories in $base
# (ie all the user directories in /home)
@list = <$base/*>;

# $skip is a pipe-delimited list of files or directories in $base
# that you don't want to archive. It can be empty ""
$skip = "public|projects";

#archive /home except the above skip list
archive_data($base,\@list,$skip);

# If you want to back up more than just /home
# (like /etc or /var/spool/mail or /usr/local)
# Then just repeat this code:
# $base = "<archive base>";
# @list = <$base/*>;
# $skip = "this|that|theother";
# archive_data($base,\@list,$skip);


# sub archive_data takes an archive base, a reference to a
# list of files or folders in that path to archive, excluding
# the $skip pipe-separated list and non-directory entries.
# In other words, don't try to archive just /etc/hosts.  This
# is coded to archive directories of files, not just individual
# files.

sub archive_data {
 my ($base,$list,$skip) = @_;
 foreach $ar ( @{$list} )
 {
   if($skip ne '') {
     next if ( $ar =~ /($skip)/ );
   }
   # archives directories of files, not just files
   next unless ( -d $ar );
   $ar =~ s#$base/##;

   # We have changed months, time to create the backup for the end of
   # the previous month
   if( not -e "${ar}.${month}${year}.ssf" )
   {
     my $volume_label = '';
     my $full_backup_missing = 1;

     # All monthly differentials are done off of $ar.${year}-full.ssf,
     # if it doesn't exist, we need to note that we are doing a full
     # dump into the tar volume label. Also since we won't have a full
     # to diff off of, we are making a full dump and should create the
     # $ar.${year}-full.ssf file by duplicating our months' file.
     # This will happen when Jan archives run or whenever the
     # $ar.${year}-full.ssf is missing

     if(not -e "${ar}.${year}-full.ssf") {
       $volume_label = "${ar}.${month}${year}.tbz full created: $day $current_month $current_year";
     } else {
       `cp "${ar}.${year}-full.ssf" "${ar}.${month}${year}.ssf"`;
       $volume_label = "${ar}.${month}${year}.tbz diff created: $day $current_month $current_year";
       $full_backup_missing = 0;
     }

     # We stash our new *.${month}${year}.tbz files into a subdirectory
     # for that month, making removing them for longterm storage
     # easier. We also put a copy of our snapshot file there.

     if(not -e "${month}${year}") {
       mkdir("${month}${year}") or warn "Unable to mkdir ${month}${year}";
     }

     if(-d "${month}${year}") {
       `tar -cjf ${month}${year}/${ar}.${month}${year}.tbz -g ${ar}.${month}${year}.ssf -V '$volume_label' -C $base $ar`;
       `cp ${ar}.${month}${year}.ssf ${month}${year}/`;
     } else {
       warn "File where directory should be ${month}${year}";
       `tar -cjf ${ar}.${month}${year}.tbz -g ${ar}.${month}${year}.ssf -V '$volume_label' -C $base $ar`;
     }

     # Since we didn't have a full backup snapshot, we must have
     # just done a full backup
     # so make a copy of the months' snapshot as the year's full
     if($full_backup_missing) {
       `cp ${ar}.${month}${year}.ssf ${ar}.${year}-full.ssf`;
     }

     if($base eq '/home' and $ar eq 'data')
     {
       # Special case for the /home/data archive, clean out old files
       `find /home/data/mail/ -ctime +1 -exec rm {} \\;`;
       `find /home/data/log/ -ctime +1 -exec rm {} \\;`;
       `find /home/data/ -ctime +1 -path "/home/data/source_*_*/delivery/*" -exec rm {} \\;`;
     }


   } else {
     # We have our end of the previous month's snapshot file, we can do
     # daily and weekly differentials of everything that has changed since
     # the end of that month. Copy the snapshot from the end of the prev. month
     `cp ${ar}.${month}${year}.ssf $ar.daily.ssf`;
     `tar -cjf $ar.daily.tbz -g $ar.daily.ssf -V '$ar.daily.tbz diff created: $day $current_month $current_year' -C $base $ar`;
     if($weekday == 5) {
       # Friday backups are renamed (moved) to .weekly. to give a weekly
       # differential backup restore point
       my $i;
       for($i=3;$i>0;$i--) {
         # rotate up to 4 weekly historical files
         my $j = $i+1;
         if(-e "$ar.weekly.$i.tbz") {
           `mv $ar.weekly.$i.tbz $ar.weekly.$j.tbz`;
         }
         if(-e "$ar.weekly.$i.ssf") {
           `mv $ar.weekly.$i.ssf $ar.weekly.$j.ssf`;
         }
       }
       `mv $ar.daily.ssf $ar.weekly.1.ssf`;
       `mv $ar.daily.tbz $ar.weekly.1.tbz`;
     }
   }
 }
}

`df -h | mail root`;


---------------- end - backup.cron -----------------------------------

Over two weeks, the script will create files in /archive like this:

username.Sep2003.ssf
username.2003-full.ssf
username.daily.tbz
username.daily.ssf
username.weekly.1.tbz
username.weekly.1.ssf
username.weekly.2.tbz
username.weekly.2.ssf
Sep2003/
Sep2003/username.Sep2003.tbz
Sep2003/username.Sep2003.ssf

When restoring or recovering a file, you will need to start with the full backup, then the monthly, then the daily, all using the --listed-incremental (-g) option with the snapshot file that matches each archive. When backing these up to removable media (or copying them over the network) for off-site storage, remember to copy the snapshot files (.ssf) as well. They are used when extracting to re-delete files that a previous unarchive restored but that have since been deleted.
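So a full recovery of the 'username' directory from the listing above would look roughly like this. The month of the archive carrying the 'full' volume label will be whichever month the script first ran; I'm using Jan2003 purely as an illustration:

  cd /home    # or a scratch directory, if you only want to fish out one file
  # 1. the monthly archive that carries the 'full' label
  tar -xjf /archive/Jan2003/username.Jan2003.tbz -g /archive/Jan2003/username.Jan2003.ssf
  # 2. the latest monthly differential
  tar -xjf /archive/Sep2003/username.Sep2003.tbz -g /archive/Sep2003/username.Sep2003.ssf
  # 3. the latest daily (or weekly) differential
  tar -xjf /archive/username.daily.tbz -g /archive/username.daily.ssf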

The script isn't perfect, but it works for me. Since it's run from cron, I get an email every day listing the files that it backed up, or the errors it encountered while trying to back things up. The last line sends me another email reporting the disk usage of the filesystem(s).

Jacob


