
Re: simple backup script



I've read the other posts, but am starting back with your original to keep your full questions.

Marcus Schopen wrote:

Hi,

I'm looking for a simple backup script, which uses e.g. dd and additionally does some error handling and mail notification. I use amanda for my daily and weekly backups, but to feel more secure, I installed a second harddrive in my server today. Now I'm looking for a nice and secure script, which does a full backup of the first harddisk each day.

I guess if your 2nd hard disk is the same size or larger than the first and you want an exact copy of everything, then dd might be the way to do it (sketch below). Then if hda dies you could plug in hdb in its place and be up and running as of yesterday. I don't know what your goals are here. If it was just to save yourself from a disk crash, then the RAID options suggested seemed like a good idea. If it is to provide yet another backup because you don't trust your Amanda backups, or just because you want to script something, then I'd suggest a script using tar. You didn't mention dump, and that's probably a good thing:

http://lwn.net/2001/0503/a/lt-dump.php3
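For the dd route, a bare-bones sketch would be something like the following. I'm assuming the disks really are /dev/hda and /dev/hdb and that hdb is at least as big; triple-check the device names, because dd will cheerfully overwrite the wrong disk, and ideally do it with the source filesystems unmounted or read-only:

  # copy the first disk onto the second, block for block,
  # carrying on past read errors
  dd if=/dev/hda of=/dev/hdb bs=64k conv=noerror,sync

  # simple mail notification with the exit status
  echo "dd of hda to hdb finished, exit status $?" | mail -s "disk copy" root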



Secondly I'd like to do a full backup of a remote server, but I'm not sure what's the best way to do e.g. a dd over the local network. Maybe the above script could be used for both.

Thanks,
Marcus



Ok, are you backing up the remote server to this new hard disk or are you backing up the server with the new hard disk to a remote location?

I'm going to run with this idea: You've got a new hard disk that you'd like to use for a second backup system on your network, duplicating the effort of Amanda. (Although, if Amanda has the storage capacity to back up everything, couldn't you just back up its backups?) You would like to see a script that does this.

I write my own backup scripts using Perl to call GNU tar. There's a lot you can do with GNU tar, and even its documentation recommends using a script to do your backup and restore operations rather than calling it by hand. You can get GNU tar to do full (of course), incremental, and differential backups (others too, but they seem to fit within the previous three to me).

Let's say you're just backing up /home (expand this to whatever you want, but if you really just want a full copy of your hard disk, maybe just use dd) and you want to keep all your changes so you can recover it as it was two weeks or two months ago (assuming you have been running the backup process for that long).

First you need to make a 'full dump' backup of /home. Run tar with the --listed-incremental (-g) option, but the snapshot file you specify shouldn't exist yet. That snapshot file will provide the baseline for future backups.
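Roughly like this, as a sketch (the archive and snapshot file names are made up for the example):

  # home-full.ssf doesn't exist yet, so tar creates it and makes a
  # full (level 0) dump of /home
  tar -cjf home-full.tbz -g home-full.ssf -C / home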

Next you decide whether you are going to create incremental or differential backups off of this base backup. Incremental means less archive space used, but to restore from scratch you need every incremental stop point. For example, if you started in January and are doing monthly incrementals, then you would need to extract January (the full), Feb, Mar, Apr, May, Jun, Jul, and August to get to September, then the latest daily incremental off of the August dump. Doing differential backups means more storage space, but you need only three extractions (under the method I will propose) to fully recover.
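The only real difference is which snapshot file you hand to tar. A rough sketch, continuing with the made-up file names from above:

  # incremental: keep one working snapshot that tar updates on every run,
  # so each archive only holds changes since the previous archive
  cp home-full.ssf home-incr.ssf        # once, right after the full dump
  tar -cjf home-incr-$(date +%F).tbz -g home-incr.ssf -C / home

  # differential: start from a fresh copy of the full dump's snapshot each
  # time, so each archive holds everything changed since the full dump
  cp home-full.ssf home-diff.ssf
  tar -cjf home-diff-$(date +%F).tbz -g home-diff.ssf -C / home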

Here is the backup 'game plan' I am proposing for archiving disk A to disk B which is much much larger than the data you are archiving from disk A.

Disclaimers: My 'game plan' does not address off-site storage, which is very important. It is strictly 'use at your own risk', no warranty. It doesn't ask you questions like "What are you backing up to protect yourself against anyway?" It doesn't address file security and protection. It's just an answer to your question, except that it might not be as simple as you were asking for.

I am not addressing the remote backup, but there ought to be a way to do this with scp/ssh and add it to the script, or to a separate script. I'm interested in hearing how others set that up, because I may do it some day.
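Off the top of my head, and completely untested (the host name and remote path are made up), something along these lines ought to work:

  # push a dated archive of /home straight to the backup host over ssh
  tar -cjf - -C / home | ssh backuphost "cat > /archive/remote/home-$(date +%F).tbz"

  # or simply copy the finished archives (and snapshot files) across afterwards
  scp /archive/*.tbz /archive/*.ssf backuphost:/archive/remote/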

Before you look at my script below, I encourage you to check out Adump or Flexbackup:

Adump - http://www.davidb.org/adump/
Flexbackup - http://flexbackup.sourceforge.net/

I do a full dump at least once a year. You can do it more often if you like/have space, but you'll have to move the old full dump archive's snapshot file out of the way so my script will think one hasn't been done. I do monthly differential backups off of the "yearly" full backup. I do daily differential backups off of the latest monthly backup, and archive Friday's daily differential as the 'weekly' backup, keeping four of these on a rotating basis.

This setup allows me to:

- Restore from "bare metal" back to the state of the system at 4:00am (or whenever your cron.daily runs) for whatever I am archiving, with at most three unarchive operations.
- Restore or recover files from yesterday, or as they sat on any of the last four "Fridays" (Saturday morning at 4:00am).
- Restore or recover files as they sat at the end of each month, for as long as I have been making monthly differential backups.
- And of course, restore or recover files as they sat the last time I made a full backup.

I run this script out of /etc/cron.daily. You need to change $arpath = "/archive" to point to your backup location (the mount point of the second hard disk, in your case).
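Installing it is just a matter of dropping it into cron.daily and making it executable (a sketch; adjust the path if your distribution does things differently):

  cp backup.cron /etc/cron.daily/backup.cron
  chmod 755 /etc/cron.daily/backup.cron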

---------------------- backup.cron -----------------------------------
#!/usr/bin/perl
# /etc/cron.daily/backup.cron

use Cwd 'chdir';
#backup runs 4am so we call the run 'yesterday'
my $date;
$date = `date --date yesterday +'%b-%d-%w-%Y'`;
$date =~ s/\r?\n$//;
my ($month,$day,$weekday,$year)=split('-',$date);

my $hostname = `hostname`;
chomp($hostname);

$arpath = "/archive";
chdir($arpath) or die "Unable to chdir to $arpath\n";

`echo $date > lastbackup.txt`;

# Hash used to name the archive made the first day of a new month
# as the cumulative archive for the previous month.
# In other words, July's archive is made the first of August.
my %archive_months = ( 'Jan'=>'Dec','Feb'=>'Jan','Mar'=>'Feb','Apr'=>'Mar',
                       'May'=>'Apr','Jun'=>'May','Jul'=>'Jun','Aug'=>'Jul',
                       'Sep'=>'Aug','Oct'=>'Sep','Nov'=>'Oct','Dec'=>'Nov');
my $current_month = $month;
my $current_year = $year;

# Shift month because on July 2nd we make the June archive.
$month = $archive_months{$current_month};

# Month has been shifted and date is always yesterday,
# so $month will be 'Dec' when the archive is run the morning
# of 2 Jan <year>. But the Dec archive is for <year - 1>, so
# we need to decrement year.
if($month eq 'Dec') { $year--; }

# $base is the archive base, used for
# tar -cf <tarname> -C <base> <folder>
# this avoids the 'warning, removing leading '/'
# from archive members and shortens the path
# depth when restoring into a local directory to
# recover a file
$base = "/home";

# @list is a list of files and directories in $base
# (ie all the user directories in /home)
@list = <$base/*>;

# $skip is a pipe-delimited list of files or directories in $base
# that you don't want to archive. It can be empty ""
$skip = "public|projects";

#archive /home except the above skip list
archive_data($base,\@list,$skip);

# If you want to back up more than just /home
# (like /etc or /var/spool/mail or /usr/local)
# Then just repeat this code:
# $base = "<archive base>";
# @list = <$base/*>;
# $skip = "this|that|theother";
# archive_data($base,\@list,$skip);


# sub archive_data takes an archive base, a reference to a
# list of files or folders in that path to archive, excluding
# the $skip pipe-separated list and non-directory entries.
# In other words, don't try to archive just /etc/hosts.  This
# is coded to archive directories of files, not just individual
# files.

sub archive_data {
 my ($base,$list,$skip) = @_;
 foreach $ar ( @{$list} )
 {
   if($skip ne '') {
     next if ( $ar =~ /($skip)/ );
   }
   # archives directories of files, not just files
   next unless ( -d $ar );
   $ar =~ s#$base/##;

   # We have changed months, time to create the backup for the end of
   # the previous month
   if( not -e "${ar}.${month}${year}.ssf" )
   {
     my $volume_label = '';
     my $full_backup_missing = 1;

     # All monthly differentials are done off of $ar.${year}-full.ssf,
     # if it doesn't exist, we need to note that we are doing a full
     # dump into the tar volume label. Also since we won't have a full
     # to diff off of, we are making a full dump and should create the
     # $ar.${year}-full.ssf file by duplicating our months' file.
     # This will happen when Jan archives run or whenever the
     # $ar.${year}-full.ssf is missing

     if(not -e "${ar}.${year}-full.ssf") {
       $volume_label = "${ar}.${month}${year}.tbz full created: $day $current_month $current_year";
     } else {
       `cp "${ar}.${year}-full.ssf" "${ar}.${month}${year}.ssf"`;
       $volume_label = "${ar}.${month}${year}.tbz diff created: $day $current_month $current_year";
       $full_backup_missing = 0;
     }

     # We stash our new *.${month}${year}.tbz files into a subdirectory
     # for that month, making removing them for longterm storage
     # easier. We also put a copy of our snapshot file there.

     if(not -e "${month}${year}") {
       mkdir("${month}${year}") or warn "Unable to mkdir ${month}${year}";
     }

     if(-d "${month}${year}") {
       `tar -cjf ${month}${year}/${ar}.${month}${year}.tbz -g ${ar}.${month}${year}.ssf -V '$volume_label' -C $base $ar`;
       `cp ${ar}.${month}${year}.ssf ${month}${year}/`;
     } else {
       warn "File where directory should be ${month}${year}";
       `tar -cjf ${ar}.${month}${year}.tbz -g ${ar}.${month}${year}.ssf -V '$volume_label' -C $base $ar`;
     }

     # Since we didn't have a full backup snapshot, we must have
     # just done a full backup
     # so make a copy of the months' snapshot as the year's full
     if($full_backup_missing) {
       `cp ${ar}.${month}${year}.ssf ${ar}.${year}-full.ssf`;
     }

     if($base eq '/home' and $ar eq 'data')
     {
       # Special case for the /home/data archive, clean out old files
       `find /home/data/mail/ -ctime +1 -exec rm {} \\;`;
       `find /home/data/log/ -ctime +1 -exec rm {} \\;`;
       `find /home/data/ -ctime +1 -path "/home/data/source_*_*/delivery/*" -exec rm {} \\;`;
     }


   } else {
     # We have our end of the previous month's snapshot file, we can do
     # daily and weekly differentials of everything that has changed since
     # the end of that month. Copy the snapshot from the end of the prev. month
     `cp ${ar}.${month}${year}.ssf $ar.daily.ssf`;
     `tar -cjf $ar.daily.tbz -g $ar.daily.ssf -V '$ar.daily.tbz diff created: $day $current_month $current_year' -C $base $ar`;
     if($weekday == 5) {
       # Friday backups are renamed (moved) to .weekly. to give a weekly
       # differential backup restore point
       my $i;
       for($i=3;$i>0;$i--) {
         # rotate up to 4 weekly historical files
         my $j = $i+1;
         if(-e "$ar.weekly.$i.tbz") {
           `mv $ar.weekly.$i.tbz $ar.weekly.$j.tbz`;
         }
         if(-e "$ar.weekly.$i.ssf") {
           `mv $ar.weekly.$i.ssf $ar.weekly.$j.ssf`;
         }
       }
       `mv $ar.daily.ssf $ar.weekly.1.ssf`;
       `mv $ar.daily.tbz $ar.weekly.1.tbz`;
     }
   }
 }
}

`df -h | mail root`;


---------------- end - backup.cron -----------------------------------

Over two weeks, the script will create files in /archive like this:

username.Sep2003.ssf
username.2003-full.ssf
username.daily.tbz
username.daily.ssf
username.weekly.1.tbz
username.weekly.1.ssf
username.weekly.2.tbz
username.weekly.2.ssf
Sep2003/
Sep2003/username.Sep2003.tbz
Sep2003/username.Sep2003.ssf

When restoring or recovering a file, you will need to start with the full backup, then the monthly, then the daily, all using the --listed-incremental (-g) option with the snapshot file that matches each archive. When backing these up to removable media (or copying them over the network) for off-site storage, remember to copy the snapshot files (.ssf) as well. They are used when extracting to re-delete files that a previous unarchive restored but that have since been deleted.
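So a full recovery of the 'username' directory from the listing above would look roughly like this. The month of the archive carrying the 'full' volume label will be whichever month the script first ran; I'm using Jan2003 purely as an illustration:

  cd /home    # or a scratch directory, if you only want to fish out one file
  # 1. the monthly archive that carries the 'full' label
  tar -xjf /archive/Jan2003/username.Jan2003.tbz -g /archive/Jan2003/username.Jan2003.ssf
  # 2. the latest monthly differential
  tar -xjf /archive/Sep2003/username.Sep2003.tbz -g /archive/Sep2003/username.Sep2003.ssf
  # 3. the latest daily (or weekly) differential
  tar -xjf /archive/username.daily.tbz -g /archive/username.daily.ssf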

The script isn't perfect, but it works for me. Since it's run from cron, I get an email every day listing the files that it backed up, or the errors it encountered while trying to back things up. The last line sends me another email reporting the disk usage of the filesystem(s).

Jacob


