
Re: Problems with making hardlink-based backups



Andrew Sackville-West wrote:
On Fri, Aug 14, 2009 at 08:43:32AM +0200, David wrote:
Thanks for your suggestion, and I have heard of rsnapshot.

Actually removing older snapshot directories isn't really the problem, though.

The problem is that with a large number of such backups (perhaps
one per server), finding out where hard drive space is actually
being used becomes difficult (when your backup server starts running low
on disk space).

Keep each server's backups in a distinctly separate location. That
should make it clear which machines are burning up space.

du worked pretty well with rdiff-backup, but is very problematic with
a large number of hardlink-based snapshots, each of which has a complete
"copy" of a massive filesystem (rather than just information on which
files changed).
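To illustrate the du problem (the paths below are made up purely for the demo): du charges a hardlinked file's size to whichever directory it scans first in a given invocation, so each snapshot looks full-sized when measured on its own, and only a single du run over all the snapshots together shows where the space really goes.

```shell
#!/bin/sh
# Demo of du's hardlink accounting; paths are illustrative only.
set -e
tmp=$(mktemp -d)
mkdir "$tmp/snap.0" "$tmp/snap.1"
dd if=/dev/zero of="$tmp/snap.0/big" bs=1024 count=1024 2>/dev/null
ln "$tmp/snap.0/big" "$tmp/snap.1/big"    # second "snapshot" is just a hardlink

du -sk "$tmp/snap.1"                  # scanned alone: reports the full ~1024K
du -sk "$tmp/snap.0" "$tmp/snap.1"    # one invocation: the inode is counted once
rm -rf "$tmp"
```

So per-snapshot du totals overstate usage enormously once you have dozens of hardlinked snapshots, which is exactly the accounting headache described above.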

But they're not copies; they're hardlinks. I guess I don't understand
the problem. In a scheme like that used by rsnapshot, a file is only
*copied* once. If it remains unchanged, then subsequent backup
directories only carry a hardlink to it. When older backups are
deleted, the remaining hardlinks keep the file around, but no extra
room is used: there are only *pointers* to the file lying around. When
the file changes, a new copy is made, and subsequent backups
hardlink to the new file. Now you'll be using the space of two files,
with different sets of hardlinks pointing to each. (I'm sure you know
this; just making sure we're on common ground.)
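For common ground, here's a minimal sketch of that scheme using throwaway paths: cp -al makes a snapshot out of hardlinks, and a changed file must *replace* (not overwrite in place) the old one, so the older snapshot's copy survives.

```shell
#!/bin/sh
# Minimal hardlink-snapshot sketch; paths are throwaway examples.
set -e
tmp=$(mktemp -d)
mkdir "$tmp/backup.0"
echo "v1" > "$tmp/backup.0/file"
cp -al "$tmp/backup.0" "$tmp/backup.1"   # new snapshot: only hardlinks, no data copied
stat -c %h "$tmp/backup.0/file"          # prints 2: both snapshots share one inode

# A changed file must replace the old one so the shared inode survives
echo "v2" > "$tmp/backup.1/file.tmp"
mv "$tmp/backup.1/file.tmp" "$tmp/backup.1/file"

cat "$tmp/backup.0/file"                 # prints "v1": the old snapshot is untouched
rm -rf "$tmp"
```

Note that writing straight into the hardlinked file (echo "v2" > backup.1/file) would have truncated the shared inode and silently changed every older snapshot too, which is why rsync and cp -alf replace files rather than edit them in place.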


I'm not sure I understood what you are after either. Admittedly this is on a rather small home server, but I use the cp -alf command so that only changed files take up extra space, while snapshots are kept for a long time.

This is the cron.daily backup script; I have similar cron.weekly and cron.monthly scripts.


# $ARCH is the snapshot root directory, set earlier in the script
if [ -d "$ARCH/daily.6" ] ; then
	if [ ! -d "$ARCH/weekly.1" ] ; then mkdir -p "$ARCH/weekly.1" ; fi
# Now merge in stuff here with what might already be there using hard links
	cp -alf "$ARCH/daily.6"/* "$ARCH/weekly.1"
# Finally lose the rest
	rm -rf "$ARCH/daily.6"
fi
# Shift along snapshots
if [ -d "$ARCH/daily.5" ] ; then mv "$ARCH/daily.5" "$ARCH/daily.6" ; fi
if [ -d "$ARCH/daily.4" ] ; then mv "$ARCH/daily.4" "$ARCH/daily.5" ; fi
if [ -d "$ARCH/daily.3" ] ; then mv "$ARCH/daily.3" "$ARCH/daily.4" ; fi
if [ -d "$ARCH/daily.2" ] ; then mv "$ARCH/daily.2" "$ARCH/daily.3" ; fi
if [ -d "$ARCH/daily.1" ] ; then mv "$ARCH/daily.1" "$ARCH/daily.2" ; fi
if [ -d "$ARCH/snap" ] ; then mv "$ARCH/snap" "$ARCH/daily.1" ; fi

# Collect new snapshot archive stuff, doing the daily backup on the way

mkdir -p "$ARCH/snap"


This leads to daily backups for a week, weekly backups for a month, and monthly backups until I archive them into long-term storage (writing a DVD, although having heard stories about problems even with those, it might be easier just to leave them on disk).



CDARCH=/bak/archive/CDarch-$(date +%Y)

if [ -d "$ARCH/monthly.6" ] ; then

	if [ ! -d "$CDARCH" ] ; then mkdir -p "$CDARCH" ; fi
	cp -alf "$ARCH/monthly.6"/* "$CDARCH"

	rm -rf "$ARCH/monthly.6"
fi


The backup process uses something like the following to keep an initial backup and to save any changed file into this long-term storage. This is just one part of the backup; other machines and other filesystems use a similar mechanism with just the parameters changed.

rsync -aHqz --delete --backup --backup-dir=$ARCH/snap/freeswitch/ $MACH::freeswitch/ /bak/freeswitch/


--
Alan Chandler
http://www.chandlerfamily.org.uk

