
Improved anonftpsync script



[Please Cc me for replies, I'm not subscribed to this list]

Hello All,

The anonymous rsync mirroring script available from
http://www.debian.org/mirror/anonftpsync
has several problems:

- It fails to exclude some arch-specific stuff, namely *.changes and
  Contents-*.gz
- It also doesn't exclude arch-specific parts of debian-installer,
  that is *.udeb and installer-*
- The "set +e" before the rsync keeps the removal of the lockfile
  from being triggered, so stale lockfiles may remain.

I guess the "set +e" was there so that a log still gets saved when the
rsync fails, but calling savelog before the rsync is better, and leaving
"set -e" in effect also prevents the timestamp update when the rsync
went wrong.
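
The effect can be tried in isolation. The following is only a sketch:
demo_sync and the demo.lock/demo.stamp file names are made up for
illustration, and "false" stands in for a failing rsync. With "set -e" in
effect the failing command ends the child shell, the EXIT trap still
removes the lock, and the timestamp line is never reached.

```shell
#!/bin/sh
# Sketch only: demo_sync and the demo.* file names are hypothetical and
# not part of anonftpsync. "false" stands in for a failing rsync run.
demo_sync ()
{
	# Use a child shell so that "set -e" ends only the demonstration.
	sh -c '
		set -e
		lock="$1/demo.lock"
		: > "$lock"
		trap "rm -f \"$lock\"" EXIT
		false                       # the "rsync" fails, the shell exits here
		date -u > "$1/demo.stamp"   # never reached, so no timestamp update
	' demo_sync "$1"
}

demo_dir=$(mktemp -d)
demo_sync "$demo_dir" || echo "sync failed"
ls -A "$demo_dir"    # lists nothing: no stale lock, no timestamp
rmdir "$demo_dir"
```

The child shell is there only to keep the demonstration from ending the
calling shell; in a real mirror script "set -e" and the trap would sit in
the script itself.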

In the end, I rewrote the whole script in order to fix those flaws.
It's appended to this mail, and I think it's a better example of
a mirror script. I haven't tested it on non-Debian systems, though,
so I may have missed some portability issues.
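
One portability point can be anticipated: the lockfile command used in
the script comes with procmail and may be missing elsewhere. A possible
fallback, sketched here as an assumption rather than something the script
does, is mkdir, which either creates the directory or fails in one atomic
step. The function names are hypothetical, and unlike "lockfile -l" this
sketch does not expire stale locks.

```shell
#!/bin/sh
# Sketch only: acquire_lock/release_lock are hypothetical helpers, not part
# of anonftpsync. mkdir fails if the directory already exists, so only one
# caller can obtain the lock.
acquire_lock ()
{
	mkdir "$1" 2> /dev/null
}

release_lock ()
{
	rmdir "$1" 2> /dev/null
}

demo_lock="${TMPDIR:-/tmp}/demo-lock.$$"
if acquire_lock "$demo_lock"; then
	echo "lock taken"
	acquire_lock "$demo_lock" || echo "second attempt refused"
	release_lock "$demo_lock"
else
	echo "unable to start, lock exists"
fi
```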


Thiemo


#! /bin/sh
#
# Copyright 2004  Thiemo Seufer <seufer@csv.ica.uni-stuttgart.de>
#
# This file is distributed under the terms of the GNU General Public License,
# Version 2, as published at http://www.gnu.org/licenses/gpl.html
#
# This script originates from http://www.debian.org/mirror/anonftpsync
#
# Note: You MUST have rsync 2.0.16-1 or newer, which is available in slink
# and all newer Debian releases, or at http://rsync.samba.org/
#
# Set the variables at the end of the file to fit your site. You can then use
# cron to have this script run daily to automatically update your copy of the
# archive.
#
# Don't forget:
# chmod 744 anonftpsync

# ---------------------------------------------------------------------------
# There should be no need to edit anything in this section, unless there are
# problems. See the end of the file for customization.
set -e

exclude_arch ()
{
	while [ $# -ge 1 ]; do
		echo -n "--exclude binary-${1}/ --exclude *_${1}.deb --exclude *_${1}.changes --exclude Contents-${1}.gz --exclude disks-${1}/ --exclude *_${1}.udeb --exclude installer-${1}/ "
		shift
	done
	return 0
}

exclude_source ()
{
	case "$EXCLUDE_SOURCE" in
	# The patterns must stay unquoted; quoted, they only match a literal "y*".
	y* | Y*) echo -n "--exclude source/ --exclude *.orig.tar.gz --exclude *.diff.gz --exclude *.dsc" ;;
	esac
}

do_mirror ()
{
	_rsync_host=$1
	_rsync_dir=$2
	_target_dir=$3

	# Get in the right directory and set the umask to be group writable.
	cd $HOME
	umask 002

	# Note: on some non-Debian systems, hostname doesn't accept the -f option.
	# If that's the case on your system, make sure hostname prints the full
	# hostname, and remove the -f option. If there's no hostname command,
	# explicitly replace `hostname -f` with the hostname.
	_target_host=`hostname -f`

	LOCK="${_target_dir}/Archive-Update-in-Progress-${_target_host}"

	mkdir -p ${_target_dir}/project/trace

	# Check to see if another sync is in progress
	if lockfile -! -l 43200 -r 0 "$LOCK"; then
		echo ${_target_host} is unable to start rsync, lock file exists
		return 1
	fi

	# Note: on some non-Debian systems, trap doesn't accept "EXIT"
	# as signal specification. If that's the case on your system,
	# try using "0".
	trap "rm -f $LOCK > /dev/null 2>&1" EXIT

	# Note: if you don't have savelog, use any other log rotation facility, or
	# comment this out; the log will then simply be overwritten each time.
	savelog ${_target_dir}/project/trace/rsync.log > /dev/null 2>&1

	rsync --recursive --links --hard-links --times --verbose \
		--compress --delete $RSYNC_OTHEROPTS \
		--exclude "Archive-Update-in-Progress-${_target_host}" \
		--exclude "project/trace/*" \
		`exclude_arch $EXCLUDE_ARCH` \
		`exclude_source` \
		$EXCLUDE_OTHER \
		${_rsync_host}::${_rsync_dir} ${_target_dir} > ${_target_dir}/project/trace/rsync.log 2>&1
	date -u > "${_target_dir}/project/trace/${_target_host}"
	rm -f $LOCK > /dev/null 2>&1

	return 0
}

# ---------------------------------------------------------------
# Customize this section

# Things to exclude.
# With blank EXCLUDE_* you will mirror the entire archive.

# This sample would exclude all architectures:
# EXCLUDE_ARCH="alpha arm m68k mips mipsel powerpc sparc i386 ia64 hppa sh s390 hurd-i386"
EXCLUDE_ARCH=

# This sample would exclude the source code:
# EXCLUDE_SOURCE="y"
EXCLUDE_SOURCE="n"

# Exclude other things, like some symlinks or sections.
# The --exclude option must be given in this case.
# Samples:
# EXCLUDE_OTHER="--exclude stable/ --exclude testing/ --exclude unstable/"
# EXCLUDE_OTHER="--exclude /contrib/ --exclude /non-free/"
#
# Exclude old distributions. Note that this won't work well for newer
# ones which have their files in a pool directory.
# EXCLUDE_OTHER="--exclude /slink/ --exclude /slink-proposed-updates/"
# EXCLUDE_OTHER="--exclude /potato/ --exclude /potato-proposed-updates/"
EXCLUDE_OTHER=

# Additional options for rsync
# Common options might include bwlimit (but see Debian bug #181336)
# Sample:
# RSYNC_OTHEROPTS="--bwlimit=23 --safe-links --delete-after"
RSYNC_OTHEROPTS=

# Trees to mirror
#
#   do_mirror RSYNC_HOST RSYNC_DIR TARGET_DIR
#
# RSYNC_HOST is the site you have chosen from the mirrors file.
# (http://www.debian.org/mirror/list-full)
#
# RSYNC_DIR is the directory given in the "Packages over rsync:" line of
# the mirrors file for the site you have chosen to mirror.
#
# TARGET_DIR is the destination for the base of the Debian mirror directory
# (the dir that holds dists/ and ls-lR).
#
# Samples:
# do_mirror rsync.example.org debian/ /bigdisk/mirror/debian
# do_mirror rsync.example.org debian-non-US/ /bigdisk/mirror/debian-non-US
# do_mirror rsync.example.org debian-security/ /bigdisk/mirror/debian-security

# End of customize section
# ---------------------------------------------------------------
exit 0


