
[Fwd: partial mirroring script]



This is a script for mirroring woody using rsync.


-------- Original Message --------
From: Marco d'Itri <md@Linux.IT>
Subject: partial mirroring script
Resent-From: debian-devel@lists.debian.org
To: debian-devel@lists.debian.org

This script actually works.

#!/bin/bash -e
# Anon rsync partial mirror of Debian with package pool support.
# Copyright 1999, 2000 by Joey Hess <joeyh@debian.org>, GPL'd.
# Copyright 2000 by Marco d'Itri <md@linux.it>.

#FLAGS_NODO="-n"
FLAGS_VERBOSE="--stats -v"

# Flags to pass to rsync. More can be specified on the command line.
# These flags are always passed to rsync:
FLAGS="$@ -rLpt --partial"
# These flags are not passed in when we are getting files from pools.
# In particular, --delete is a horrid idea at that point, but good here.
FLAGS_NOPOOL="$FLAGS $FLAGS_VERBOSE --exclude Packages --delete"
# And these flags are passed in only when we are getting files from pools.
# Remember, do _not_ include --delete.
FLAGS_POOL="$FLAGS $FLAGS_VERBOSE"

# The host to connect to (with rsync module name appended).
HOST=attila.bofh.it::debian
#HOST=attila.bofh.it::debian-non-US
# Where to put the mirror (absolute path, please):
DEST=/home/ftp/debian
#DEST=/home/ftp/debian-non-US

# The distribution to mirror:
DIST=woody
# Architecture to mirror:
ARCH=i386
# Should source be mirrored too?
SOURCE=no
# The sections to mirror (main, non-free, etc):
SECTIONS="main contrib non-free"
#SECTIONS="non-US/main non-US/contrib non-US/non-free"

# Should a contents file be kept up to date?
CONTENTS=no
# Should symlinks be generated to every deb, in an "all" directory?
# I find this is very handy to ease looking up deb filenames.
SYMLINK_FARM=no

###############################################################################
rsync () {
	if [ "$FLAGS_VERBOSE" ]; then
		echo "======================================================="
		echo "rsync $@"
	fi
	/usr/bin/rsync "$@"
}

if [ "$SOURCE" = yes ]; then
	SOURCE=source
else
	SOURCE=""
fi

HOSTNAME=`hostname --fqdn`
LOCK="${DEST}/Archive-Update-in-Progress-${HOSTNAME}"

###############################################################################

date -u > $LOCK

# Snarf the contents file.
if [ "$CONTENTS" = yes ]; then
	mkdir -p $DEST/dists/$DIST
	rsync $FLAGS_NOPOOL $FLAGS_NODO \
		$HOST/dists/$DIST/Contents-${ARCH}.gz \
		$DEST/dists/$DIST
fi

# Download packages files (and .debs and sources too, until we move fully
# to pools).
for type in binary-all binary-${ARCH} disks-${ARCH} $SOURCE; do
	for section in $SECTIONS; do
		if [ $type = disks-${ARCH} -a $section != main ]; then continue; fi
		mkdir -p $DEST/dists/$DIST/$section/$type
		rsync $FLAGS_NOPOOL $FLAGS_NODO \
			$HOST/dists/$DIST/$section/$type \
			$DEST/dists/$DIST/$section/
	done
done

# Update the package pool.
# TODO: probably needs to be optimized, we'll see as time goes by..
mkdir -p $DEST/pool
cd $DEST/pool || exit 1
: > .filelist

# Get a list of all the files that are in the pool based on the Packages
# files that were already updated. Thanks to aj for the awk-fu.
for file in `find $DEST/dists/$DIST -name Packages.gz | \
			xargs -r zgrep -i ^Filename: | cut -d ' ' -f 2 | \
			grep ^pool/` \
	    `find $DEST/dists/$DIST -name Sources.gz | xargs -r zcat | \
			awk '/^Directory:/ {D=$2} /Files:/,/^$/ { \
			     if ($1 != "Files:" && $0 != "") print D "/" $3; \
			}' | grep ^pool/`
do
	DIRS="`dirname $file` $DIRS"
	echo $file >> .filelist
done

if [ -e .filelist -a ! -s .filelist ]; then
  echo "WARNING: empty .filelist!"
fi

# Remove leading "pool" from all files in the file list.
# The "./" we change it to is there so the file names
# exactly match in the delete step and the files that get downloaded
# are not deleted.
sed 's!^pool/!./!' .filelist > .filelist.new
mv -f .filelist.new .filelist

(cd .. && mkdir -p $DIRS)
# Tell rsync to download only the files in the list. The exclude is here 
# to make the recursion not get anything else.
rsync $FLAGS_POOL $FLAGS_NODO \
	--include '*/' --include-from .filelist --exclude '*' $HOST/pool/ .

# Delete all files that are not in the list, then any empty directories.
# This also kills the filelist.
find -type f | fgrep -vxf .filelist | xargs -r rm -f
find -type d -empty | xargs -r rmdir -p --ignore-fail-on-non-empty
# End of package pool update.

# Update symlinks (I like to have a link to every .deb in one directory).
if [ "$SYMLINK_FARM" = yes ]; then
	install -d $DEST/all
	cd $DEST/all || exit 1
	find -name \*.deb | xargs -r rm -f
	find .. -name "*.deb" -type f | grep -v ^../all | \
		xargs -r -I {} ln -sf {} .
fi

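# Update the timestamp (create the trace directory first if it is missing,
# otherwise the redirect fails under -e and the lock file is left behind).
mkdir -p ${DEST}/project/trace
date -u > ${DEST}/project/trace/${HOSTNAME}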
rm $LOCK
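
The pool pruning step is the part most likely to surprise, so here is a minimal sketch of it run against a throwaway directory. It is not part of Marco's script: the temporary directory, the `demo` variable, and the two package file names are made up for illustration, and it assumes GNU find, grep, xargs and rmdir. It shows why "pool/" is rewritten to "./": the list entries must be identical, character for character, to what find prints from inside the pool directory, because the match is on whole lines.

```shell
#!/bin/bash
set -e

# Hypothetical scratch area standing in for $DEST; nothing here touches a real mirror.
demo=$(mktemp -d)
mkdir -p "$demo/pool/main/a/apt" "$demo/pool/main/o/old"
cd "$demo/pool"
touch main/a/apt/apt_0.5.deb main/o/old/old_1.0.deb

# Pretend the Packages files listed exactly one pool file.
echo "pool/main/a/apt/apt_0.5.deb" > .filelist

# Same rewrite as in the script: "pool/" becomes "./".
sed 's!^pool/!./!' .filelist > .filelist.new
mv -f .filelist.new .filelist

# Delete every file that is not an exact whole-line match in the list.
# .filelist is not listed in itself, so it is deleted too.
find . -type f | grep -Fvxf .filelist | xargs -r rm -f

# Then remove directories that were left empty, together with empty parents.
find . -type d -empty | xargs -r rmdir -p --ignore-fail-on-non-empty

# Show what survived.
find . -type f
```

Only `./main/a/apt/apt_0.5.deb` should survive; `main/o/old` and `.filelist` are gone. `grep -F` is the current spelling of the `fgrep` used in the script.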

-- 
ciao,
Marco


-- 
To UNSUBSCRIBE, email to debian-devel-request@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org


