
Re: partial mirroring script (that actually doesn't work)



Greetings,
I administer a small partial mirror at my school and was trying to get
it to handle stable and testing. I used the script from the previous
message in this thread, but it doesn't work, at least not completely.
I had to make a few modifications to it, and it now populates the
mirror OK, but I get nothing under pool. Is this because only sid is
populating the pool? If so, are there going to be any packages in the
pool from testing before the next release?
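
In case it helps anyone diagnose the same thing, here is a rough sketch
(not part of the attached script) that counts where the Filename:
entries in the mirrored Packages.gz files point. The helper name
count_filename_roots is made up for illustration, and it assumes GNU
find and xargs. If everything is listed under dists/ and nothing under
pool/, the pool step has nothing to fetch.

```shell
# Count Filename: entries by their top-level directory (pool, dists, ...).
# count_filename_roots is a hypothetical helper; $1 is the local dists/ dir.
count_filename_roots () {
   find "$1" -name Packages.gz -print0 |
      xargs -0 -r gzip -dc |
      awk '/^Filename:/ { split($2, p, "/"); n[p[1]]++ }
           END { for (r in n) print r, n[r] }' |
      sort
}
```

With the settings in the attached script it would be called as
count_filename_roots /home/ftp/debian/dists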

Also, is there any way to check the integrity of a partial mirror? It
would be nice to know that everything is as it should be.
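
The closest thing I can think of is a sketch like the one below, which
is not part of the attached script. It assumes the Packages.gz files
carry Filename: and MD5sum: fields, that md5sum, GNU find and GNU xargs
are available, and that each Packages file ends with a blank line. The
helper name check_mirror is made up for illustration. It only checks
files named in the Packages indices, so sources and boot disks are not
covered.

```shell
# Verify that every file named in the local Packages.gz indices exists
# under the mirror root ($1) and matches the recorded MD5sum.
# check_mirror is a hypothetical helper name.
check_mirror () {
   root=$1
   find "$root/dists" -name Packages.gz -print0 |
      xargs -0 -r gzip -dc |
      awk '/^Filename:/ { f = $2 }
           /^MD5sum:/   { m = $2 }
           /^$/ { if (f != "" && m != "") print m, f; f = ""; m = "" }
           END  { if (f != "" && m != "") print m, f }' |
      {
         bad=0
         while read -r sum name; do
            if [ ! -f "$root/$name" ]; then
               echo "MISSING $name"; bad=1
            elif [ "$(md5sum < "$root/$name" | cut -d ' ' -f 1)" != "$sum" ]; then
               echo "CORRUPT $name"; bad=1
            fi
         done
         # The exit status of this group becomes the function's status.
         [ "$bad" -eq 0 ]
      }
}
```

With the settings in the attached script it would be called as
check_mirror /home/ftp/debian, and a non-zero exit status means at
least one file is missing or does not match.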

I have attached the modified script for reference.

Does anybody have plans to implement a standard method for partial
mirroring? There seems to be fairly high demand for it. Also, is
w.d.o/mirror going to be updated with a pool-aware script anytime
soon?



-- 
Frisco Rose             "By any other name, I would smell the same"
E.O.U. Stud.             rosef@quark.eou.edu         rosef@eou.edu
Physics                  Mathematics 	          Computer Science

INTACT Director

#!/bin/bash -e
# Anon rsync partial mirror of Debian with package pool support.
# Copyright 1999, 2000 by Joey Hess <joeyh@debian.org>, GPL'd.
# Copyright 2000 by Marco d'Itri <md@linux.it>.
# Copyright 2000 by Frisco Rose <rosef@physics.eou.edu>


#FLAGS_NODO="-n"
FLAGS_VERBOSE="--stats -v"

# Flags to pass to rsync. More can be specified on the command line.
# These flags are always passed to rsync:
FLAGS="$* -rLpt --partial"

# These flags are not passed in when we are getting files from pools.
# In particular, --delete is a horrid idea at that point, but good here.
FLAGS_NOPOOL="$FLAGS $FLAGS_VERBOSE --exclude Packages --delete"

# And these flags are passed in only when we are getting files from pools.
# Remember, do _not_ include --delete.
FLAGS_POOL="$FLAGS $FLAGS_VERBOSE"

# Until debian and debian-non-US are united we need to keep them separate
ROOT=debian
#ROOT=debian-non-US

# The host to connect to (with rsync module name appended).
HOST=ftp.ca.debian.org::$ROOT

# Where to put the mirror (absolute path, please):
DEST=/home/ftp

# Architecture to mirror:
ARCH=i386

# Should source be mirrored too?
SOURCE=no

# The sections to mirror (main, non-free, etc):
SECTIONS="main contrib non-free"

# The distribution to mirror:
if [ $ROOT = "debian" ]; then
   DISTS="stable testing"
   DISKS="disks-${ARCH}"
else
   DISTS="stable/non-US testing/non-US"
   DISKS=""
fi

# Should a contents file be kept up to date?
CONTENTS=yes

# Should symlinks be generated to every deb, in an "all" directory?
# I find this is very handy to ease looking up deb filenames.
SYMLINK_FARM=yes

###############################################################################
rsync () {
   if [ "$FLAGS_VERBOSE" ]; then
      echo "======================================================="
      echo "rsync $*"
   fi
   /usr/bin/rsync "$@"
}

if [ "$SOURCE" = yes ]; then
   SOURCE=source
else
   SOURCE=""
fi

mkdir -p $DEST/$ROOT
HOSTNAME=`hostname --fqdn`
LOCK="$DEST/Archive-Update-in-Progress-${HOSTNAME}"

###############################################################################

date -u > $LOCK

# Snarf the contents file.
if [ "$CONTENTS" = yes ]; then
   for DIST in $DISTS; do
      mkdir -p $DEST/$ROOT/dists/$DIST
      rsync $FLAGS_NOPOOL $FLAGS_NODO \
         $HOST/dists/$DIST/Contents-${ARCH}.gz \
         $DEST/$ROOT/dists/$DIST
   done
fi

# Download packages files (and .debs and sources too, until we move fully
# to pools).
for type in binary-all binary-${ARCH} $DISKS $SOURCE; do
   for section in $SECTIONS; do
      if [ $type = disks-${ARCH} -a $section != main ]; then continue; fi
      for DIST in $DISTS; do
         mkdir -p $DEST/$ROOT/dists/$DIST/$section/$type
         rsync $FLAGS_NOPOOL $FLAGS_NODO \
            $HOST/dists/$DIST/$section/$type \
            $DEST/$ROOT/dists/$DIST/$section/
      done
   done
done

# Update the package pool.
# TODO: probably needs to be optimized, we'll see as time goes by..
mkdir -p $DEST/$ROOT/pool
cd $DEST/$ROOT/pool || exit 1
: > .filelist

# Get a list of all the files that are in the pool based on the Packages
# files that were already updated. Thanks to aj for the awk-fu.
for DIST in $DISTS; do
   for file in `find $DEST/$ROOT/dists/$DIST -name Packages.gz | \
                  xargs -r zgrep -i ^Filename: | cut -d ' ' -f 2 | \
                  grep ^pool/` \
               `find $DEST/$ROOT/dists/$DIST -name Sources.gz | \
                  xargs -r zcat | awk '   /^Directory:/ \
                  {D=$2} /Files:/,/^$/ \
                  { \
                     if ($1 != "Files:" && $0 != "") \
                     print D "/" $3; \
                  }' | \
                  grep ^pool/`
   do
      DIRS="`dirname $file` $DIRS"
      echo $file >> .filelist
   done
done

if [ -e .filelist -a ! -s .filelist ]; then
  echo "WARNING: empty .filelist!"
fi

# Remove leading "pool" from all files in the file list.
# The "./" we change it to is there so the file names
# exactly match in the delete step and the files that get downloaded
# are not deleted.
sed 's!^pool/!./!' .filelist > .filelist.new
mv -f .filelist.new .filelist
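# mkdir -p with no operands is an error and would abort the script
# under -e, so skip it when no pool files were listed.
[ -z "$DIRS" ] || (cd .. && mkdir -p $DIRS)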

# Tell rsync to download only the files in the list. The exclude is here 
# to make the recursion not get anything else.
rsync $FLAGS_POOL $FLAGS_NODO \
   --include '*/' --include-from .filelist --exclude '*' $HOST/pool/ .

# Delete all files that are not in the list, then any empty directories.
# This also kills the filelist.
find -type f | fgrep -vxf .filelist | xargs -r rm -f
find -type d -empty | xargs -r rmdir -p --ignore-fail-on-non-empty

# End of package pool update.

# Update symlinks (I like to have a link to every .deb in one directory).
if [ "$SYMLINK_FARM" = yes ]; then
   install -d  $DEST/$ROOT/all
   cd $DEST/$ROOT/all || exit 1
   find -name \*.deb | xargs -r rm -f
   find .. -name "*.deb" -type f | grep -v ^../all | \
      xargs -r -i ln -sf {} .
fi

# Update the timestamp
mkdir -p $DEST/$ROOT/project/trace
date -u > $DEST/$ROOT/project/trace/${HOSTNAME}
rm $LOCK

