
Re: ..mirror script: woody deb mirror for i386, how to exclude the rest?



Monday, September 08, 2003 1:50 AM  "Arnt Karlsen" <arnt@c2i.net> wrote:


> Hi,
>
> ..in my mirror I like main, non-US, non-free and contrib for
> Woody/3.0r1.  So I try to script a mirror for i386 Woody,
> should make a nice 4.2 GB mirror, how do I exclude the rest
> of the about 80 GB?:
>
> ..the non-US is reasonable: du -sh debian* ...
> 17G     debian
> 119M    debian-non-US
>
> ...but the 17G is 4 times what it should be, so I stopped it.
> I goofed, but _where_ did I goof?  Clue whack, please.

I don't believe you can get anonftpsync to include or exclude whole
releases unless you code something that reads the Packages.gz for that
release and excludes everything not listed, either by passing a large
exclusion list to rsync or by calling rsync on multiple subsets. I'm
sure it's even more complicated than that; I haven't looked into doing
it.
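
For what it's worth, the "read the Packages.gz" step could look roughly
like the sketch below. It relies only on the fact that each stanza in a
Packages index has a "Filename:" field giving the path of the .deb
relative to the archive root. The function name, the list file name and
the MIRROR host in the usage comment are placeholders I made up, and the
rsync line assumes an rsync new enough to have --files-from. It is
untested against a real mirror and ignores everything that is not a
.deb listed in Packages (Release files, disks, Contents, ...).

```shell
#!/bin/sh
# Sketch: turn a Debian Packages index into a file list for rsync.
# Each stanza carries a "Filename:" field with the path of the .deb
# relative to the archive root.

# Read an uncompressed Packages index on stdin and print one path per line.
# (packages_to_filelist is a hypothetical helper name.)
packages_to_filelist() {
    sed -n 's/^Filename: //p'
}

# Hypothetical usage; MIRROR and the local paths are placeholders:
#   zcat dists/woody/main/binary-i386/Packages.gz \
#       | packages_to_filelist > woody-i386.list
#   rsync -av --files-from=woody-i386.list rsync://MIRROR/debian/ /srv/mirror/debian/
```

You would have to repeat this for each section (main, contrib,
non-free, non-US) and merge the lists.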

If all you are trying to do is save some bandwidth across a bunch of
systems, I suggest looking into something like apt-cacher, aptcached,
apt-proxy2, apt-www-proxy or debproxy. I like apt-cacher.

If you are trying to keep a local mirror to burn to CD-ROMs, or for
some other reason want to access the .deb files directly, and you
_only_ want Woody, then I suggest looking for a non-rsync solution, or
rsync once, discard what you don't want, and don't rsync again. Maybe
someone has put together an include/exclude list for rsync covering
just Woody 3.0r1.

I've set up the anonftpsync script myself. The initial sync took ages,
and my weekly sync takes a couple of hours. The mirror is 14 GB, and
that is just i386 and 'all' (non-arch-specific) packages, without
source. I've included my EXCLUDE rules after yours below.

> Script below:
> cat `which  anonftpsync `
[snip]
>
> # EXCLUDE is a list of parameters listing patterns that rsync will exclude.
> # The following example would exclude mostly everything:
> #EXCLUDE="\
> #  --exclude binary-alpha/ --exclude binary-arm/ --exclude binary-i386/ \
> #  --exclude binary-m68k/ --exclude binary-powerpc/ --exclude binary-sparc/ \
> #  --exclude binary-ia64/ --exclude binary-mips*/ --exclude binary-hppa/ \
> #  --exclude binary-sh/ --exclude binary-s390/ \
> #  --exclude binary-hurd-i386/ \
> #  --exclude *_alpha.deb --exclude *_arm.deb --exclude *_i386.deb \
> #  --exclude *_m68k.deb --exclude *_powerpc.deb --exclude *_sparc.deb \
> #  --exclude *_ia64.deb --exclude *_hppa.deb --exclude *_sh.deb \
> #  --exclude *_mips.deb --exclude *_mipsel.deb --exclude *_s390.deb \
> #  --exclude *_hurd-i386.deb \
> #  --exclude disks-alpha/ --exclude disks-arm/ --exclude disks-i386/ \
> #  --exclude disks-ia64/ --exclude disks-m68k/ --exclude disks-mips*/  \
> #  --exclude disks-powerpc/  --exclude disks-s390/  --exclude disks-sparc/ \
> #  --exclude stable/ --exclude testing/ --exclude unstable/ \
> #  --exclude source/ \
> #  --exclude *.orig.tar.gz --exclude *.diff.gz --exclude *.dsc \
> #  --exclude /contrib/ --exclude /non-free/ \
> # "
>
> # With a blank EXCLUDE you will mirror the entire archive.
>
> EXCLUDE="\
> --exclude binary-alpha/ --exclude binary-arm/ \
> --exclude binary-m68k/ --exclude binary-powerpc/ --exclude binary-sparc/ \
> --exclude binary-ia64/ --exclude binary-mips*/ --exclude binary-hppa/ \
> --exclude binary-sh/ --exclude binary-s390/ \
> --exclude binary-hurd-i386/ \
> --exclude *_alpha.deb --exclude *_arm.deb \
> --exclude *_m68k.deb --exclude *_powerpc.deb --exclude *_sparc.deb \
> --exclude *_ia64.deb --exclude *_hppa.deb --exclude *_sh.deb \
> --exclude *_mips.deb --exclude *_mipsel.deb --exclude *_s390.deb \
> --exclude *_hurd-i386.deb \
> --exclude disks-alpha/ --exclude disks-arm/ --exclude disks-i386/ \
> --exclude disks-ia64/ --exclude disks-m68k/ --exclude disks-mips*/  \
> --exclude disks-powerpc/  --exclude disks-s390/  --exclude disks-sparc/ \
> --exclude testing/ --exclude unstable/ \
> --exclude source/ \
> --exclude *.orig.tar.gz --exclude *.diff.gz --exclude *.dsc \
> "

# The exclusion list is sort of like iptables rules: the first matching
# include/exclude is used and the rest of the chain is skipped. I think it
# is better to include what you want and exclude everything else than it is
# to exclude a hundred things and miss 50 more, e.g. when a new arch is
# added.
EXCLUDE="
 --include Contents-i386.gz --exclude Contents-*.gz \
 --include binary-i386/ --exclude binary-*/ \
 --include disks-i386/ --exclude disks-*/ \
 --include *_i386.deb --include *_all.deb --exclude *_*.deb \
 --include *_i386.udeb --include *_all.udeb --exclude *_*.udeb \
 --include *_i386.changes --include *_all.changes --exclude *.changes \
 --exclude source/ --exclude *.tar.gz \
 --exclude *.orig.tar.gz --exclude *.diff.gz --exclude *.dsc \
"
# If there are items required for i386 missing from that list, I'm sure
# someone will say.
# The --exclude testing/ etc. rules just get rid of the disk images,
# Contents.gz and Packages* files for those releases. The bulk of the files
# are in pool/, and unless you are really keeping unstable and testing out
# of pool/, I think you may as well mirror the dists directory for them.
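
To make the "first match wins" point concrete, here is a toy shell
function that walks a rule list the same way. It is only an
illustration: it is not rsync's matcher, it uses the shell's own `case`
globbing, it only looks at bare file names, and it knows nothing about
directory rules or anchored patterns. The name first_match and the
"+PATTERN" / "-PATTERN" notation are my own inventions for this sketch.

```shell
#!/bin/sh
# Sketch: mimic "first matching include/exclude rule wins" for a bare
# file name. NOT rsync's real matcher; shell globbing only.

# usage: first_match NAME RULE...
# Each RULE is "+PATTERN" (include) or "-PATTERN" (exclude).
first_match() {
    name=$1
    shift
    for rule in "$@"; do
        pattern=${rule#?}            # strip the leading + or -
        case $name in
            $pattern)                # unquoted, so it is used as a glob
                case $rule in
                    +*) echo include ;;
                    *)  echo exclude ;;
                esac
                return 0             # first match decides; skip the rest
                ;;
        esac
    done
    echo include                     # nothing matched: the file is kept
}
```

With the .deb rules from my list, quoted so the shell does not expand
them, `first_match hello_1.0-1_i386.deb '+*_i386.deb' '+*_all.deb' '-*_*.deb'`
prints "include", and the same rules applied to hello_1.0-1_sparc.deb
print "exclude". Put '-*_*.deb' first and the i386 package is excluded
as well, which is why the order of the rules matters.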
# - jla




