Re: Archive Restructuring - Package Pool
How to fool a mirror client into selectively flattening/dropping parts of
the archive, without having to hack the client or server: use existing
configuration option on the client to read the directory listing from
a file, then feed it a munged file.
Below is a prototype script to do that munging. I suspect it'll go very
slowly on large archives; someone who knows perl better than I could
probably tack some go-faster stripes on, parameterise it, and generally
make a better go of it than I have. If you can, please feel free.
Rob Browning:
[about multiple ls-lR files kept on the server]
>I don't think this is the right approach. It limits the retrieving
>sides flexibility too much, and it doesn't scale *at all*. We have to
>maintain a combinatorial number of files to cover all the possible
>sets of keywords that someone could specify.
>
> non-us
> non-us ^ violent
> non-us ^ french-only
> french-only
> french-only ^ violent
> etc...
Yes, I see what you mean. However, mirror can also take its ls-lR list
from a local file (e.g. -klocal_ls_lR_file=ls-lR.french-only+noviolent).
We could generate and distribute just a few of the most commonly used
configurations, and distribute a script (perhaps based on the one below)
to help archive maintainers generate tailored ls-lR files for themselves.
Cheers,
--
Charles Briscoe-Smith
White pages entry, with PGP key: <URL:http://alethea.ukc.ac.uk/wp?95cpb4>
PGP public keyprint: 74 68 AB 2E 1C 60 22 94 B8 21 2D 01 DE 66 13 E2
#! /usr/bin/perl
#
# Usage: ./munge-ls-lR ls-lR >ls-lR-munged
# or ls -lRA | ./munge-ls-lR >ls-lR-munged
# Configure these definitions as necessary:
$flattenre="slink";
$dropre="hamm|package-pool|scripts";
# First, any symlink whose name matches $flattenre is flattened. Next,
# any symlink whose destination, after being regularised and cleaned up,
# matches $dropre, is flattened. Lastly, any normal file, symlink or
# directory whose name matches $dropre is discarded.
#
# This script doesn't touch the filesystem -- it only twiddles the
# contents of the ls-lR index file to fool the FTP mirror client into
# doing our evil will.
$curdir="";
@dirs=("");
while (<>) {
chomp;
next if m/^$/;
next if m/^total \d+$/;
if (m/^([^ ]*):$/) {
$curdir="$1/";
push @dirs, $1;
next;
}
unless (m/^(.*\d) (.*)$/) {
die "Unrecognised line: `$_'\n"
}
undef $symdest;
$name=$2;
$info=$1;
if ($name =~ m/(.*) \-\> (.*)/) {
$name=$1;
$symdest=$2;
}
if (defined $info{"$curdir$name"}) {
die "Duplicate file\n"
}
$info{"$curdir$name"}=$info;
$symdest{"$curdir$name"}=$symdest
if (defined $symdest);
}
foreach $name (keys %info) {
if (defined $symdest{$name}) {
$_=$name;
s=/([^/]*)$=/$symdest{$name}=;
while (s=/\.?/=/=)
{ }
while (s=(^|/)([^./]|[^./][^/]|[^/][^./]|[^/][^/][^/]+)/\.\./=$1=)
{ }
$symdest=$_;
}
if ($name =~ m/$flattenre/o || $symdest =~ m/$dropre/o) {
if (defined $symdest{$name}) {
if (not defined $info{$symdest}) {
warn "Not flattening dangling symlink $name\n";
} else {
warn "Flattening symlink $name -> $symdest\n";
$info{$name}=$info{$symdest};
undef $symdest{$name};
}
}
}
}
foreach $name (keys %info) {
$_=$name;
if (m/$dropre/o) {
warn "Dropping $name\n";
delete $info{$name};
delete $symdest{$name};
}
}
foreach $dir (@dirs) {
next if $dir =~ m/$dropre/o;
if ($dir) {
print "\n$dir:\n";
} else {
foreach $name (sort keys %info) {
if ($name =~ m=^[^/]+$=) {
print "$info{$name} $name";
if (defined $symdest{$name}) {
print " -> $symdest{$name}";
}
print "\n";
}
}
}
# This is the slow bit; we should keep the entries grouped by
# directory right from the start, so we don't have to do this
# O(N^2) (or worse) stuff here... Can Perl spot and optimise
# the loop invariant "(sort keys %info)" here?
foreach $name (sort keys %info) {
if ($name =~ m=^$dir/([^/]+)$=) {
print "$info{$name} $1";
if (defined $symdest{$name}) {
print " -> $symdest{$name}";
}
print "\n";
}
}
}
--
To UNSUBSCRIBE, email to debian-devel-request@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Reply to: