[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: Archive Restructuring - Package Pool



How to fool a mirror client into selectively flattening/dropping parts of
the archive, without having to hack the client or server: use existing
configuration option on the client to read the directory listing from
a file, then feed it a munged file.

Below is a prototype script to do that munging.  I suspect it'll go very
slowly on large archives; someone who knows perl better than I could
probably tack some go-faster stripes on, parameterise it, and generally
make a better go of it than I have.  If you can, please feel free.

Rob Browning:
[about multiple ls-lR files kept on the server]
>I don't think this is the right approach.  It limits the retrieving
>sides flexibility too much, and it doesn't scale *at all*.  We have to
>maintain a combinatorial number of files to cover all the possible
>sets of keywords that someone could specify.
>
>  non-us
>  non-us ^ violent
>  non-us ^ french-only
>  french-only
>  french-only ^ violent
>  etc...

Yes, I see what you mean.  However, mirror can also take its ls-lR list
from a local file (e.g. -klocal_ls_lR_file=ls-lR.french-only+noviolent).
We could generate and distribute just a few of the most commonly used
configurations, and distribute a script (perhaps based on the one below)
to help archive maintainers generate tailored ls-lR files for themselves.

Cheers,

-- 
Charles Briscoe-Smith
White pages entry, with PGP key: <URL:http://alethea.ukc.ac.uk/wp?95cpb4>
PGP public keyprint: 74 68 AB 2E 1C 60 22 94  B8 21 2D 01 DE 66 13 E2



#! /usr/bin/perl
#
# Usage: ./munge-ls-lR ls-lR >ls-lR-munged
# or     ls -lRA | ./munge-ls-lR >ls-lR-munged

# Configure these definitions as necessary:

$flattenre="slink";
$dropre="hamm|package-pool|scripts";

# First, any symlink whose name matches $flattenre is flattened.  Next,
# any symlink whose destination, after being regularised and cleaned up,
# matches $dropre, is flattened.  Lastly, any normal file, symlink or
# directory whose name matches $dropre is discarded.
#
# This script doesn't touch the filesystem -- it only twiddles the
# contents of the ls-lR index file to fool the FTP mirror client into
# doing our evil will.

$curdir="";
@dirs=("");

while (<>) {
	chomp;
	next if m/^$/;
	next if m/^total \d+$/;
	if (m/^([^ ]*):$/) {
		$curdir="$1/";
		push @dirs, $1;
		next;
	}
	unless (m/^(.*\d) (.*)$/) {
		die "Unrecognised line: `$_'\n"
	}
	undef $symdest;
	$name=$2;
	$info=$1;
	if ($name =~ m/(.*) \-\> (.*)/) {
		$name=$1;
		$symdest=$2;
	}
	if (defined $info{"$curdir$name"}) {
		die "Duplicate file\n"
	}
	$info{"$curdir$name"}=$info;
	$symdest{"$curdir$name"}=$symdest
		if (defined $symdest);
}

foreach $name (keys %info) {
	if (defined $symdest{$name}) {
		$_=$name;
		s=/([^/]*)$=/$symdest{$name}=;
		while (s=/\.?/=/=)
			{ }
		while (s=(^|/)([^./]|[^./][^/]|[^/][^./]|[^/][^/][^/]+)/\.\./=$1=)
			{ }
		$symdest=$_;
	}
	if ($name =~ m/$flattenre/o || $symdest =~ m/$dropre/o) {
		if (defined $symdest{$name}) {
			if (not defined $info{$symdest}) {
				warn "Not flattening dangling symlink $name\n";
			} else {
				warn "Flattening symlink $name -> $symdest\n";
				$info{$name}=$info{$symdest};
				undef $symdest{$name};
			}
		}
	}
}

foreach $name (keys %info) {
	$_=$name;
	if (m/$dropre/o) {
		warn "Dropping $name\n";
		delete $info{$name};
		delete $symdest{$name};
	}
}

foreach $dir (@dirs) {
	next if $dir =~ m/$dropre/o;
	if ($dir) {
		print "\n$dir:\n";
	} else {
		foreach $name (sort keys %info) {
			if ($name =~ m=^[^/]+$=) {
				print "$info{$name} $name";
				if (defined $symdest{$name}) {
					print " -> $symdest{$name}";
				}
				print "\n";
			}
		}
	}

	# This is the slow bit; we should keep the entries grouped by
	# directory right from the start, so we don't have to do this
	# O(N^2) (or worse) stuff here...  Can Perl spot and optimise
	# the loop invariant "(sort keys %info)" here?
	foreach $name (sort keys %info) {
		if ($name =~ m=^$dir/([^/]+)$=) {
			print "$info{$name} $1";
			if (defined $symdest{$name}) {
				print " -> $symdest{$name}";
			}
			print "\n";
		}
	}
}


--  
To UNSUBSCRIBE, email to debian-devel-request@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org


Reply to: