
Splitting /Lists-Archives/


This has been a thorn in my side for a long time.  The index listing for
/Lists-Archives/ is nearly 70kB in size.  From Germany it takes quite a
while to download.  Needless to say, that sucks if you only want
to look at one particular list, say debian-security-announce.

Thus I've just hacked together a small tool that takes the existing
index.html file that Guy's script generates and writes lists.html, a
modified index whose entries link to per-list sub indices named
$foo.html, where $foo is the name of the list.  I'll attach the script.
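
To illustrate what happens to a single entry, here is a minimal sketch of
the rewrite step on its own.  The helper name rewrite_entry is hypothetical
and does not appear in the attached script; the sample line is the one
quoted in the script's comment, and the three substitutions are the same
ones the script applies.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the attached script): turn one list
# entry of index.html into the shorter entry written to lists.html.
# Returns the list name and the rewritten line, or an empty list if the
# line is not a list entry.
sub rewrite_entry {
    my ($line) = @_;
    return unless $line =~ /^<li><strong><a name="([^"]*)"/;
    my $fname = $1;
    $line =~ s!a name!a href!;    # the anchor becomes a link ...
    $line =~ s!">!.html">!;       # ... pointing at $fname.html
    $line =~ s/<ul>/<br>/;        # the nested list moves to the sub index
    return ($fname, $line);
}

my $sample = qq{<li><strong><a name="debian-announce">debian-announce</a></strong>: Important announcements<ul>\n};
my ($fname, $entry) = rewrite_entry($sample);
print "$fname.html\n$entry";
# prints:
# debian-announce.html
# <li><strong><a href="debian-announce.html">debian-announce</a></strong>: Important announcements<br>
```

The per-list entries that used to follow the <ul> are what end up in
debian-announce.html, between the shared header and footer.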

If this script were called after Guy's script, we'd benefit from
lists.html as the link, which is only about 4kB long instead of 70kB.

Could you please add this or at least consider a similar tool?



Linux - the choice of a GNU generation.

Please always Cc to me when replying to me on the lists.
#! /usr/bin/perl

#  Test environment for Joey
#  $config{'indexin'} = "/tmp/work/list/master/index.html";
#  $config{'indexout'} = "/tmp/work/list/index.html";
#  $config{'indexdir'} = "/tmp/work/list";

$config{'indexin'} = "/debian2/web/debian.org/Lists-Archives/index.html";
$config{'indexout'} = "/debian2/web/debian.org/Lists-Archives/lists.html";
$config{'indexdir'} = "/debian2/web/debian.org/Lists-Archives";

$header = $footer = '';

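# Read the master index once: everything up to and including the <H1>
# line becomes $header, everything from the "Template Last Modified"
# line to the end of the file becomes $footer.
sub get_head_foot
{
    my $text;

    if (open (F, $config{'indexin'})) {
        while (<F>) {
            $text .= $_;
            if (!$header && /^<H1>Mailing List Archives<\/H1>$/) {
                $header = $text;
                $text = '';
            } elsif (!$footer && /^<P><SMALL>Template Last Modified/) {
                $text = $_;
            }
        }
        close (F);
        $footer = $text;
    }
}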

# Return a copy of the shared header with title and heading naming $list.
sub modify_header
{
    my $list = shift;
    my $header = shift;

    $header =~ s!<TITLE>Debian GNU/Linux: Mailing List Archives</TITLE>!<TITLE>Debian GNU/Linux: Mailing List "$list"</TITLE>!;
    $header =~ s!<H1>Mailing List Archives</H1>!<H1>Mailing List "$list"</H1>!;

    return $header;
}

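# Write the short top-level index (O) and one sub index per list (L).
sub process_index
{
    my $text;
    my $fname;
    my $line;

    open (F, $config{'indexin'}) || die "Can't open $config{'indexin'}: $!";
    open (O, ">$config{'indexout'}") || die "Can't open $config{'indexout'}: $!";
    # L starts out as /dev/null so the first "print L $footer" is harmless
    open (L, ">/dev/null") || die "Can't open /dev/null: $!";

    while (<F>) {
        # <li><strong><a name="debian-announce">debian-announce</a></strong>: Important announcements<ul>
        if (/^<li><strong><a name=\"([^\"]*)\"/) {
            $line = $_;
            $fname = $1;
            $text = $line;
            $text =~ s!a name!a href!;
            $text =~ s!\">!.html\">!;
#           $text =~ s/<((li)|(ul))>//;
#           $text =~ s/<li>//;
            $text =~ s/<ul>/<br>/;
            print O $text;
            # Finish the previous sub index before starting the next one
            print L $footer;
            close (L);
            open (L, ">$config{'indexdir'}/$fname.html") || die "Can't open $config{'indexdir'}/$fname.html: $!";
            print L modify_header ($fname, $header);
            $text = $line;
            $text =~ s!<li><strong>.*</strong>: !!;
            $text =~ s!<ul>!</h3><ul>!;
            print L "<h3>" . $text;
            # Copy this list's entries.  The read has to be assigned to $_
            # explicitly here, and the loop has to stop at end of file.
            while (defined ($_ = <F>) && !/^<\/ul><\/li>$/) {
                print L;
            }
            # The terminating </ul></li> line is not copied, so close the list
            print L "</ul>\n";
        } elsif (/^<h3>.*<\/h3>$/) {
            print O $_;
        } else {
            print O $_;
        }
    }
    # Finish the last sub index
    print L $footer;
    close (L);
    close (O);
    close (F);
}

# Main program.  The invocation is missing from this copy of the script;
# calling the two subs in this order is an assumption.
get_head_foot ();
process_index ();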


