
Splitting /Lists-Archives/



Hi,

This has been a thorn in my side for a long time.  The index listing for
/Lists-Archives/ is nearly 70 kB.  From Germany it takes quite a while
to download.  Needless to say, that is a real nuisance if you only want
to look at one particular list, say debian-security-announce.

So I've just hacked together a small tool that takes the existing
index.html file that Guy's script generates and writes two things:
lists.html, a condensed index whose entries link to per-list sub-indices,
and one $foo.html per list, where $foo is the name of the list.  The
script is attached below.

If this script were run after Guy's script, we could link directly to

http://www.debian.org/Lists-Archives/debian-security-announce.html

which is only about 4 kB instead of 70 kB.

Could you please add this or at least consider a similar tool?

Regards,

	Joey

-- 
Linux - the choice of a GNU generation.

Please always Cc to me when replying to me on the lists.
#! /usr/bin/perl

#  Test environment for Joey
#  $config{'indexin'} = "/tmp/work/list/master/index.html";
#  $config{'indexout'} = "/tmp/work/list/index.html";
#  $config{'indexdir'} = "/tmp/work/list";

$config{'indexin'} = "/debian2/web/debian.org/Lists-Archives/index.html";
$config{'indexout'} = "/debian2/web/debian.org/Lists-Archives/lists.html";
$config{'indexdir'} = "/debian2/web/debian.org/Lists-Archives";

$header = $footer = '';

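# Split the master index into $header (everything up to and including the
# <H1> line) and $footer (the "Template Last Modified" line to end of file).
sub get_head_foot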
{
    my $text;

    if (open (F, $config{'indexin'})) {
	while (<F>) {
	    $text .= $_;
	    if (!$header && /^<H1>Mailing List Archives<\/H1>$/) {
		$header = $text;
		$text = '';
	    } elsif (!$footer && /^<P><SMALL>Template Last Modified/) {
		$text = $_;
	    }
	}
	close (F);
	$footer = $text;
    }
}

# Return a copy of the header whose <TITLE> and <H1> name the given list.
sub modify_header
{
    my $list = shift;
    my $header = shift;

    $header =~ s!<TITLE>Debian GNU/Linux: Mailing List Archives</TITLE>!<TITLE>Debian GNU/Linux: Mailing List "$list"</TITLE>!;
    $header =~ s!<H1>Mailing List Archives</H1>!<H1>Mailing List "$list"</H1>!;

    return $header;
}

# Write the condensed index (lists.html) and one $fname.html page per list.
sub process_index
{
    my $text;
    my $fname;
    my $line;

    open (F, $config{'indexin'}) || die "Can't open $config{'indexin'}";
    open (O, ">$config{'indexout'}") || die "Can't open $config{'indexout'}";
    open (L, ">/dev/null") || die "Can't open /dev/null";

    while (<F>) {
	# <li><strong><a name="debian-announce">debian-announce</a></strong>: Important announcements<ul>
	if (/^<li><strong><a name=\"([^\"]*)\"/) {
	    $line = $_;
	    $fname = $1;
	    $text = $line;
	    $text =~ s!a name!a href!;
	    $text =~ s!\">!.html\">!;
#	    $text =~ s/<((li)|(ul))>//;
#	    $text =~ s/<li>//;
	    $text =~ s/<ul>/<br>/;
	    print O $text;
	    print L $footer;
	    close (L);
	    open (L, ">$config{'indexdir'}/$fname.html") || die "Can't open $config{'indexdir'}/$fname.html";
	    print L modify_header ($fname, $header);
	    $text = $line;
	    $text =~ s!<li><strong>.*</strong>: !!;
	    $text =~ s!<ul>!</h3><ul>!;
	    print L "<h3>" . $text;
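	    # Copy this list's entries up to the closing </ul></li> line.
	    # <F> must be assigned to $_ explicitly here: the implicit
	    # assignment only happens when <F> is the whole while condition.
	    while (defined ($_ = <F>) && !/^<\/ul><\/li>$/) {
		print L;
	    }
	    # Close the <ul> opened above; the terminating line is not copied.
	    print L "</ul>\n";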
	} else {
	    # Everything else (section headings, surrounding text) is copied
	    # to lists.html unchanged.
	    print O $_;
	}
    }
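    # Finish the last per-list page, which the loop above leaves open.
    print L $footer;
    close (L);
    close (O);
    close (F);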
}

get_head_foot();

process_index();
