Re: Old BTS snapshot



Richard Braakman wrote:
> I put it up for HTTP at http://people.debian.org/~dark/Bugs.tgz
> 
> The scripts might be tricky, since they have to convert back from
> web page output to the original bugreports and mails.

So I have a script that generates .report and .status files that seem to
be valid. I still need to make .log files, which will be .. interesting.
Crazy, crazy format.

I also need to make it look at the existing BTS and skip bugs that are
_still_ open after all these years, and make it mark any remaining bugs
from the old snapshot as closed, with a note saying something like "This
bug was fixed sometime between 1997 and 2000; data since 1997 was lost."
Then, after some more checking, they should be ready to drop into archive/
in the BTS.
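
A rough sketch of that decision, not something the script below does yet. It assumes the numbers of the bugs still open in the live BTS have already been collected into a hash somehow; the names classify_bug and %still_open are made up for illustration, and where the note itself ends up being recorded is left open.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decide what to do with one bug from the old snapshot.
#   $num        - bug number from the snapshot
#   $still_open - hash ref keyed by bug numbers still open in the live BTS
#                 (assumed to be gathered elsewhere)
#   $done       - "done" value recovered from the snapshot, '' if none
# Returns 'skip', 'keep' or 'close'.
sub classify_bug {
	my ($num, $still_open, $done) = @_;
	return 'skip' if $still_open->{$num};           # live BTS still tracks it
	return 'keep' if defined $done && length $done; # already closed in 1997
	return 'close';                                 # needs the "data lost" note
}

# Example with invented bug numbers.
my %still_open = map { $_ => 1 } (1234, 5678);
foreach my $bug ([1234, ''], [2222, 'someone@example.org'], [3333, '']) {
	my ($num, $done) = @$bug;
	print "$num: ", classify_bug($num, \%still_open, $done), "\n";
}
```

Keeping the decision separate from the file writing would make it easy to check the three cases by hand before anything is dropped into archive/.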

-- 
see shy jo, who is not presenting the below as an example of good code


#!/usr/bin/perl
use Date::Parse;
use HTML::Entities;

my $archivedir=shift;
mkdir $archivedir;

foreach my $file (@ARGV) {
	my %data=();
	open IN, "<$file" or die "$file: $!";
	local $/=undef; # slurp mode: read the whole file in one go
	$_=<IN>;
	close IN;
	
	my @lines=split(/\n/, $_);

	my ($num)=$file=~m/(\d+)\.html/;
	my ($dir)=$num=~m/(\d\d)$/;
	$dir="$archivedir/$dir";
	mkdir $dir;
	print "$file is $num, save to $dir\n";
	
	# TODO: do more formatting to the log.
	open OUT, ">$dir/$num.log" or die "$dir/$num.log: $!";
	print OUT "\n$_\03";
	close OUT;
	
	if (m!<strong>Done:</strong> (.*?);\s!s) {
		$data{done}=decode_entities($1);
	}
	else {
		$data{done}='';
	}
	if (/Maintainer for ([^ ]*) is/) {
		$data{package}=decode_entities($1);
	}
	else {
		# unknown package
		if (m!</h1>Reported by:!) {
			$data{package}='unknown';
		}
	}
	
	if (/Reported by: (.*?);\s/s) {
		$data{originator}=decode_entities($1);
	}
	if (m!<strong>Forwarded</strong> to (.*?);\s!) {
		$data{forwarded}=decode_entities($1);
	}
	else {
		$data{forwarded}='';
	}
	if (/merged with (.*?);/s) {
		my $raw=$1;
		my @nums=();
		foreach my $i (split(/\n/, $raw)) {
			if ($i =~ m/>#(\d+)</) { # match this line, not the whole file in $_
				push @nums, $1;
			}
		}
		$data{mergedwith}=join(' ', @nums);
	}
	else {
		$data{mergedwith}='';
	}
	if (/dated (.*?);\s/s) {
		$data{date}=str2time(decode_entities($1));
	}
	else {
		# Hard way.
		foreach my $line (reverse @lines) {
			if ($line =~ /Date:\s*(.*)/i) {
				$data{date}=str2time(decode_entities($1));
				last;
			}
			last if $line =~ /<h2>Message received/;
		}
	}
	if ($lines[3] =~ m!(.*?)</h1>Package:!) {
		$data{subject}=decode_entities($1);
	}
	else {
		# Hard way.
		foreach my $line (reverse @lines) {
			if ($line =~ /Subject:\s*(.*)/i) {
				$data{subject}=decode_entities($1);
				last;
			}
			last if $line =~ /<h2>Message received/;
		}
	}
	foreach my $line (reverse @lines) {
		if ($line =~ /Message-Id:\s*(.*)/i) {
			$data{msgid}=decode_entities($1);
			last;
		}
		last if $line =~ /<h2>Message received/;
	}
	$data{keywords}=''; # don't think we had this back then.
	$data{severity}='normal'; # or this

	open (OUT, ">$dir/$num.status") or die "$dir/$num.status: $!";
	foreach my $field (qw{originator date subject msgid package keywords
			      done forwarded mergedwith severity}) {
		if (! exists $data{$field}) {
			print "!$file $field was not found\n";
		}
		print OUT "$data{$field}\n";
		print "\t$field = $data{$field}\n";
	}
	close OUT;
	
	my @report=();
	my $take=0;
	foreach my $line (reverse @lines) {
		if ($line eq '</pre>') {
			$take=1;
		}
		elsif ($line eq '<pre>') {
			last;
		}
		elsif ($take) {
			unshift @report, $line;
		}
	}
	if (! @report) {
		print "!$file report not found!\n";
	}
	open (OUT, ">$dir/$num.report") or die "$dir/$num.report: $!";
	print OUT join("\n", @report);
	close OUT;
}
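
For the "some more checking" step, here is a minimal sketch of reading a generated .status file back in. It assumes only what the script above writes: ten fields, one per line, in the order of the qw{} list. The parse_status name and the sample values are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Field order copied from the loop that writes the .status file above.
my @FIELDS = qw(originator date subject msgid package keywords
                done forwarded mergedwith severity);

# Turn the text of one .status file into a hash ref,
# or return undef if the number of lines is wrong.
sub parse_status {
	my ($text) = @_;
	my @values = split /\n/, $text, -1;   # -1 keeps empty fields
	pop @values if @values && $values[-1] eq '';  # drop piece after final newline
	return undef unless @values == @FIELDS;
	my %data;
	@data{@FIELDS} = @values;
	return \%data;
}

# Example with made-up values; empty fields are written as empty lines.
my $sample = join("\n",
	'Some User <user@example.org>', '857000000', 'foo segfaults',
	'<abc@example.org>', 'foo', '', '', '', '', 'normal') . "\n";
my $status = parse_status($sample);
print "package=$status->{package} severity=$status->{severity}\n";
```

A file with a missing or extra line comes back as undef, which would catch a multi-line value leaking into the fixed layout.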


