[OFFTOPIC] how the Debian lists are archived

On Tue, Sep 09, 2003 at 08:06:27PM +0200, Josip Rodin wrote:
> On Mon, Sep 08, 2003 at 04:14:19AM -0500, Branden Robinson wrote:
> > [ UH OH.  I just noticed that the de-spamification of the mailing list
> > archives has caused some URLs from Google to point to the wrong
> > messages.  IMO this is deeply unfortunate.  :(
> Actually, usually it recrawls everything within a month (one regular update).

Ah, that's good to hear.

> One fine day, when we get our own proper search engine, there won't be such
> issues...

On IRC I proposed that mail archiving software use something better than
monotonically increasing integers to index the stored messages.  An MD5
checksum of the message body plus certain headers would be far superior.

(You could then store the messages in some arbitrary pool, and the list
archives would be made up of links into that pool, very similar to the
way the Debian archives actually work.  Crossposted messages would only
be stored once, for example.)
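
A rough sketch of what I mean, in Python. This is only an illustration,
not how any existing archiver works: the header set in KEY_HEADERS, the
message_key() helper, and the MessagePool class are all names made up
for this example, and which headers should count as "certain headers"
is an open question.

```python
import hashlib
from email import message_from_string

# Assumption: the proposal does not say which headers to hash;
# these three are an illustrative choice only.
KEY_HEADERS = ("Message-ID", "From", "Date")


def message_key(raw: str) -> str:
    """Return the MD5 hex digest of the chosen headers plus the body."""
    raw = raw.replace("\r\n", "\n")
    msg = message_from_string(raw)
    digest = hashlib.md5()
    for name in KEY_HEADERS:
        value = msg.get(name, "")
        digest.update(f"{name}:{value}\n".encode("utf-8"))
    # Everything after the first blank line is treated as the body.
    body = raw.partition("\n\n")[2]
    digest.update(body.encode("utf-8"))
    return digest.hexdigest()


class MessagePool:
    """Hypothetical pool: each message is stored once, lists hold links."""

    def __init__(self):
        self.pool = {}   # digest -> raw message
        self.lists = {}  # list name -> ordered digests (the "links")

    def archive(self, list_name: str, raw: str) -> str:
        key = message_key(raw)
        # A crossposted copy hashes to the same key, so it is not stored twice.
        self.pool.setdefault(key, raw)
        links = self.lists.setdefault(list_name, [])
        if key not in links:
            links.append(key)
        return key
```

Because the key depends only on the message itself, deleting a spam
message from the pool leaves every other message's identifier (and any
URL built from it) unchanged.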

> Please note also that this is a listarchives@d.o issue, not listmaster.

Noted.  Thanks!

G. Branden Robinson
Debian GNU/Linux                   |    belly laugh.
branden@debian.org                 |    -- Robert Heinlein
http://people.debian.org/~branden/ |
