Re: Debian mailing lists archives as mbox (was: Re: [Soc-coordination] Debian Teams Activity Metrics - Report IV) [Update]
Andreas Tille wrote on Thursday, 04 August 2011:
> On Thu, Aug 04, 2011 at 09:44:49AM +0200, Alexander Wirt wrote:
> > We had an ongoing discussion about privacy, spam and so on
> > concerning the mboxes. We even managed to reach consensus yesterday.
> To shed some light on this I would like to publish the consensus we reached:
> A filter needs to be written (most probably this will be done
> by Sukhbir who should test this on any mbox because he is not
> allowed to access original mboxes). The filter should have the
> following features:
> - Parse the existing mboxes and strip them down to the following
>   headers:
>     Message-id: <ID>
>     From: Name of poster <firstname.lastname@example.org>
>     Date: Date
>     Subject: Subject
> - Remove the messages whose Message-IDs are marked for removal (just
>   detected SPAM)
> - Publish these mboxes (it was not yet specified by listmaster
> whether for general http download or only for specific users)
Just for the record: the mboxes are not meant to be published. We are currently
working on improving data privacy protection in the archive, so simply
publishing the mboxes would be counterproductive.
OK, after thinking about it, you can include In-Reply-To and References. I
don't see why X-Spam would be useful for you.
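
For illustration only, here is a minimal sketch of what such a filter could
look like, assuming Python and its standard mailbox module. The names
filter_mbox, strip_message, KEEP_HEADERS and remove_ids are made up for this
example and are not part of any existing listmaster tooling. It keeps the four
headers from the proposal plus In-Reply-To and References, drops the bodies,
and skips messages whose Message-ID is in a caller-supplied set.

```python
import mailbox
from email.message import Message

# Headers to keep: the four from the proposal plus the two allowed in the
# reply. Everything else (including X-Spam-* and the body) is dropped.
KEEP_HEADERS = ("Message-ID", "From", "Date", "Subject",
                "In-Reply-To", "References")


def strip_message(msg):
    """Return a body-less copy of msg carrying only KEEP_HEADERS."""
    stripped = Message()
    for name in KEEP_HEADERS:
        # Header lookup is case-insensitive, so "Message-id" matches too.
        for value in msg.get_all(name, []):
            stripped[name] = value
    return stripped


def filter_mbox(src_path, dst_path, remove_ids=frozenset()):
    """Write a stripped copy of src_path to dst_path; return messages kept.

    remove_ids is a hypothetical set of Message-ID strings (angle brackets
    included) for messages to leave out, e.g. detected spam.
    """
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)  # created if it does not exist yet
    kept = 0
    try:
        for msg in src:
            msg_id = (msg.get("Message-ID") or "").strip()
            if msg_id in remove_ids:
                continue
            dst.add(strip_message(msg))
            kept += 1
    finally:
        dst.close()  # close() flushes pending writes
        src.close()
    return kept
```

Since the original mboxes are off limits, this could be tried on any test
mbox, for example filter_mbox("test.mbox", "test-stripped.mbox",
{"<spam-id@example.org>"}), where both file names and the ID are placeholders.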