About spam in the list archive
It has been claimed that the Debian list archives contain spam email.
There is a "report as spam" button on the list archive page of each
message, but presently, spam is by and large not removed from the
archives. The submissions seem to help (more or less) with finding spam
but need manual review before they can be acted upon.
I would like to put forward the following ideas and opinions towards
systematic spam removal.
Messages that are (beyond doubt) spam should be removed from the web
archives. They should remain in the mailbox archives (and thus be
accessible to developers on master.d.o).
Spam removals should be very conservative, with any doubt meaning no
removal. In systematic removals, candidates need to be checked multiple
times in order to minimize the risk of unmerited removal.
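To illustrate the "any doubt means no removal" rule, here is a minimal
sketch in Python. It is not the tool under test: the names Review,
may_remove and MIN_REVIEWS, the three verdict strings and the threshold
of three reviews are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical threshold: how many independent reviews a candidate needs.
MIN_REVIEWS = 3


@dataclass(frozen=True)
class Review:
    reviewer: str
    verdict: str  # assumed values: "spam", "ham" or "unsure"


def may_remove(reviews, min_reviews=MIN_REVIEWS):
    """Return True only if enough distinct reviewers all said "spam"."""
    # A single dissenting or unsure verdict blocks the removal.
    if any(r.verdict != "spam" for r in reviews):
        return False
    # Count reviewers, not reviews, so one person cannot decide alone.
    reviewers = {r.reviewer for r in reviews}
    return len(reviewers) >= min_reviews
```

With this rule, an empty review list, repeated verdicts from a single
reviewer, or one "unsure" among many "spam" verdicts all leave the
message in the web archive.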
Information on which messages have been flagged as junk, and how that
came about (review logs), should be accessible along with the mailbox
archives, so any developer can inspect the changes to the archive and
complain to listmaster about removals.
On the technical side, when removing messages from the list archives,
the URIs of messages must not change. Some experiments were started
in July, and more tests are currently being carried out.
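One way to keep URIs stable can be sketched as follows, in Python. This
is only an illustration under stated assumptions, not a description of
the experiments mentioned above: it assumes that page names are derived
from a message's position in the archive (msg00000.html and so on), and
the names render_archive and PLACEHOLDER are made up for this example.
The idea is to replace the body of a removed message with a placeholder
instead of dropping the page, so that no later page is renumbered.

```python
# Hypothetical text shown in place of a removed message.
PLACEHOLDER = "This message was removed from the web archive as spam."


def render_archive(messages, spam_ids):
    """Map page name -> body, keeping every page name stable.

    messages: list of (message_id, body) tuples in archive order.
    spam_ids: set of message ids that were reviewed as spam.
    """
    pages = {}
    for index, (message_id, body) in enumerate(messages):
        # The page name depends on the position only, never on spam status.
        name = f"msg{index:05d}.html"
        if message_id in spam_ids:
            body = PLACEHOLDER
        pages[name] = body
    return pages
```

Skipping the spam message in the loop instead would shift every later
page name by one, which is exactly the kind of URI change that must be
avoided.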
Comments and suggestions are very welcome. As I would like to use these
ideas as a starting point to develop and implement a spam removal
policy, I also encourage you to voice concerns that you may have.
Calls for help, first with testing the necessary tools and later with
reviewing submissions, will be put out at a later point in time. Also
note that all this is about the archive spam policy only.
P.S.: Thanks to those offering opinions on the subject and comments on a
draft of these ideas, in particular the listmaster team and Frans Pop.
The bad ideas are my own, though.
Thomas Viehmann, http://thomas.viehmann.net/