lists.d.o Spam (was: Marking BTS spam)

Hello! You (Nico Golde) wrote:

>At the moment the spam-report.pl script uses:
><input type="hidden" name="listname" value="debian-devel" />
><input type="hidden" name="msg" value="msg00065.html" />
><input type="hidden" name="date" value="2005/09" />

>To identify the message, this wouldn't work with a MUA so the idea came
>to my mind was to identify the Mail with the message-ID.
>Paskal Hakim asked what happens if someone fakes the message-ID in the old thread
>about this topic. Well this could happen, so someone has another idea?

I took the logfile from the reporting script and checked some
'nominated' files.

As it turns out the top scorer for this year is:


Across all logged data, two non-spam postings on d-spanish-users were
also among the top 20 nominated.

So it does not look like a good idea to use that data automatically;
it can serve as a pointer to where to look, but no more.
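
To illustrate using the log only as a pointer, here is a minimal sketch
that tallies nominations and lists the most-nominated postings for a
human to check. It assumes each report has already been parsed into a
(listname, date, msg) tuple, matching the three form fields quoted
above; the real logfile format is not shown here, and the name
top_nominated is made up for this example.

```python
from collections import Counter


def top_nominated(reports, n=20):
    """Return the n most-nominated postings with their nomination counts.

    `reports` is an iterable of (listname, date, msg) tuples. This parsed
    form is an assumption, not the actual log format of the script.
    """
    counts = Counter(reports)
    # The result is only a list of candidates for manual review;
    # nothing here decides that a posting is spam.
    return counts.most_common(n)
```

The output is meant to be read by a person, not fed straight into a
removal job.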

We have two uses for that reporting.

1. Removing identified spam from the Archive.
2. Enhancing the filters.

On 2: we would need the whole spam mail, including all headers, so we can
	find characteristics to filter on. So I am now taking all nominated
	postings and trying to find patterns with some black

The idea of having a reporting address is good. I will announce one
in the next few days, and then let's see what happens.

However: there will be some really black .procmail magic that pulls
out everything that doesn't look appropriate (multiple nominations of a
posting, incomplete headers and so on). More info when it is in place.
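
Until the procmail rules exist, the two checks mentioned above could be
sketched roughly like this in Python. This is not the planned procmail
setup, only an illustration: the set of required headers, the function
name accept_report, and the use of the Message-ID to spot repeated
nominations are all assumptions.

```python
from email import message_from_string

# Assumed minimum set of headers; the real rules may require others.
REQUIRED_HEADERS = ("Message-ID", "From", "Date", "Subject")


def accept_report(raw_mail, seen_ids):
    """Return True if a reported mail should be kept for manual review.

    `raw_mail` is the full text of the reported message and `seen_ids`
    is a set of Message-IDs that were already nominated.
    """
    msg = message_from_string(raw_mail)
    # Drop reports with incomplete headers (missing or empty).
    if any(not msg.get(header) for header in REQUIRED_HEADERS):
        return False
    msg_id = msg["Message-ID"].strip()
    # Drop repeated nominations of the same posting.
    if msg_id in seen_ids:
        return False
    seen_ids.add(msg_id)
    return True
```

Since a Message-ID can be faked, as noted in the quoted mail, this only
removes obvious noise; it does not prove that a report is genuine.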


