
lists.d.o Spam (was: Marking BTS spam)



Hello! You (Nico Golde) wrote:

>At the moment the spam-report.pl script uses:
><input type="hidden" name="listname" value="debian-devel" />
><input type="hidden" name="msg" value="msg00065.html" />
><input type="hidden" name="date" value="2005/09" />

>To identify the message, this wouldn't work with a MUA so the idea came
>to my mind was to identify the Mail with the message-ID.
>Paskal Hakim asked what happens if someone fakes the message-ID in the old thread
>about this topic. Well this could happen, so someone has another idea?

I took the log file from the reporting script and checked some of the
'nominated' files.

As it turns out the top scorer for this year is:

http://lists.debian.org/debian-project/2006/01/msg00035.html

Across all logged data, two non-spam postings on d-spanish-users were also
among the top 20 nominations.

So automatically acting on that data does not look like a good idea;
it can serve as a pointer for where to look, but no more.
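
For illustration, a ranking like the one above could be produced roughly as
follows. This is only a sketch: the log format (one report per line, holding
the listname, date and msg fields of the report form, separated by whitespace)
and the function name top_nominated are assumptions, not what the reporting
script actually writes.

```python
from collections import Counter


def top_nominated(log_lines, n=20):
    """Return the n most frequently reported archive URLs with their counts.

    Assumes each log line looks like: "<listname> <YYYY/MM> <msgNNNNN.html>".
    Lines that do not have exactly three fields are skipped.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # not a report line in the assumed format
        listname, date, msg = fields
        counts[f"http://lists.debian.org/{listname}/{date}/{msg}"] += 1
    return counts.most_common(n)
```

The output is only a list of candidates for a human to look at, since
non-spam postings can end up high in the ranking too.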


We have two uses for that reporting.

1. Removing identified spam from the Archive.
2. Enhancing the filters.

On 2: we would need the whole spam mail, including all headers, so we can
	find characteristics to filter on. For now I am taking all nominated
	postings and trying to find patterns with some black
	shell magic.
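
The kind of pattern hunting meant here could look roughly like the sketch
below: tally the values of one header across a set of complete spam mails and
see which values keep coming back. The function name common_header_values and
the choice of header to inspect are assumptions for illustration only.

```python
from collections import Counter
from email import message_from_string


def common_header_values(raw_mails, header, n=5):
    """Count how often each value of `header` occurs across raw mails.

    raw_mails: iterable of complete messages (headers plus body) as strings.
    Returns the n most common (value, count) pairs.
    """
    counts = Counter()
    for raw in raw_mails:
        msg = message_from_string(raw)
        # A header may occur several times in one mail; count each value.
        for value in msg.get_all(header, []):
            counts[value.strip()] += 1
    return counts.most_common(n)
```

A value that shows up in many reported mails is only a candidate for a filter
rule; it still has to be checked against legitimate list traffic.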

The idea of having a reporting address is good; I will announce one
in the next few days, and then let's see what happens.

However, there will be some really black .procmail magic that pulls
out everything that doesn't look appropriate (multiple nominations of a
posting, incomplete headers and so on). More info when it is in place.
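
As a rough idea of the two checks named above, here is a sketch in Python
rather than procmail. It assumes reports arrive as complete raw mails, that a
repeated Message-ID means a repeated nomination, and that the headers listed
in REQUIRED_HEADERS are the ones that must be present; none of this is the
actual setup.

```python
from email import message_from_string

# Assumption: a report is only useful if these headers survived forwarding.
REQUIRED_HEADERS = ("Message-ID", "From", "Date", "Subject")


def filter_reports(raw_mails):
    """Drop reports with missing headers or an already seen Message-ID."""
    seen_ids = set()
    accepted = []
    for raw in raw_mails:
        msg = message_from_string(raw)
        if any(msg.get(h) is None for h in REQUIRED_HEADERS):
            continue  # incomplete headers
        msg_id = msg["Message-ID"].strip()
        if msg_id in seen_ids:
            continue  # multiple nomination of the same posting
        seen_ids.add(msg_id)
        accepted.append(msg)
    return accepted
```

Since a Message-ID can be faked, a check like this can only weed out obvious
junk; it does not prove that a report refers to a real list posting.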

Cord
-- 
http://lists.debian.org
