
Spam in mail archive



Thomas Viehmann wrote:
> If the one thing keeping us from deleting list spam (which I found out
...we don't...
> after reporting entire months of d-devel spam) is the indexing and thus
> linking, I'd happily try to come up with a patch that makes mhonarc deal
> with the gaps.

The result of a quick test that seems to be going in the right direction
is at [1]; the files taken from l.d.o are at [2].

- seems to work, at least when using "updatemail -f"
- fairly straightforward (the diff is about two dozen lines in my rusty
  Perl)
- uses a file in the same format as the spam report log to read the
  list of spam messages to skip when constructing the web indices
  (probably leaves room for efficiency improvements)
- does not change the mbox files; only the generated indices change
- no detection yet in updatemail of whether the archive should be
  updated because the spam list changed
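
To illustrate the idea in the list above, here is a minimal sketch. It is
not the patch itself (which touches mhamain.pl and is written in Perl); it
is Python, and it assumes, purely for illustration, that the spam file holds
one Message-ID per line. The real spam report log format may differ. The
names load_spam_ids and index_entries are hypothetical.

```python
import mailbox
from pathlib import Path


def load_spam_ids(spam_file):
    """Return the set of Message-IDs to skip.

    Assumption: one Message-ID per line; a missing file means "no spam".
    """
    path = Path(spam_file)
    if not path.exists():
        return set()
    return {ln.strip() for ln in path.read_text().splitlines() if ln.strip()}


def index_entries(mbox_path, spam_ids):
    """Yield (position, subject) for every message not listed as spam.

    The mbox is only read, never rewritten. Positions are the original
    ones, so the generated index simply has gaps where spam was skipped.
    """
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for pos, msg in enumerate(box):
            msg_id = (msg.get("Message-ID") or "").strip()
            if msg_id in spam_ids:
                continue  # leave a gap instead of renumbering
            yield pos, msg.get("Subject", "")
    finally:
        box.close()
```

Keeping the original positions is what lets existing links to the remaining
messages stay valid; only the index pages skip over the spam entries.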

Kind regards

T.

1. http://master.debian.org/~tviehmann/debian-devel/2007/debian-devel-200701/
2. master.d.o:~tviehmann/list-archive-spam/
  changed code in
     mhonarc/share/mhonarc/mhamain.pl
  and additional spam file
     lists/debian-devel/2007/debian-devel-200701-spam
-- 
Thomas Viehmann, http://thomas.viehmann.net/


