Re: Spam in the lists out of control

On Mon, May 10, 2004 at 02:27:20PM +0200, Marek Habersack wrote:
> Most of my spam comes from the debian lists, so I would say it is similar
> enough to the traffic down here.

You have to deal with emails in different languages as well.

> > For what it's worth, empirical evidence indicates that SpamAssassin's
> > Bayesian database is difficult to poison, since it's difficult for
> > spammers to pick words that are learned as non-spammy (since everyone
> > has their own set of non-spammy words). But, since lists.debian.org
> > doesn't use bayes, this point is moot.
> I don't understand why SpamAssassin is thought to be the only option. SA is
> a CPU/memory hog, it can easily kill even a fairly powerful machine and
> there _are_ alternatives to it. One thing to use could be dspam, as I
> pointed at in the other post, another (which also uses language
> classification and is already packaged for debian) would be crm114 and
> then there is a whole host of bayesian filter programs that are written
> in a language suited for heavy-duty tasks (C, that is :>). Both dspam and
> crm114 boast over 99% accuracy in spotting spam, now that would be really
> neat if we had that level of protection around here.

We're already close to 99% accuracy. We want more.

If you want to figure it out exactly, head over to


and start counting the spams. 1400 emails didn't make it on the list
this month.

Personally, I'd rather have some spam make it onto the list than block
any valid emails. Still, I believe we can do better than we're
currently doing.



