Re: Favorite anti-spam tool

To: debian-user@lists.debian.org
Cc: Christian Jaeger <christian.jaeger@ethlife.ethz.ch>, nate <debian-user@aphroland.org>
Subject: Re: Favorite anti-spam tool
From: Colin Watson <cjwatson@debian.org>
Date: Thu, 1 May 2003 13:48:59 +0100
Message-id: <20030501124859.GC23043@riva.ucam.org>
Mail-followup-to: debian-user@lists.debian.org, Christian Jaeger <christian.jaeger@ethlife.ethz.ch>, nate <debian-user@aphroland.org>
In-reply-to: <20030501012120.GA25871@merlin.sccs.swarthmore.edu>
References: <20030429211638.GA8774@mindspring.com> <20030430062310.GK32368@ursine.dyndns.org> <60364.10.10.10.7.1051686042.squirrel@webmail.linuxpowered.net> <p04320419bad507e939a8@[192.168.3.11]> <20030501012120.GA25871@merlin.sccs.swarthmore.edu>

On Wed, Apr 30, 2003 at 09:21:20PM -0400, Nori Heikkinen wrote:
> Cool, I just upgraded to 2.53, and it seems better.  Can someone
> explain to me how a Bayesian filter would work in a static context?
> Does it train itself based on the threshold you provide?  My
> impression (though this could be wrong, as I've only ever used
> spamassassin) is that on other commonly-used Bayesian spamfilters, you
> have to manually train it on x number of emails before it learns what
> you consider spam and what you don't.  How does spamassassin -- which
> is procmail-based -- train itself?

You can train it by hand using sa-learn. If you don't, it auto-learns
based on its own scores: any message scoring below
auto_learn_threshold_nonspam (default -2.0) gets auto-learned as ham,
and any message scoring above auto_learn_threshold_spam (default 15.0)
gets auto-learned as spam. You'll see 'autolearn=ham' or
'autolearn=spam' in the headers when this happens, so you can correct
it with sa-learn if need be.
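
To make the threshold rule concrete, here is a rough Python sketch of
the decision as described above. It is not SpamAssassin's actual code,
which may apply further conditions; the function name and the "no"
return value are only illustrative, and the two defaults are the ones
quoted in this message.

```python
# Simplified sketch of the auto-learn decision described above.
# Not SpamAssassin's real implementation; names are illustrative.

# Defaults as quoted in this message.
AUTO_LEARN_THRESHOLD_NONSPAM = -2.0
AUTO_LEARN_THRESHOLD_SPAM = 15.0


def autolearn_decision(score,
                       nonspam_threshold=AUTO_LEARN_THRESHOLD_NONSPAM,
                       spam_threshold=AUTO_LEARN_THRESHOLD_SPAM):
    """Return "ham", "spam", or "no" for a message's overall score."""
    if score < nonspam_threshold:
        return "ham"   # confidently not spam: learn as ham
    if score > spam_threshold:
        return "spam"  # confidently spam: learn as spam
    return "no"        # in between: leave the Bayes database alone


if __name__ == "__main__":
    for score in (-4.1, 3.0, 22.7):
        print(score, autolearn_decision(score))
```

Anything that lands in the middle band is simply not learned from, so
it only reaches the Bayes database if you feed it in by hand.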

This sounded like a dodgy approach to me, but in practice it seems to
work well. Most of the mail that SpamAssassin classifies wrongly is
somewhere between -2.0 and 15.0 and so doesn't get auto-learned either
way. And you can always turn off auto-learning if you really don't like
it.
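
If you want to check what was auto-learned so that you can correct
mistakes with sa-learn, something along these lines would pull the
autolearn tag out of a message. This is only a sketch: it assumes the
tag appears in an X-Spam-Status header as 'autolearn=<value>', and the
function name is made up for the example.

```python
import re
from email import message_from_string

# Assumption: the tag looks like "autolearn=ham" inside X-Spam-Status.
AUTOLEARN_RE = re.compile(r"\bautolearn=(\w+)")


def autolearn_tag(raw_message):
    """Return the autolearn value ("ham", "spam", ...) or None if absent."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    match = AUTOLEARN_RE.search(status)
    return match.group(1) if match else None
```

Messages tagged 'ham' that are really spam (or the reverse) are the
ones worth re-training by hand.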

-- 
Colin Watson                                  [cjwatson@flatline.org.uk]
