
Re: Bayesian filtering (was: Re: Massive increase of spam on debian-*@l.d.o)



On Wed, May 05, 2004 at 01:12:51PM -0000, Monique Y. Mudama wrote:
> Anyway, I dutifully pipe them through sa-learn, but I worry.  If these
> spams look so much like regular mail, won't I just end up tainting my
> Bayesian library by teaching sa-learn with them?  I mean, eventually,
> won't my Bayesian scheme be unable to distinguish between spam and ham?

If a set of words appears about equally often in both spam and ham, the
filter will learn that those words are not relevant, and will instead
weight the words that actually help it tell the two apart.
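
To make that concrete, here is a rough sketch of the idea in Python. It
is not how SpamAssassin, bogofilter or CRM114 actually compute their
scores; the function names, the add-one smoothing and the toy messages
are all assumptions made up for illustration. It only shows that a word
seen equally often in both classes ends up with a score of 0.5, i.e. no
evidence either way.

```python
from collections import Counter


def token_spamminess(spam_msgs, ham_msgs):
    """Return a per-token score in (0, 1); 0.5 means 'no evidence'.

    Hypothetical helper: counts in how many messages of each class a
    token appears, with add-one smoothing (an assumption of this sketch).
    """
    spam_counts = Counter(t for m in spam_msgs for t in set(m.lower().split()))
    ham_counts = Counter(t for m in ham_msgs for t in set(m.lower().split()))
    n_spam, n_ham = len(spam_msgs), len(ham_msgs)

    scores = {}
    for tok in set(spam_counts) | set(ham_counts):
        p_spam = (spam_counts[tok] + 1) / (n_spam + 2)  # P(token | spam)
        p_ham = (ham_counts[tok] + 1) / (n_ham + 2)     # P(token | ham)
        scores[tok] = p_spam / (p_spam + p_ham)
    return scores


def most_informative(scores, k):
    """Return the k tokens whose score is farthest from the neutral 0.5."""
    return sorted(scores, key=lambda t: abs(scores[t] - 0.5), reverse=True)[:k]


# Toy training data (made up): "debian" and "list" occur in every message.
spam = ["debian list viagra", "debian list pills"]
ham = ["debian list kernel", "debian list patch"]

scores = token_spamminess(spam, ham)
top = most_informative(scores, 4)
```

With this toy data, "debian" and "list" score exactly 0.5 and drop out
of the most informative tokens, while "viagra" and "pills" lean towards
spam and "kernel" and "patch" towards ham.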

By the way, I used bogofilter and SA (SpamAssassin) before, and have now
switched to CRM114, which learns amazingly fast.

J.


