
Bayesian filtering (was: Re: Massive increase of spam on debian-*@l.d.o)



On 2004-05-05, William Ballard penned:
> On Tue, May 04, 2004 at 11:49:20PM -0700, William Ballard wrote:
>> filter the spam?  and do Bayesian filtering myself on mail I get from
>> the lists?  I think I read that so far only Bayesian filtering works
>> and 

Speaking of Bayesian filtering: lately I've been getting a lot of spam (to me
directly, not to the list) with a long-winded joke in the body.  The
email has a spammy subject and a few bizarre keywords in it, but mostly
it's this long-winded joke, with well-formed grammar, proper spelling,
etc.  It's even plain text; no HTML involved.  I've even caught myself
reading a few of them, then feeling rather dirty.

Anyway, I dutifully pipe them through sa-learn, but I worry.  If these
spams look so much like regular mail, won't I just end up tainting my
Bayesian database by teaching sa-learn with them?  I mean, eventually,
won't my Bayesian filter be unable to distinguish between spam and ham?
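
To make the worry concrete, here is a minimal sketch of a toy per-token
Bayesian score.  It is only an illustration, not how sa-learn or
SpamAssassin actually compute anything: the TinyBayes class, its
token_prob method, the smoothing constants, and the sample messages
are all made up.  It suggests that words common to both ham and the
joke spam drift toward a neutral score, while the odd spammy keywords
stay distinctive.

```python
from collections import Counter


def tokens(text):
    # Hypothetical tokenizer: lowercase, whitespace-split, one count per message.
    return set(text.lower().split())


class TinyBayes:
    """Toy per-token spam scorer (an assumption, not SpamAssassin's algorithm)."""

    def __init__(self):
        self.spam = Counter()  # number of spam messages containing each token
        self.ham = Counter()   # number of ham messages containing each token
        self.n_spam = 0
        self.n_ham = 0

    def learn(self, text, is_spam):
        if is_spam:
            self.spam.update(tokens(text))
            self.n_spam += 1
        else:
            self.ham.update(tokens(text))
            self.n_ham += 1

    def token_prob(self, tok):
        # Smoothed fraction of messages in each class that contain the token,
        # combined into a 0..1 "spamminess" (0.5 means the token says nothing).
        p_spam = (self.spam[tok] + 1) / (self.n_spam + 2)
        p_ham = (self.ham[tok] + 1) / (self.n_ham + 2)
        return p_spam / (p_spam + p_ham)


f = TinyBayes()
for msg in ("the meeting is tomorrow", "the joke was funny", "see the patch"):
    f.learn(msg, is_spam=False)
for msg in ("viagra the joke was funny", "viagra the priest walks in"):
    f.learn(msg, is_spam=True)

for word in ("the", "joke", "viagra"):
    print(word, round(f.token_prob(word), 2))
```

In this toy model, ordinary words that also appear in ham end up close
to 0.5, so they contribute little either way; the question is whether
enough distinctive tokens remain in each spam to tip the score.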

Thoughts?

-- 
monique


