On 2004-05-05, William Ballard penned:
On Tue, May 04, 2004 at 11:49:20PM -0700, William Ballard wrote:
filter the spam? and do Bayesian filtering myself on mail I get from
the lists? I think I read that so far only Bayesian filtering works
and
Speaking of Bayesian. Lately I've been getting a lot of spam (to me
directly, not to the list) with a long-winded joke in the body. The
email has a spammy subject and a few bizarre keywords in it, but mostly
it's this long-winded joke, with well-formed grammar, proper spelling,
etc. It's even plain text; no HTML involved. I've even caught myself
reading a few of them, then feeling rather dirty.
Anyway, I dutifully pipe them through sa-learn, but I worry. If these
spams look so much like regular mail, won't I just end up tainting my
Bayesian database by teaching sa-learn with them? I mean, eventually,
won't my Bayesian scheme be unable to distinguish between spam and ham?
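
To make the worry concrete, here is a rough Python sketch of a toy
per-token filter. It is emphatically not SpamAssassin's actual algorithm:
the class name ToyFilter, the smoothing constants, the naive product
combining rule, and all of the sample messages are assumptions made up
for illustration. It only shows what happens to a ham message's score
when a joke-style spam that shares words with it is learned as spam.

```python
import math


class ToyFilter:
    """Toy per-token filter for illustration; NOT SpamAssassin's algorithm."""

    def __init__(self):
        self.msgs = {"spam": 0, "ham": 0}    # messages learned per class
        self.seen = {"spam": {}, "ham": {}}  # token -> messages containing it

    def learn(self, label, text):
        self.msgs[label] += 1
        # Count each token once per message.
        for tok in set(text.lower().split()):
            self.seen[label][tok] = self.seen[label].get(tok, 0) + 1

    def token_score(self, tok):
        s = self.seen["spam"].get(tok, 0)
        h = self.seen["ham"].get(tok, 0)
        n = s + h
        if n == 0:
            return 0.5  # never-seen token: neutral
        spam_rate = s / max(self.msgs["spam"], 1)
        ham_rate = h / max(self.msgs["ham"], 1)
        p = spam_rate / (spam_rate + ham_rate)
        # Pull rarely seen tokens toward 0.5 (assumed prior 0.5, strength 1).
        return (0.5 + n * p) / (1 + n)

    def score(self, text):
        fs = [self.token_score(t) for t in set(text.lower().split())]
        spammy = math.prod(fs)
        hammy = math.prod(1 - f for f in fs)
        return spammy / (spammy + hammy)


f = ToyFilter()
f.learn("ham", "did you hear the joke about the priest and the rabbi")
f.learn("ham", "lunch tomorrow at the usual place")
f.learn("spam", "cheap pills viagra discount")

ham_msg = "the joke about the priest was funny"
before = f.score(ham_msg)

# A joke-style spam: a couple of spammy keywords plus ordinary prose.
f.learn("spam", "viagra discount a priest and a rabbi walk into a bar")

after = f.score(ham_msg)
spam_after = f.score("viagra discount walk into a bar")
print(before, after, spam_after)
```

In this made-up data the ham message's score does rise once the joke
spam is learned, because "priest" is no longer a ham-only word, but it
stays well under 0.5: the tokens seen only in ham still pull it back,
and the spammy keywords still push the spam message above 0.5. Whether a
real database behaves this way depends on the actual mail and on the
filter's real combining method, so treat this as a sketch of the
mechanism and not as an answer.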