Re: Attempts to poison bayesian systems

To: <debian-isp@lists.debian.org>
Cc: <debian-security@lists.debian.org>
Subject: Re: Attempts to poison bayesian systems
From: "Jason Lim" <maillist@jasonlim.com>
Date: Tue, 23 Dec 2003 22:22:25 +0800
Message-id: <55d001c3c960$306eff90$de00a8c0@tw1>
Reply-to: "Jason Lim" <maillist@jasonlim.com>
References: <20031223132530.GA9089@vnl.com> <200312240052.45792.russell@coker.com.au>

> One technique that's being used a lot is to get books in electronic
> form and put a couple of sentences in every spam (sentences from a
> book will pass grammatical checking etc, unlike the example you posted
> above).  Also text from a book will have the right ratio of words, you
> will almost never find such a long "sentence" in an email message
> without a punctuation character, "and", "or", or other common words
> except in the case of source code (which is another category in
> bayesian filters).

That won't work very well against SpamAssassin, since it doesn't rely on
bayesian filtering alone; it also uses header checks and DNSBL checks. So
you are correct: these "random legitimate" sentences do lower the bayesian
score, but they don't get the spam through completely unless you are using
something like popfilter that has only bayesian filtering. Also note that
the spam can't consist only of these sentences. It must still carry the
"catch line", like "increase pen1s size" or something similar, and the
bayesian filter will, over time, learn that all the other words matter
less than "pen1s" and similar tokens. So eventually the filter will catch
these messages anyway... at least that's my understanding of it. Feel
free to improve or correct the above.
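
To make the argument above concrete, here is a rough Python sketch of a
toy token-based bayesian scorer combined with two other checks. It is not
how SpamAssassin is actually implemented: the tiny corpus, the add-one
smoothing, the point values and the threshold are all made up for
illustration. It only shows that book-text padding lowers the bayesian
probability without hiding the catch words, and that hits from header and
DNSBL checks can still push the total over the threshold.

```python
import math

def train(messages):
    """Count how many messages of one class contain each token."""
    counts = {}
    for text in messages:
        for tok in set(text.lower().split()):
            counts[tok] = counts.get(tok, 0) + 1
    return counts

def bayes_score(text, spam_counts, ham_counts, n_spam, n_ham):
    """Return a toy P(spam) by summing per-token log-odds."""
    log_odds = 0.0
    for tok in set(text.lower().split()):
        # Add-one smoothing is an assumption; it keeps p away from 0 and 1.
        s = (spam_counts.get(tok, 0) + 1) / (n_spam + 2)
        h = (ham_counts.get(tok, 0) + 1) / (n_ham + 2)
        p = s / (s + h)
        log_odds += math.log(p) - math.log(1 - p)
    return 1 / (1 + math.exp(-log_odds))

def total_score(bayes_p, header_hit, dnsbl_hit):
    """Add up points from all checks (hypothetical values, not real rule scores)."""
    points = 3.0 * bayes_p
    if header_hit:
        points += 1.5
    if dnsbl_hit:
        points += 2.0
    return points

THRESHOLD = 5.0  # hypothetical cut-off for tagging a message as spam

# Toy training data: the filter has already seen one padded spam.
spam_corpus = [
    "increase pen1s size now",
    "pen1s size pills cheap",
    "increase pen1s size it was the best of times",
]
ham_corpus = [
    "it was the best of times",
    "the meeting is at noon",
    "best of luck with the patch",
]
spam_counts = train(spam_corpus)
ham_counts = train(ham_corpus)

plain = "increase pen1s size"
padded = "increase pen1s size it was the best of times"

plain_p = bayes_score(plain, spam_counts, ham_counts, 3, 3)
padded_p = bayes_score(padded, spam_counts, ham_counts, 3, 3)

print("plain  P(spam):", round(plain_p, 3))
print("padded P(spam):", round(padded_p, 3))
print("padded total with header + DNSBL hits:",
      round(total_score(padded_p, True, True), 2))
```

In this toy setup the padding words pull the probability down, but the
catch words still dominate once the filter has been trained on them, and
the other checks add points regardless of what the body text says.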

References:
- Attempts to poison bayesian systems
  - From: Dale Amon <amon@vnl.com>
- Re: Attempts to poison bayesian systems
  - From: Russell Coker <russell@coker.com.au>
