Re: Attempts to poison bayesian systems

To: debian-security@lists.debian.org
Subject: Re: Attempts to poison bayesian systems
From: "Noah L. Meyerhans" <noahm@debian.org>
Date: Tue, 23 Dec 2003 12:00:43 -0500
Message-id: <20031223170042.GJ1357@morgul.net>
Mail-followup-to: "Noah L. Meyerhans" <noahm@debian.org>, debian-security@lists.debian.org
In-reply-to: <20031223133620.GB9089@vnl.com>
References: <20031223132530.GA9089@vnl.com> <87wu8nu0bc.fsf@killer.killeri.net> <20031223133620.GB9089@vnl.com>

On Tue, Dec 23, 2003 at 01:36:20PM +0000, Dale Amon wrote:
> > I have yet to see a false positive caused by this even though I get
> > quite a lot of this stuff and routinely mark it as spam.
> 
> I can't think of any other reason for someone to do it
> though. There has to be a point. Someone is going to a 
> lot of trouble.

Could they be using all these non-spam words to generate false
negatives, and so slip past Bayesian filters?  I've seen lots of these
messages get through SpamAssassin in the past week or so, all with very
low Bayes scores.  Training the Bayesian classifier on these messages
is not going to do me much good, because the next spam will carry a
completely different set of tokens.
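
To make the false-negative idea concrete, here is a minimal sketch of a
naive Bayes style combiner.  It is not SpamAssassin's actual algorithm;
the token table, the probabilities in it, and the 0.4 default for
unknown tokens are all made up for illustration.

```python
import math

# Hypothetical per-token spam probabilities, as a trained filter might
# hold them.  Every number here is invented for the example.
TOKEN_SPAM_PROB = {
    "viagra": 0.99,
    "mortgage": 0.95,
    "click": 0.90,
    "kernel": 0.05,
    "debian": 0.03,
    "patch": 0.05,
    "compile": 0.04,
}
UNKNOWN_PROB = 0.4  # assumed score for tokens never seen in training


def spam_score(tokens):
    """Combine per-token probabilities, working in log space."""
    log_spam = 0.0
    log_ham = 0.0
    for tok in set(tokens):
        p = TOKEN_SPAM_PROB.get(tok, UNKNOWN_PROB)
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)
    # P(spam) = prod(p) / (prod(p) + prod(1 - p))
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


plain_spam = ["viagra", "mortgage", "click"]
# The same spam, padded with words that look like ordinary list traffic.
padded_spam = plain_spam + ["kernel", "debian", "patch", "compile"]

print(spam_score(plain_spam))
print(spam_score(padded_spam))
```

With these invented numbers the unpadded message scores close to 1 and
the padded one drops well below 0.1, which is the kind of very low
score described above.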

This method is especially effective when the Bayesian classifier only
looks at the first MIME part, because the second is then free to
contain whatever spam tokens they want to put in it.  IIRC, this is how
most Bayesian filters behave.
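
A rough sketch of the difference, using Python's standard email
package.  The two tokenizer functions and the sample message are
hypothetical, and this makes no claim about how any particular filter
really walks a message; it only shows what a first-part-only tokenizer
would miss.

```python
from email.message import EmailMessage


def text_parts(msg):
    """Return the decoded body of every text/* part, in message order."""
    return [
        part.get_content()
        for part in msg.walk()
        if part.get_content_maintype() == "text"
    ]


def tokens_first_part_only(msg):
    """Tokenize only the first text part (the behaviour being exploited)."""
    parts = text_parts(msg)
    return set(parts[0].lower().split()) if parts else set()


def tokens_all_parts(msg):
    """Tokenize every text part of the message."""
    toks = set()
    for body in text_parts(msg):
        toks.update(body.lower().split())
    return toks


# Made-up two-part message: harmless words first, the payload second.
msg = EmailMessage()
msg["Subject"] = "hello"
msg.set_content("kernel patch debian compile")
msg.add_alternative("viagra mortgage click", subtype="html")

print(sorted(tokens_first_part_only(msg)))
print(sorted(tokens_all_parts(msg)))
```

The first tokenizer never sees the spam tokens in the second part, so
a classifier fed from it would score the message on the padding alone.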

noah



