Re: Attempts to poison bayesian systems
Noah L. Meyerhans wrote:
This method is especially effective in the case where the bayesian
classifier only looks at the first MIME attachment, because the second
is then free to contain whatever spam tokens they want to put in it.
IIRC, this is how most bayesian filters behave.
I got such an email today. The first MIME attachment contained nothing
but nonsense (random characters that looked like words, like: jkjkjasd
hjhewwoah asfjhw shwl asjdkfjo, ajjdfkjdf owerkjadf.)
In the second MIME attachment, there was the spam message. Interestingly,
SpamAssassin did not mark the mail as spam. Another trick they could use
is to make mails larger than 64 kByte, as the standard setup of SA
does not look at mails bigger than this (well, no, it's the standard
setup of amavisd-new for SA...).
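
To make the two tricks concrete, here is a rough Python sketch using only the
standard library's email package. It is not how SpamAssassin or any particular
Bayesian filter works internally; the function names, the sample message and
the whitespace tokenizer are made up for illustration, and the 64 kByte limit
is simply the figure mentioned above. It collects tokens from every text part
instead of just the first one, and flags mails that a size cutoff would skip.

```python
from email import message_from_string

# Assumed cutoff, taken from the 64 kByte figure mentioned in this thread.
SIZE_LIMIT = 64 * 1024

# Hypothetical two-part mail: nonsense first, the actual spam second.
SAMPLE = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XX"\n'
    "\n"
    "--XX\n"
    "Content-Type: text/plain\n"
    "\n"
    "jkjkjasd hjhewwoah\n"
    "--XX\n"
    "Content-Type: text/plain\n"
    "\n"
    "Buy Pills Now\n"
    "--XX--\n"
)


def tokens_per_part(raw):
    """Return one list of lowercase tokens for each text/* part of the mail."""
    msg = message_from_string(raw)
    parts = []
    for part in msg.walk():
        # Skip multipart containers and non-text attachments.
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "latin-1"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset name: fall back to a byte-preserving decoding.
            text = payload.decode("latin-1", errors="replace")
        # Naive whitespace tokenizer; real filters are more elaborate.
        parts.append(text.lower().split())
    return parts


def exceeds_scan_limit(raw, limit=SIZE_LIMIT):
    """True if the raw mail is larger than the assumed scan cutoff."""
    return len(raw.encode("utf-8", errors="replace")) > limit
```

A filter that only fed the first list from tokens_per_part(SAMPLE) to its
classifier would see nothing but the nonsense words; feeding it all the lists
at least exposes the tokens of the second part.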
Although the mail did not look correct in Mozilla, I guess it would have
looked fine in OE. I fear that spammers are going a step further and trying
to trick systems like SpamAssassin. If they do this in a clever way,
the whole spam thing will become even more cumbersome...