Re: Spamassassin
On Wednesday 08 October 2003 08:50, Stephan Seitz wrote:
> Is there a download possibility to get so much spam mails? Since I
> delete my spam, I don't have enough mails to train spamassassin.
There is, but I won't tell you where, because it wouldn't do you any
good! :-)
For the Bayesian filter to be accurate, it has to be trained on your
spam and your ham. True, everybody's spam is quite similar, but there
are differences, and someone else's corpus would quite likely skew the
statistics so much that the result is of little use.
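To make the skew concrete, here is a toy sketch in Python. It uses a
simplified Graham-style per-token estimate, which is an assumption for
illustration and not the exact computation SpamAssassin performs. The
token and all counts are made up.

```python
def token_spam_probability(spam_hits, ham_hits, n_spam, n_ham):
    """Simplified per-token estimate of P(spam | token).

    spam_hits / ham_hits: number of spam / ham messages containing the token.
    n_spam / n_ham: total number of spam / ham messages trained on.
    """
    spam_freq = spam_hits / n_spam
    ham_freq = ham_hits / n_ham
    if spam_freq + ham_freq == 0:
        return 0.5  # token never seen: no evidence either way
    return spam_freq / (spam_freq + ham_freq)


# Hypothetical counts for one token, e.g. "mortgage".
# Trained on a downloaded corpus: the token never shows up in ham.
downloaded = token_spam_probability(300, 0, 1000, 1000)

# Trained on your own mail: the same token also appears in legitimate
# mail (say, from your bank), so it is much weaker evidence of spam.
own = token_spam_probability(300, 100, 1000, 1000)

print("downloaded corpus:", downloaded)
print("own corpus:       ", round(own, 2))
```

With the downloaded corpus the token looks like certain spam, while
your own mail shows it is only a moderate indicator. That difference is
what produces false positives.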
Another point to emphasize is that it is equally important to train it
with your own ham as it is to train it with your own spam, and in
roughly equal numbers.
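A quick way to check the balance before training is to count the
messages in each folder. This is a minimal sketch that assumes the
folders are mbox files; the 25% tolerance and the folder names in the
comment are arbitrary choices of mine, not anything SpamAssassin
prescribes.

```python
import mailbox


def count_messages(mbox_path):
    """Return the number of messages in an existing mbox file."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return len(box)
    finally:
        box.close()


def is_roughly_balanced(n_spam, n_ham, tolerance=0.25):
    """True if the smaller corpus is within `tolerance` of the larger one."""
    larger = max(n_spam, n_ham)
    if larger == 0:
        return False  # nothing to train on yet
    return min(n_spam, n_ham) / larger >= 1.0 - tolerance


# Example use with hypothetical folder names:
#   n_spam = count_messages("Mail/spam")
#   n_ham = count_messages("Mail/saved")
#   print(is_roughly_balanced(n_spam, n_ham))
```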
So, I would strongly recommend you just manually save the spam to a
folder for some time now, and build your spam database from there.
Basically, what I did to train it was to take about 2000 old spams I
had gathered a long time ago, when I was actively whacking spammers,
and then add 1000 recent spams from my old account (which took me just
a week to gather! :-( ). Then, on top of that, I fed it 250 spams from
my new account. Nowadays, I feed it mainly false negatives.
The problem with this approach is that the character of spam has
changed substantially since I gathered those 2000, but it apparently
works quite well for me anyway.
Finally, I have a few spamtraps (hehe, spambots, do your worst:
href="mailto:aa0u@kjernsmo.net"), which I intend to use to train it
automatically, once I've got my Exim4 server configured right!
For ham, I took all the legitimate mail from the saved folders at my
old account and fed it to the learner as ham. Then, I took a lot of
recent list mail and fed it that too. Nowadays I regularly supply it
with most of what lands in my inbox, that is, mail that is directed to
me personally.
All this has made the Bayesian filter very accurate, but as you can
tell, it has taken a lot of effort.
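Once the folders exist, the feeding itself is easy to script. The
following is a rough sketch, not my actual setup: it assumes the
folders are mbox files and that sa-learn is on the PATH, and the folder
names are hypothetical. It defaults to a dry run that only prints the
commands.

```python
import subprocess


def sa_learn_command(kind, mbox_path):
    """Build the sa-learn argument list for one mbox folder."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, "--mbox", mbox_path]


def train(folders, dry_run=True):
    """Run (or, in a dry run, just print) sa-learn for each (kind, path)."""
    commands = [sa_learn_command(kind, path) for kind, path in folders]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical folder names; adjust to your own mail layout.
    train([
        ("spam", "Mail/spam"),
        ("ham", "Mail/saved"),
    ])
```

Pass dry_run=False to actually run the commands after checking that the
printed ones look right.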
Cheers,
Kjetil
--
Kjetil Kjernsmo
Astrophysicist/IT Consultant/Skeptic/Ski-orienteer/Orienteer/Mountaineer
kjetil@kjernsmo.net webmaster@skepsis.no editor@learn-orienteering.org
Homepage: http://www.kjetil.kjernsmo.net/ OpenPGP KeyID: 6A6A0BBC