
Re: Promoting your website with bulk-email



* Marcelo E. Magallon (mmagallo@debian.org) [20021015 07:25]:

> My problems with bogofilter come from the fact that most of the
> mail I get is written in English, with a small percentage (less
> than 5%, I'd dare say, in Spanish and German) and most of the
> SPAM I get nowadays is written in English, German and *gasp*
> Chinese (or Korean, or whatever).  And some in Spanish, too.

I have a similar experience, and additionally the bogofilter data
files (BerkeleyDB) are just *huge*.  I trained it on a corpus of
25000 spam messages and 20000 non-spam messages, then gave it a
test run on 200 previously unseen messages.  The success rate was
63%, far too low, and the databases exceeded 10M.  SpamAssassin
had a 99.5% success rate, but it is ridiculously slow compared to
Bayesian filters.
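
For anyone who wants to repeat this kind of test run, here is a
minimal sketch of the procedure: train on labelled mail, then score
the filter on messages it has not seen.  ToyBayes is a hypothetical
stand-in written for illustration, not bogofilter's algorithm or
its database format, and the tokenizer and add-one smoothing are
assumptions as well.  Only success_rate() mirrors the measurement
described above.

```python
import math
import re
from collections import Counter

# Assumed tokenizer: runs of lowercase letters, digits, $, ' and -.
TOKEN = re.compile(r"[a-z0-9$'-]+")


def tokens(text):
    """Lowercase the message and return its set of word-like tokens."""
    return set(TOKEN.findall(text.lower()))


class ToyBayes:
    """Hypothetical naive Bayes filter; NOT bogofilter's algorithm."""

    def __init__(self):
        # Per label: in how many training messages each token appeared.
        self.counts = {"spam": Counter(), "ham": Counter()}
        # Per label: how many training messages were seen.
        self.totals = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(tokens(text))
        self.totals[label] += 1

    def classify(self, text):
        n = sum(self.totals.values())
        scores = {}
        for label in ("spam", "ham"):
            # Log prior plus log likelihoods, with add-one smoothing.
            score = math.log((self.totals[label] + 1) / (n + 2))
            for tok in tokens(text):
                seen = self.counts[label][tok]
                score += math.log((seen + 1) / (self.totals[label] + 2))
            scores[label] = score
        return max(scores, key=scores.get)


def success_rate(clf, labelled):
    """Fraction of (text, label) pairs the filter labels correctly."""
    correct = sum(1 for text, label in labelled
                  if clf.classify(text) == label)
    return correct / len(labelled)
```

With 200 test messages, a 63% success rate corresponds to 126
correctly classified messages.  A single success figure also hides
whether the errors were missed spam or misfiled legitimate mail, so
counting the two separately would be more informative.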

Peter

-- 
    .+'''+.         .+'''+.         .+'''+.         .+'''+.         .+''
 Kelemen Péter     /       \       /       \       /    fuji@debian.org
.+'         `+...+'         `+...+'         `+...+'         `+...+'


