Re: [RFR] templates://crm114/{crm114.templates}
Milan Zamazal wrote:
> Thank you, Christian, for your suggestions. I agree with your proposed
> changes with the following exceptions:
>
>>>>>> "CP" == Christian Perrier <bubulle@debian.org> writes:
>
> CP> +Description: versatile filtering system for email and other
> CP> data
>
> crm114 is not a filtering system, it's a classifying system.
Fair enough; so "versatile classifying system for e-mail and other
data", or possibly "versatile classifier for e-mail and other data"?
> CP> - Accuracy of the SBPH/BCR classifier has been seen in excess of 99 per cent,
> CP> - for 1/4 megabyte of learning text. In other words, CRM114 learns, and it
> CP> - learns fast.
>
> CP> The last sentences are a little bit too close to "advertisement" as
> CP> discouraged by the DevRef. Neutral language is the key, here.
>
> I agree the wording could be improved. But the information that crm114
> is accurate and that it learns fast is true and it is important for the
> user (this is the most important reason I use crm114 and not another
> classifier after all), so it shouldn't be removed.
Surely nobody would set out to choose an inaccurate, slow-learning
spam filter, but popcon tells me crm114 has only a few dozen active
users, compared to thousands using bogofilter. I see papers online
comparing the various algorithms used; there's even a page written
in 2002 by a bogofilter developer trying out a version that adopts
the CRM114 algorithm... but this didn't go anywhere. Why? Does
CRM114 have disadvantages, such as being harder to set up, slower
at processing individual messages, or more resource-heavy? Or is it
unjustly overlooked?
--
JBR with qualifications in linguistics, experience as a Debian
sysadmin, and probably no clue about this particular package