on Thu, May 06, 2004 at 07:42:14AM +0800, Katipo (katipo@weavers-web.org) wrote:
> Karsten M. Self wrote:
>
> >The lists *are* filtered.  It's just that some crap gets through.
> >
> >Peace.
>
> I believe Pascal mentioned at one stage that only 1-2% gets through.

For an idea of where percentages get you...

Peter G. Neumann (PGN) is the long-time moderator of the comp.risks
newsgroup and mailing list.  This was established in the mid-1980s and
continues through today.

After rising tides of spam, he finally implemented SpamAssassin and some
related checks a couple of years ago.  This now filters out all but a
percent or two of spam.

The remaining mail to the list submission address is *still* on the
order of 90% spam.

There's the issue of:

  - How much spam do you filter.
  - What percentage of total mail is spam.

In the case of comp.risks, the issue is that the relative volume of spam
to signal swamps the list.  PGN has since instituted other methods of
dealing with the problem (largely requesting subject-line tags to
indicate good content).

Given his list's characteristics, I suspect he could get a lot of
mileage from:

  - IP-based rejection of known spam sources.  This can be highly
    effective, eliminating 50%+ of spam.  My own results with ASN-based
    classification are mentioned at my homepage (see sig).

  - Whitelisting of known contributors (many items are submitted by a
    relatively small core group).

  - More aggressive handling of spam scores.  Treating anything 5+ as
    spam, and bucketing ranges beneath that to likely or unlikely spam
    would give a smaller set of piles to go through.

Since PGN manually moderates the list, not all of these methods are
applicable to D-U.

Peace.

-- 
Karsten M. Self <kmself@ix.netcom.com>        http://kmself.home.netcom.com/
 What Part of "Gestalt" don't you understand?
    Support the EFF, they support you:  http://www.eff.org/
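
P.S.  A rough sketch of the arithmetic behind the two questions above
(how much spam is filtered vs. what share of the remaining mail is
spam), plus the score bucketing idea.  The function names are made up
for illustration, the 2.0 "likely" cutoff is an arbitrary assumption
(only the 5+ cutoff is mentioned above), and the calculation assumes the
filter passes all legitimate mail.  None of this is PGN's actual setup.

```python
def residual_spam_fraction(spam, ham, pass_rate):
    """Share of delivered mail that is spam, given a filter that lets
    `pass_rate` of spam through and (by assumption) all of the ham."""
    leaked = spam * pass_rate
    return leaked / (leaked + ham)


def implied_spam_to_ham_ratio(residual_fraction, pass_rate):
    """Incoming spam:ham ratio implied by an observed residual share.

    Solves residual = p*S / (p*S + H) for S/H.
    """
    return residual_fraction / ((1.0 - residual_fraction) * pass_rate)


def bucket(score, spam_cutoff=5.0, likely_cutoff=2.0):
    """Sort a spam score into one of three piles.

    spam_cutoff follows the "5+ is spam" suggestion; likely_cutoff is a
    hypothetical value chosen only for this example.
    """
    if score >= spam_cutoff:
        return "spam"
    if score >= likely_cutoff:
        return "likely spam"
    return "unlikely spam"


if __name__ == "__main__":
    # A filter passing 1-2% of spam, with ~90% of what remains being spam
    for pass_rate in (0.01, 0.02):
        ratio = implied_spam_to_ham_ratio(0.90, pass_rate)
        print(f"pass rate {pass_rate:.0%}: about {ratio:.0f} spam per real message")
    for score in (7.3, 3.1, 0.4):
        print(score, "->", bucket(score))
```

Under those assumptions, 90% residual spam with a 1-2% pass rate works
out to several hundred spam messages arriving per legitimate submission,
which is why a filter with a good catch rate can still leave a pile that
is mostly spam.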