
Re: Every spam is sacred

On Mon, Jun 16, 2003 at 12:20:20AM +1000, Martijn van Oosterhout wrote:
> On Sun, Jun 15, 2003 at 03:58:47PM +0200, Santiago Vila wrote:
> > For every potential false positive that you might miss because of
> > Debian eventually using the DSBL, I have to receive, download and
> > handle 1999 spam messages, which means 10MB of spam (the average
> > spam message is a little more than 5K).

> How many emails does master handle in one day? Probably several thousand.
> That adds up to several false positives per day. Also, your estimate of 50%
> spam is too high IMHO. If you take a 70/30 split, you get one false positive
> per 900 emails. Finally, as pointed out earlier, only 40% of measured spam
> actually came from one of the listed servers. So of that 10MB you're going
> to be downloading 6MB of it anyway.

Your math is off.  The statistics given were that, out of mail *received
from IPs in the DSBL*, 99.95% were spam.  If master handles 6000 emails
a day, 30% of that is spam, and only 40% of the spam comes from open
relays (which seems to be what you're saying above), then about 720
spam messages a day arrive from listed IPs.  At one false positive per
1999 such messages, that gives you roughly one false positive every
three days (1 false positive on a total mail volume of about 16,700
messages), from a user who will get a bounce telling them to use a
different mail server, and it still blocks 40% of the spam.
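
For anyone who wants to check the arithmetic with their own numbers,
here is a rough sketch in Python.  The function name and the dictionary
keys are made up for illustration; the inputs are just the figures
quoted in this thread (6000 mails a day, 30% spam, 40% of spam from
listed IPs, 99.95% of mail from listed IPs being spam), not measurements
from master.

```python
def dsbl_estimate(mails_per_day, spam_fraction, listed_fraction, listed_spam_rate):
    """Estimate the daily effect of rejecting mail from DSBL-listed IPs.

    All four inputs are assumptions supplied by the caller.
    """
    # Spam that arrives from listed IPs and would therefore be rejected.
    spam_blocked = mails_per_day * spam_fraction * listed_fraction
    # Of the mail from listed IPs, listed_spam_rate is spam and the rest
    # is legitimate, so scale the blocked spam by that ratio to get the
    # legitimate mail rejected alongside it (the false positives).
    false_positives = spam_blocked * (1 - listed_spam_rate) / listed_spam_rate
    days_per_fp = 1 / false_positives
    return {
        "spam_blocked_per_day": spam_blocked,
        "false_positives_per_day": false_positives,
        "days_per_false_positive": days_per_fp,
        "total_mail_per_false_positive": mails_per_day * days_per_fp,
        # Share of all incoming mail rejected, ignoring the tiny
        # false-positive contribution.
        "share_of_all_mail_blocked": spam_fraction * listed_fraction,
    }


estimate = dsbl_estimate(6000, 0.30, 0.40, 0.9995)
for name, value in estimate.items():
    print(f"{name}: {value:.2f}")
```

With the thread's figures this works out to about 720 spam rejected per
day, a little under three days between false positives, and 12% of all
incoming mail never needing to be accepted.  Plug in a higher spam
fraction and the false positives become more frequent in absolute terms,
but the ratio of 1999 spam per false positive stays the same.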

> > To me, *that* is unacceptable, but fortunately there is a
> > recipients_reject_except variable in exim. Assuming you will not ask
> > to be taken out of this variable, do you have anything against me not
> > being in it? Or are you so anti-DSBL that you believe I should be
> > *forced* to receive and handle so much crap?

> I think the point is he doesn't believe *Debian* should be doing the
> filtering. You are completely free to do it yourself, you are not completely
> free to use somebody else's resources to do it. You may of course ask nicely
> (that's what this thread is all about, right?)

I use spamassassin, which works fine for me because I have no shortage
of bandwidth; but I'm sympathetic to those who don't have the luxury.
Even a 12% reduction in the amount of resources (human and computer)
spent processing and discarding debian.org spam is nothing to scoff at,
and in my experience, your 30% figure is a severe underestimate.  This
particular sort of filtering, IP-based filtering, can only be done at
the point of entry into the legitimate email network; and with only a
0.05% false-positive rate and a drastic reduction in the resources
needed to identify and discard each spam, I would jump at the chance to
implement this if it were my mail server.

Steve Langasek
postmodern programmer

