On Mon, Jun 16, 2003 at 12:20:20AM +1000, Martijn van Oosterhout wrote:
> On Sun, Jun 15, 2003 at 03:58:47PM +0200, Santiago Vila wrote:
> > For every potential false positive that you might miss because of
> > Debian eventually using the DSBL, I have to receive, download and
> > handle 1999 spam messages, which means 10Mb of spam (the average
> > spam message is a little more than 5K).

> How many emails does master handle in one day? Probably several thousand.
> That adds up to several false positives per day. Also, your estimate of 50%
> spam is too high IMHO. If you take a 70/30 split, you get one false positive
> per 900 emails. Finally, as pointed out earlier, only 40% of measured spam
> actually came from one of the listed servers. So of that 10Mb you're going
> to be downloading 6Mb of it anyway.

Your math is off. The statistics given were that, out of mail *received
from IPs in the DSBL*, 99.95% were spam. If master handles 6000 emails a
day, 30% is spam and only 40% of the spam comes from open relays (which
seems to be what you're saying above), that gives you one false positive
every two days from a user who will get a bounce telling them to use a
different mail server (1 false positive on a total mail volume of 16000
messages), and it still blocks 40% of the spam.

> > To me, *that* is unacceptable, but fortunately there is a
> > recipients_reject_except variable in exim. Assuming you will not ask
> > to be taken out of this variable, do you have anything against me not
> > being in it? Or are you so anti-DSBL that you believe I should be
> > *forced* to receive and handle so much crap?

> I think the point is he doesn't believe *Debian* should be doing the
> filtering. You are completely free to do it yourself, you are not completely
> free to use somebody else's resources to do it. You may of course ask nicely
> (that's what this thread is all about, right?)
I use spamassassin, which works fine for me because I have no shortage
of bandwidth; but I'm sympathetic to those who don't have the luxury.
Even a 12% reduction in the amount of resources (human and computer)
spent processing and discarding debian.org spam is nothing to scoff at,
and in my experience, your 30% figure is a severe underestimate. This
particular sort of filtering, IP-based filtering, can only be done at
the point of entry into the legitimate email network; and with only a
.05% false-positive rate and a drastic reduction in the resources needed
to identify and discard each spam, I would jump at the chance to
implement this if it were my mail server.

-- 
Steve Langasek
postmodern programmer
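
The false-positive arithmetic in the reply above can be checked with a
short sketch. It uses only the figures quoted in the thread (6000 messages
a day, 30% spam, 40% of spam arriving from listed hosts, 99.95% of mail
from listed hosts being spam). The function and variable names are
illustrative, and the 6000/day volume is the hypothetical from the message,
not a measured value for master.

```python
# Figures quoted in the thread; DAILY_MAIL is the hypothetical volume used there.
DAILY_MAIL = 6000               # messages handled per day
SPAM_SHARE = 0.30               # fraction of all mail that is spam
SPAM_VIA_LISTED = 0.40          # fraction of spam sent from DSBL-listed hosts
SPAM_RATE_FROM_LISTED = 0.9995  # fraction of mail from listed hosts that is spam


def dsbl_estimate(daily_mail, spam_share, spam_via_listed, spam_rate_from_listed):
    """Estimate the daily effect of rejecting all mail from listed hosts."""
    # Spam that would be rejected at SMTP time each day.
    spam_blocked = daily_mail * spam_share * spam_via_listed
    # Listed hosts send that spam plus a little legitimate mail; the spam
    # is spam_rate_from_listed of their total, so divide to get the total.
    mail_from_listed = spam_blocked / spam_rate_from_listed
    false_positives = mail_from_listed - spam_blocked
    return {
        "spam_blocked_per_day": spam_blocked,
        "false_positives_per_day": false_positives,
        "days_per_false_positive": 1 / false_positives,
        "mail_per_false_positive": daily_mail / false_positives,
        "share_of_all_mail_blocked": spam_blocked / daily_mail,
    }


result = dsbl_estimate(DAILY_MAIL, SPAM_SHARE, SPAM_VIA_LISTED, SPAM_RATE_FROM_LISTED)
for name, value in result.items():
    print(f"{name}: {value:.4g}")
```

Under these assumptions about 720 spam messages a day are rejected (12% of
all mail), at a cost of roughly one legitimate message per 16,000 to
17,000 handled, or one every two to three days at this volume.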