Measuring "should I greylist?" false positive rate [was: greylisting on debian.org?]
On Mon, Jul 17, 2006 at 11:48:21PM +0200, Pierre Habouzit wrote:
> On Mon, 17 July 2006 22:29, Lionel Elie Mamane wrote:
> the discussion (...) was about enabling greylisting on *certain*
> *specifically* *suspicious* hosts. A suspicious host is one that:
> * is listed on some RBLs (RBLs listing "dynamic" blocks are usually a
> good start), or
> * has no reverse DNS set, or
> * sends curious EHLO lines (sadly that one may catch too much good
> mail, so it has to be handled with care).
> * ...
> I apply greylisting on the first two criteria on a fairly busy mail
> server (around 300k mails per week, which is not very big, but
> should be representative enough).
> Of those, fewer than 50 mails a week that *may* be legitimate
> are actually slowed down.
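For reference, the first two criteria quoted above could be sketched
roughly as follows. This is only an illustration, not Pierre's actual
setup: the ClientInfo type and the should_greylist function are
hypothetical names, and the RBL and reverse DNS lookups are assumed to
have been done elsewhere.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientInfo:
    """Facts about a connecting host, gathered before the decision.

    Hypothetical structure for illustration; the lookups themselves
    (RBL query, PTR query) are assumed to happen elsewhere.
    """
    ip: str
    reverse_dns: Optional[str]  # PTR name, or None if no reverse DNS is set
    rbl_listed: bool            # True if some RBL lists this IP


def should_greylist(client: ClientInfo) -> bool:
    """Greylist only hosts matching one of the first two quoted criteria."""
    if client.rbl_listed:
        return True
    if not client.reverse_dns:
        return True
    # The EHLO criterion is deliberately left out: the quoted text warns
    # that it may catch too much good mail.
    return False
```

Every other host is accepted without delay, which is what keeps the
number of slowed-down mails small.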
On second thought, I'm very interested in how you measured this false
positive rate. Do all the recipients of the 300k mails per week check
every mail to see whether it was greylisted (which would mean you add a
header or something like that saying "this mail was greylisted"?), and
do they _always_ check _every_ legitimate mail and _always_ report false
positives to you? Probably not. So, are these 50 mails a week all the
mail that undergoes greylisting but *still* goes through (i.e. gets
retried, roughly)? Something else?
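
To make the "gets retried" reading concrete, here is a rough sketch of
how such a number could be counted from greylisting logs. It is a guess
at one possible method, not a description of what was actually done:
the event format (timestamp, triplet, action) and the name
greylist_stats are assumptions made up for this example.

```python
from typing import Dict, Iterable, Set, Tuple

# (client IP, envelope sender, envelope recipient)
Triplet = Tuple[str, str, str]
# (unix timestamp, triplet, "deferred" or "accepted") -- assumed log format
Event = Tuple[float, Triplet, str]


def greylist_stats(events: Iterable[Event]) -> Dict[str, int]:
    """Count triplets that were deferred, and how many came back later."""
    first_deferred: Dict[Triplet, float] = {}
    retried: Set[Triplet] = set()
    for ts, triplet, action in sorted(events):
        if action == "deferred":
            # Remember only the first deferral of each triplet.
            first_deferred.setdefault(triplet, ts)
        elif action == "accepted" and triplet in first_deferred:
            # Accepted after an earlier deferral: the sender retried.
            retried.add(triplet)
    never_retried = set(first_deferred) - retried
    return {
        "deferred": len(first_deferred),
        "retried_and_accepted": len(retried),
        "never_retried": len(never_retried),
    }
```

Note that "retried_and_accepted" only bounds the legitimate mail that
was slowed down from above (spammers that retry are counted too), and
"never_retried" cannot distinguish spam from legitimate mail that was
lost, which is exactly why I am asking how the figure was obtained.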