
Making spam-scanning less resource-hungry (was: procmailrc)

On Mon, Sep 15, 2003 at 06:59:54PM +0200, Josip Rodin wrote:
> On Mon, Sep 15, 2003 at 10:41:17AM +0100, Colin Watson wrote:
> > I think we need to do something like delivering to a queue (e.g.
> > spool/incoming/S*) which is spam-scanned close to linearly and
> > separately from exim. Getting the mail out of exim's hands as
> > quickly as possible seems to be the right thing to do.
> This will go hand in hand with implementing CrossAssassin. I propose
> we do it with a 2 minute delay as a start. That will kill off the vast
> majority of spam in the BTS, with much less resources and a negligible
> timing hit.

[Moving to -debbugs as it may be of general interest.]

So, let's consider a spool/pending directory. We make scripts/receive
write everything there, with the same basename as before. We create a
new scripts/scanpending that runs over spool/pending in much the same
way as scripts/processall but instead runs the mail through spamc and
either moves it to spool/incoming or appends it to some spam-bin mail
folder, depending on the X-Spam-* headers in the output. It is kept separate
from scripts/processall so that a large unscanned queue doesn't prevent
mail we've already scanned from being processed in good time.
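
To make the proposed flow concrete, here is a rough sketch of what one pass
of scripts/scanpending might do. It is written in Python purely for
illustration and is not existing debbugs code. The function names, the use
of "X-Spam-Flag: YES" as the deciding header, and the choice to store the
scanner's annotated output (rather than the original) in the spam bin are
all assumptions. The spam bin is not locked in this sketch.

```python
import email
import mailbox
import os
import subprocess


def run_spamc(raw):
    """Pipe one message (bytes) through spamc and return its output."""
    result = subprocess.run(["spamc"], input=raw,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout


def looks_like_spam(scanned):
    # Assumption: the scanner marks spam with an "X-Spam-Flag: YES" header.
    msg = email.message_from_bytes(scanned)
    return str(msg.get("X-Spam-Flag", "")).strip().upper() == "YES"


def scan_pending(pending_dir, incoming_dir, spam_bin, scan=run_spamc):
    """Scan every file in pending_dir once; return (accepted, binned) names."""
    accepted, binned = [], []
    for name in sorted(os.listdir(pending_dir)):
        path = os.path.join(pending_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as fh:
            scanned = scan(fh.read())
        if looks_like_spam(scanned):
            box = mailbox.mbox(spam_bin)  # created if missing; NOT locked here
            try:
                box.add(scanned)
                box.flush()
            finally:
                box.close()
            os.remove(path)
            binned.append(name)
        else:
            # Keep the same basename so later processing sees no difference.
            os.rename(path, os.path.join(incoming_dir, name))
            accepted.append(name)
    return accepted, binned
```

Passing the scanner in as the `scan` argument means the spool handling can
be exercised with a stand-in function before spamc is involved at all.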

CrossAssassin could easily be added to this later, at which point you'd
scan only mail older than two minutes. To start with, we could have the
scanner accept all mail (leaving the spamc call in procmail) until the
spool changes are working.
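
The two-minute delay could be a simple age check on each spool file before
it is scanned. The sketch below is illustrative only: the function names
are made up, and treating the file's modification time as its arrival time
is an assumption about how the spool is written.

```python
import os
import time


def old_enough(path, min_age=120, now=None):
    """True if the file was last modified at least min_age seconds ago."""
    if now is None:
        now = time.time()
    return now - os.stat(path).st_mtime >= min_age


def ready_to_scan(pending_dir, min_age=120, now=None):
    """Return the basenames in pending_dir that have waited long enough."""
    names = sorted(os.listdir(pending_dir))
    return [n for n in names
            if old_enough(os.path.join(pending_dir, n), min_age, now)]
```

Setting min_age to 0 gives the "accept everything immediately" behaviour,
so the delay can be switched on later without changing the loop.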

The one thing that worries me is that I'm not sure if we can generate
proper bounces. Can somebody who knows more about mail systems than I
check this? We'd also need to lock the spam-bin properly.
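
On the locking point, one possible shape for a locked append, assuming the
spam bin is an mbox file, is sketched below using Python's standard mailbox
module. The function name is hypothetical. Whether the locks taken here are
the ones that other readers and writers of the folder honour is an
assumption that would need checking on the real system.

```python
import mailbox


def append_locked(spam_bin, raw):
    """Append one message (bytes) to the mbox at spam_bin under its lock."""
    box = mailbox.mbox(spam_bin)  # creates the file if it does not exist
    box.lock()  # raises mailbox.ExternalClashError if the lock is unavailable
    try:
        box.add(raw)
        box.flush()
    finally:
        box.unlock()
        box.close()
```

A caller that hits ExternalClashError could simply leave the message in
spool/pending and retry on the next pass.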

Comments or brick(text)bats?

Colin Watson                                  [cjwatson@flatline.org.uk]
