
Re: Making spam-scanning less resource-hungry (was: procmailrc)

On Mon, Sep 15, 2003 at 06:43:54PM +0100, Colin Watson wrote:
> So, let's consider a spool/pending directory. We make scripts/receive
> write everything there, with the same basename as before. We create a
> new scripts/scanpending that runs over spool/pending in much the same
> way as scripts/processall but instead runs the mail through spamc and
> either moves it to spool/incoming or appends it to some spam-bin mail
> folder, depending on X-Spam-* headers in the output. This is separate
> from scripts/processall in order that a large unscanned queue doesn't
> prevent mail we've already scanned from being processed in good time.
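The scanpending pass described above might look roughly like this. A minimal sketch only: the directory names, the classify helper, and the reliance on SpamAssassin's usual X-Spam-Flag header are my assumptions, not existing code:

```shell
#!/bin/sh
# Hypothetical sketch of scripts/scanpending: run each pending message
# through spamc, then file it by the X-Spam-* headers in the output.

# Decide where a scanned message goes; reads the message on stdin and
# prints "spam" or "ham". Assumes SpamAssassin's X-Spam-Flag header.
classify() {
    if grep -q '^X-Spam-Flag: YES'; then
        echo spam
    else
        echo ham
    fi
}

pending=spool/pending
incoming=spool/incoming
spambin=spam-bin            # mbox-style spam folder (name assumed)

for msg in "$pending"/*; do
    [ -f "$msg" ] || continue
    scanned=$(spamc < "$msg")          # run the message through spamd
    case $(printf '%s\n' "$scanned" | classify) in
        spam) printf '%s\n' "$scanned" >> "$spambin" ;;
        ham)  printf '%s\n' "$scanned" > "$incoming/${msg##*/}" ;;
    esac
    rm -f "$msg"
done
```

Keeping this loop in its own cron job, as proposed, means a backlog of unscanned mail never blocks already-scanned mail in spool/incoming.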

I don't think receive should have anything to do with it. Procmail can do it
on its own. Maybe I should just paste the relevant code, since I see my
attempts at explaining aren't succeeding :)


:0 c
* !$ ${ca_headerfield}:
| $ca_messagedigests $list $domain >> ../$ca_mdtoday

:0 fhw
* $ ^${ca_headerfield}: yes
| formail -I ${ca_headerfield}:

Cron jobs:

# Flush queues every 5 minutes:
*/5 * * * * list ca_flushqueues

# Shift md files every day.
00  0 * * * list ca_shiftknownmd

ca_flushqueues does the obvious thing, the actual processing of the queued
files (in our case, feeding the non-crossposts into receive).
ca_messagedigests and ca_shiftknownmd juggle the digests in the $ca_mdtoday
file.
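For illustration, here is roughly what those two jobs could look like. The function names, arguments, and file layout are my assumptions, not the actual crossassassin code:

```shell
#!/bin/sh
# Hypothetical sketches of ca_flushqueues and ca_shiftknownmd.

# Feed each queued message to receive; drop a file only once receive
# has accepted it, so a failed run leaves the queue intact.
flush_queue() {
    # $1 = queue directory, $2 = command reading one message on stdin
    for msg in "$1"/*; do
        [ -f "$msg" ] || continue
        "$2" < "$msg" && rm -f "$msg"
    done
}

# Rotate today's digests into the accumulated known-digests file and
# truncate today's file, so the daily cron job starts it afresh.
shift_md() {
    # $1 = today's digest file ($ca_mdtoday), $2 = known-digests file
    cat "$1" >> "$2" && : > "$1"
}

# e.g. from the cron jobs above:
#   flush_queue spool/queue scripts/receive
#   shift_md "$ca_mdtoday" md.known
```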

There's a reasonably new version of the crossassassin deb at
http://people.debian.org/~sanvila/ (although I don't think even the newest
one has the ca_flushqueues that I wrote for lists.debian.org).

> The one thing that worries me is that I'm not sure if we can generate
> proper bounces.

Eh, when do we ever want to do <> bounces?

