On Sat, Oct 19, 2002 at 10:36:47AM +1000, Craig Sanders wrote:
> the best compromise solution, imo, is to configure spamassassin so
> that actual spam scores very highly, discard to /dev/null anything
> that scores over 20, and quarantine for human review anything between
> 10 & 20.  allow anything under 10.  the goal is to refine the rules so
> that almost all real spam scores over 20 and never needs to waste
> anyone's time.
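
as a rough sketch, the three score bands quoted above amount to something
like the following.  the function name and constants are made up for
illustration, and what happens to a score of exactly 10 or 20 is my own
assumption (both are quarantined here), since the text doesn't say.

```python
# illustrative thresholds taken from the quoted text; names are hypothetical
DISCARD_ABOVE = 20.0     # anything scoring over this goes to /dev/null
QUARANTINE_FROM = 10.0   # anything from here up to DISCARD_ABOVE is held for review


def triage(score: float) -> str:
    """Map a spamassassin score to one of: 'discard', 'quarantine', 'deliver'."""
    if score > DISCARD_ABOVE:
        return "discard"
    if score >= QUARANTINE_FROM:
        # boundary handling (exactly 10 or exactly 20) is an assumption
        return "quarantine"
    return "deliver"
```

the point of tuning the rules is to push real spam into the first branch,
so that the quarantine band stays small.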

one other possibility that i'll experiment with in the next few days is
to use bogofilter to further automate this process.

at the moment, i'm thinking along the lines of:

0a. spamassassin is configured to score messages tagged "Yes" by
bogofilter very highly (10 or higher).

0b. seed bogofilter with a massive spam archive for the spamlist, and a
massive non-spam archive for the goodlist.

1. an incoming mail is filtered through bogofilter for an immediate
check.  bogofilter will tag it yes or no, but not discard or bounce any
found spam.

2. it is then fed into spamassassin.

3. if spamassassin tags it as non-spam, then:

  - it is fed into bogofilter for addition to the goodlist.
  - it is delivered as normal

4. if spamassassin tags it as spam, then:

  - it is fed into bogofilter for addition to the spamlist
  - it is quarantined for human review.  after a few months of building
	up bogofilter's database, this could be changed to just discard the
	spam.
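
to make the feedback loop in steps 1 to 4 concrete, here is a minimal
simulation.  it does not call bogofilter or spamassassin at all: both are
passed in as plain functions, and every name in it (Pipeline, bogo_is_spam,
sa_score, the list fields, the threshold of 10) is a hypothetical stand-in
rather than anything from either tool.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Pipeline:
    # stand-in for the bogofilter check: returns True if it would tag "Yes"
    bogo_is_spam: Callable[[str], bool]
    # stand-in for spamassassin: gets the message plus bogofilter's tag,
    # so it can add a large score when the tag is "Yes" (step 0a)
    sa_score: Callable[[str, bool], float]
    spam_threshold: float = 10.0  # assumed cut-off for "tagged as spam"
    spamlist: List[str] = field(default_factory=list)    # bogofilter training: spam
    goodlist: List[str] = field(default_factory=list)    # bogofilter training: non-spam
    quarantine: List[str] = field(default_factory=list)  # held for human review
    inbox: List[str] = field(default_factory=list)       # delivered as normal

    def handle(self, msg: str) -> str:
        tagged = self.bogo_is_spam(msg)      # step 1: tag only, never discard
        score = self.sa_score(msg, tagged)   # step 2: spamassassin sees the tag
        if score < self.spam_threshold:      # step 3: non-spam
            self.goodlist.append(msg)
            self.inbox.append(msg)
            return "deliver"
        self.spamlist.append(msg)            # step 4: spam
        self.quarantine.append(msg)
        return "quarantine"
```

note that in this arrangement spamassassin's verdict is what trains
bogofilter, so any mistake SA makes is fed straight back into the word
lists.  that is the part of the loop that needs watching.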

i'm not quite sure how this will work out at this point.  i need
experience to understand how the feedback loop will work in practice.

after a few months, it may even be possible to take spamassassin out of
the procedure as bogofilter may be "trained" well enough to identify
spam without any help from SA.


craig sanders <cas@taz.net.au>

Fabricati Diem, PVNC.
 -- motto of the Ankh-Morpork City Watch

