Spam management and sa-learn

To: debian-user Mailing List <debian-user@lists.debian.org>
Subject: Spam management and sa-learn
From: Stefano Sabatini <stefano.sabatini-lala@poste.it>
Date: Fri, 4 Jan 2008 17:34:07 +0100
Message-id: <20080104163407.GA24793@geppetto>
Mail-followup-to: debian-user Mailing List <debian-user@lists.debian.org>

Hi all Debian users,

I have this setup for mail:

exim4 as MTA, fetchmail to fetch mail from several POP3 servers and
hand it to procmail, which in turn calls spamc, and finally mutt as my
mail reader.

I collect spam messages in an archive named
~/Mail/archive/recent/spam, which contains *all* the spam messages
collected in the last 6 months (refreshed weekly by an anacron script).

To train SpamAssassin I also run this anacron script weekly:

#!/bin/bash

MAILROOT="$HOME/Mail"

# maildir inboxes: the * expands to the cur, new and tmp subdirectories
# learn what is ham
sa-learn --ham "$MAILROOT"/archive/recent/generic/*

# learn what is spam
sa-learn --spam "$MAILROOT"/archive/recent/spam/*

Both ~/Mail/archive/recent/spam and ~/Mail/archive/recent/generic are
maildir mailboxes (which is why I use the * to match cur, new and tmp).
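
As a quick illustration of what the * expands to, here is a minimal
sketch. It assumes a throwaway maildir created under a temporary
directory; the mailbox name "spam" and the variables tmpdir and matched
are hypothetical and only stand in for the real archive.

```shell
#!/bin/bash
# Hypothetical throwaway maildir, created only for this illustration.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/spam/cur" "$tmpdir/spam/new" "$tmpdir/spam/tmp"

# The glob expands to the three maildir subdirectories, in sorted order.
matched=$(cd "$tmpdir/spam" && echo *)
echo "$matched"

# Clean up the temporary mailbox.
rm -rf "$tmpdir"
```

With a glob like this, sa-learn is handed the three directories as
separate arguments rather than a list of individual message files.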

The problem with this setup is that I still get a *large* number of
spam messages in my generic inbox (which contains no mailing-list
mail), on the order of more than 50 messages per day, and I'm getting
tired of filtering them manually, while most spam messages (200+) are
delivered directly to the ~/inbox/probably-spam directory.

So my question is: what's wrong with this setup? In particular, can you
suggest how to improve the SpamAssassin training?

I would also like to stop messages detected as spam from being
encapsulated, as spamc currently does, in a message starting like this:

"Spam detection software, running on the system "santefisi.caos.org", has
identified this incoming email as possible spam.  The original message
has been attached to this so you can view it (if it isn't spam) or label
similar future email.  If you have any questions, see
the administrator of that system for details."

Can you suggest which option I have to switch off?

I also wonder whether such encapsulated messages are classified
correctly when sa-learn is run against them (maybe *this* is the problem).

Finally, do you have any suggestions for improving this spam management
setup for a one-user system?

Many thanks, regards, and a happy debianish new year.
-- 
Stefano Sabatini
Linux user number 337176 (see http://counter.li.org)
