Re: urgent - system slow after spamassassin

On Tue, Dec 17, 2002 at 10:57:59AM +0530, Sandip P Deshmukh wrote:
| On Mon, Dec 16, 2002 at 08:47:00AM -0800, Bob Nielsen wrote:

| > If you use spamc, which is the daemonized version of spamassassin, it
| > will run somewhat faster, since it is written in C rather than perl. 
| i did dselect and searched spamc - it was not there. is there a debian
| package for spamc?

Yes.  In old versions of spamassassin the 'spamc' (and corresponding
'spamd') programs are in the "spamassassin" package.  In new versions
there is a package named "spamc".  Always use the newest spamassassin
package.  As was noted yesterday, the old version running on
lists.debian.org flagged a debian announcement as spam while the new
version (running on my system amongst others) didn't.  You'll (almost)
always get better results with the newest version of the rules and

| > It only helps a little, however, since it still needs to invoke
| > multiple exim processes.  I ended up also running fetchmail as a daemon
| > as well, getting messages from my ISP every ten minutes or so to avoid
| > being overrun by a large number of concurrent messages.
| i also experienced the same problem! what is the solution for avoiding
| concurrent messages?

Don't scan messages concurrently.  There are a number of ways to
achieve that.  As you have no doubt noticed by now, there are very
many pieces that make up a mail handling system.  Each one does
something slightly different and each site/user can plug them together
a little differently and achieve a different (or similar) effect.  One
way to limit concurrency is to use the "-m" option to spamd (see the
manpage).  Another way is to use the "deliver_load_max" and similar
options in exim.conf.  Yet another way, if you use procmail, is to use
a lockfile on the recipe that does the scanning.  Each one has
different tradeoffs in terms of how and where the serialization /
blocking is done and what resources are consumed in the process.
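To make the lockfile idea concrete, here is a minimal sketch of the
same serialization in Python.  It is not what procmail, exim or spamd
do internally; the names run_serialized and lock_is_busy and the lock
path are made up for illustration, and it assumes a Unix system where
flock() is available.  The point is only that a second scan has to
wait until the first one releases the lock.

```python
import fcntl
import os
import tempfile


def run_serialized(lock_path, job):
    """Run job() while holding an exclusive lock on lock_path.

    A second caller using the same lock_path blocks in flock() until
    the first one is done, so only one job runs at a time.
    """
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # waits here if the lock is taken
        try:
            return job()
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def lock_is_busy(lock_path):
    """Return True if some other open file currently holds the lock."""
    with open(lock_path, "a") as probe:
        try:
            # LOCK_NB makes flock() fail at once instead of waiting.
            fcntl.flock(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        fcntl.flock(probe, fcntl.LOCK_UN)
        return False


if __name__ == "__main__":
    # Hypothetical lock location, chosen only for this demonstration.
    path = os.path.join(tempfile.mkdtemp(), "scan.lock")
    print(run_serialized(path, lambda: lock_is_busy(path)))
    print(lock_is_busy(path))
```

The tradeoff is the one described above: messages queue up behind
the lock instead of loading the machine all at once, so delivery gets
slower under a burst but the system stays responsive.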

On Tue, Dec 17, 2002 at 11:02:07AM +0530, Sandip P Deshmukh wrote:
| i did install razor. do you mean that if i remove razor, things will get
| better?

Yes.  In my experience razor isn't terribly accurate, and it does add
overhead due to calculating a checksum and then whatever network
delays there are.

| i was under an impression that spamassassin 'needs' razor to run
| properly.

No.  It can use razor as just another rule, or it can simply not use
it at all.

| this brings me to a question, how is spamassassin better than a
| .forward file that can do a somewhat similar job?

You /could/ create an "exim filter" (not a standard .forward)
almost-reimplementation of spamassassin.  You don't want to do that.
SA is a collection of rules, many are regular expressions but some are
perl routines.  Each rule has a score.  The rules are applied to the
message to see if it matches or not.  The scores of the matching rules
are summed and the result is compared to a threshold.  The message is
then marked with the results.  SA is better than a homegrown
collection of filters because:
    .   many more people are working together on the rules
    .   they test the rules first, so you don't need to worry as much
    .   the scores are not arbitrary  --  most of them are computed
            based on a corpus of known spam and known not-spam.
    .   over time the ruleset becomes ineffective, and over time new
            releases of SA are published.  You would need to
            continually tweak and update your homegrown filters.
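The rule/score/threshold model described above can be sketched in a
few lines of Python.  This is only an illustration of the idea, not
SA's code: the three rules, their scores and the threshold of 3.0 are
invented for the example, whereas SA's real scores are computed from
a corpus as noted in the list.

```python
import re

# Hypothetical rules: (name, test function, score).
# None of these names or scores come from the real SpamAssassin ruleset.
RULES = [
    ("SUBJECT_ALL_CAPS",
     lambda msg: msg["subject"].isupper(), 1.5),
    ("MENTIONS_FREE_MONEY",
     lambda msg: re.search(r"free money", msg["body"], re.I) is not None, 2.5),
    ("HAS_UNSUBSCRIBE",
     lambda msg: "unsubscribe" in msg["body"].lower(), 0.5),
]

THRESHOLD = 3.0  # assumed value for this sketch


def scan(msg, rules=RULES, threshold=THRESHOLD):
    """Apply every rule, sum the scores of the ones that match, and
    compare the total to the threshold."""
    hits = [(name, score) for name, test, score in rules if test(msg)]
    total = sum(score for _, score in hits)
    return {
        "score": total,
        "is_spam": total >= threshold,
        "hits": [name for name, _ in hits],
    }


if __name__ == "__main__":
    print(scan({"subject": "WIN NOW", "body": "Get FREE MONEY today"}))
    print(scan({"subject": "Re: urgent - system slow", "body": "try spamc"}))
```

Even this toy version shows why a homegrown filter is a maintenance
burden: every rule and every score in the table is something you
would have to pick, test and keep adjusting yourself.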


Reckless words pierce like a sword,
but the tongue of the wise brings healing.
        Proverbs 12:18
