On Tue, Dec 17, 2002 at 10:57:59AM +0530, Sandip P Deshmukh wrote:
| On Mon, Dec 16, 2002 at 08:47:00AM -0800, Bob Nielsen wrote:
| > If you use spamc, which is the daemonized version of spamassassin, it
| > will run somewhat faster, since it is written in C rather than perl.
|
| i did dselect and searched spamc - it was not there. is there a debian
| package for spamc?
Yes. In old versions of spamassassin the 'spamc' (and corresponding
'spamd') programs are in the "spamassassin" package. In new versions
there is a package named "spamc". Always use the newest spamassassin
package. As was noted yesterday, the old version running on
lists.debian.org flagged a Debian announcement as spam while the new
version (running on my system amongst others) didn't. You'll (almost)
always get better results with the newest version of the rules and
scores.
| > It only helps a little, however, since it still needs to invoke
| > multiple exim processes. I ended up also running fetchmail as a daemon
| > as well, getting messages from my ISP every ten minutes or so to avoid
| > being overrun by a large number of concurrent messages.
|
| i also experienced the same problem! what is the solution for avoiding
| concurrent messages?
Don't scan messages concurrently. There are a number of ways to
achieve that. As you have no doubt noticed by now, there are very
many pieces that make up a mail handling system. Each one does
something slightly different and each site/user can plug them together
a little differently and achieve a different (or similar) effect. One
way to limit concurrency is to use the "-m" option to spamd (see the
manpage). Another way is to use the "deliver_load_max" and similar
options in exim.conf. Yet another way, if you use procmail, is to use
a lockfile on the recipe that does the scanning. Each one has
different tradeoffs in terms of how and where the serialization /
blocking is done and what resources are consumed in the process.
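For instance, taking the procmail route: putting a local lockfile on the scanning recipe (the second colon on the flags line) makes concurrent deliveries wait their turn instead of all scanning at once. A minimal sketch, assuming spamc is in procmail's path; the lockfile name is arbitrary:

```
# Serialize scanning: each delivery must acquire
# spamassassin.lock before piping through spamc, so
# only one scan runs at a time.
:0fw: spamassassin.lock
| spamc
```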
On Tue, Dec 17, 2002 at 11:02:07AM +0530, Sandip P Deshmukh wrote:
| i did install razor. do you mean that if i remove razor, things will get
| better?
Yes. In my experience razor isn't terribly accurate, and it adds
overhead: computing a checksum for each message, plus whatever network
delays the razor servers introduce.
| i was under an impression that spamassassin 'needs' razor to run
| properly.
No. It can use razor as just another rule, or it can simply not use
it at all.
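If you'd rather SA not consult razor without uninstalling it, you can zero out the razor rule's score in your site config. A sketch; the rule name varies by SA version (RAZOR_CHECK in the 2.x series, I believe -- grep the shipped rules files to confirm):

```
# /etc/spamassassin/local.cf (or ~/.spamassassin/user_prefs)
# A score of 0 disables the rule, so the razor network check is skipped.
score RAZOR_CHECK 0
```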
| this brings me to a question, how is spamassassin better than a
| .forward file that can do a somewhat similar job?
You /could/ create an "exim filter" (not a standard .forward)
almost-reimplementation of spamassassin. You don't want to do that.
SA is a collection of rules, many are regular expressions but some are
perl routines. Each rule has a score. The rules are applied to the
message to see if it matches or not. The scores of the matching rules
are summed and the result is compared to a threshold. The message is
then marked with the results. SA is better than a homegrown
collection of filters because:
. many more people are working together on the rules
. they test the rules first, so you don't need to worry as much
. the scores are not arbitrary -- most of them are computed
based on a corpus of known spam and known not-spam.
. over time any ruleset becomes less effective as spammers adapt,
        but new releases of SA keep pace; you would have to
        continually tweak and update your homegrown filters yourself.
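That scoring loop is easy to sketch. The following is not SA's actual code, just an illustration of the rule/score/threshold model; the rule names, patterns, and scores here are invented:

```python
import re

# Invented rules for illustration: (name, pattern, score).
# Real SpamAssassin ships a far larger set, with scores tuned
# against a corpus of known spam and known non-spam.
RULES = [
    ("SUBJ_ALL_CAPS", re.compile(r"^Subject: [A-Z !]+$", re.M), 1.5),
    ("MENTIONS_VIAGRA", re.compile(r"viagra", re.I), 2.7),
    ("CLICK_HERE", re.compile(r"click here", re.I), 1.0),
]
THRESHOLD = 5.0

def score_message(text):
    """Sum the scores of every rule that matches the message."""
    hits = [(name, score) for name, rx, score in RULES if rx.search(text)]
    return sum(score for _, score in hits), hits

def is_spam(text):
    total, _ = score_message(text)
    return total >= THRESHOLD
```

A message matching only the second and third rules scores 3.7 and stays under the threshold; one that also trips the all-caps subject rule reaches 5.2 and gets marked as spam.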
-D
--
Reckless words pierce like a sword,
but the tongue of the wise brings healing.
Proverbs 12:18
http://dman.ddts.net/~dman/