Re: Debian 8: Postfix -> amavisd-new -> spamassassin -> Bayes : not scanning?



Dear fellow Debian users,

it seems that I've found the correct answer.

In /etc/spamassassin/local.cf, 
in addition to the aforementioned:
  use_bayes 1
  bayes_auto_learn 1
I have added:

  use_bayes_rules 1

I found it while trawling through the /usr/share/perl5/Mail directory,
specifically in SpamAssassin/Conf.pm.
It looked promising, so I tried it. How silly.

That one line changed something on the inside:
I now have a BAYES score in the
X-Spam-Status header of every message.
One remaining problem is that all the scores so far come out as
BAYES_00 :-) so I may have to work on that some more.
No spam has arrived yet to provide a proper test.
(I get 2-3 a day in my inbox - the rest is taken care of
by greylisting and the general SpamAssassin scoring rules.)
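
To check which BAYES rule a given message hit without reading headers by eye, something along these lines might help. It is only a sketch: bayes_rules() is a hypothetical helper name, and the sample header value is made up for illustration, not copied from a real message.

```perl
use strict;
use warnings;

# Hypothetical helper: return every BAYES_nn rule name that appears
# in an X-Spam-Status header value.
sub bayes_rules {
    my ($header) = @_;
    return $header =~ /\b(BAYES_\d+)\b/g;
}

# Made-up sample header value, for illustration only.
my $sample = 'No, score=-1.9 tagged_above=-9999 required=5 '
           . 'tests=[BAYES_00=-1.9] autolearn=ham';

my @hits = bayes_rules($sample);
print "Bayes rules hit: @hits\n";
```

An empty result would mean that no BAYES rule fired for that message.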

Other possibly interesting options:
   bayes_use_hapaxes
   bayes_auto_expire
   bayes_token_ttl
   bayes_seen_ttl

Actually I've managed to get a backtrace from one function that I 
could identify as getting called:
in /usr/share/perl5/Mail/SpamAssassin/BayesStore/DBM.pm :
sub tie_db_readonly {
...
  # Walk up the call stack and log one line per frame.
  my $iii = 1;
  dbg("Stack Trace:");
  while ( (my @call_details = (caller($iii++))) ){
    dbg( $call_details[1].":".$call_details[2]." in function ".
         $call_details[3] );
  }

...which did produce a neat stack trace. I'm attaching it, if 
anyone's interested.
The code was taken almost verbatim from
https://stackoverflow.com/questions/229009/how-can-i-get-a-call-stack-listing-in-perl
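
For anyone who wants to try the same caller() loop outside SpamAssassin, here is a minimal standalone sketch. The dbg() below is only a stand-in that collects lines (the real one lives inside SpamAssassin), and log_stack_trace(), inner() and outer() are hypothetical names used purely for the demonstration.

```perl
use strict;
use warnings;

# Stand-in for SpamAssassin's dbg(): this one just collects the lines.
my @log;
sub dbg { push @log, $_[0]; }

# Hypothetical helper: log one line per caller frame.
# caller(0) would describe the call to log_stack_trace() itself,
# so starting at 1 skips that frame.
sub log_stack_trace {
    dbg("Stack Trace:");
    my $depth = 1;
    while ( my @frame = caller($depth++) ) {
        # $frame[1] = file, $frame[2] = line, $frame[3] = sub name
        dbg( sprintf("%s:%s in function %s", @frame[1, 2, 3]) );
    }
}

sub inner { log_stack_trace(); }
sub outer { inner(); }

outer();
print "$_\n" for @log;
```

Each line names the file and line number a call was made from, followed by the sub that was called, working from the innermost frame outwards.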

In the stack trace I could see that something inside Amavis goes 
"have this message scanned", but some lower layers (across several 
indirections) got asked "is_scan_available" and 
"learner_is_scan_available". Funny, that...

I've also noticed that 
/usr/share/perl5/Mail/SpamAssassin/Bayes.pm contains a note, saying

# This is the general class used to train a learning classifier with 
# new samples of spam and ham mail, and classify based on prior 
# training.
# 
# Prior to version 3.3.0, the default Bayes implementation was here; 
# if you're looking for information on that, it has moved to
#    Mail::SpamAssassin::Plugin::Bayes   .

And yes indeed, there's another file:
/usr/share/perl5/Mail/SpamAssassin/Plugin/Bayes.pm
containing the function  check_bayes() where I'd previously
put my dbg() trap...

...so I thought: "maybe SpamAssassin.pm was 'requiring' the wrong 
module?"  
But that doesn't seem to be the case... (I've tried :-)

Instead, after I added  
     use_bayes_rules 1
I started to get BAYES scores in the mail headers.
That's probably a good start :-)

Thanks to everyone who has responded to reassure me :-)

Frank


On 9 Jul 2017 at 23:26, debian-user@lists.debian.org wrote:
>
> Dear polite people in the debian-users mailing list,
> 
> I would appreciate any help with the following setup.
> For the record, I'm sending this same text to the 
> SpamAssassin "users" mailing list - I'm not technically
> cross-posting, as that would probably earn me a bad
> reputation (or a kick).
> 
> I've just built a new mailserver based on Debian 8.8,
> with Postfix + Cyrus. I have a long history of using
> Amavis with SpamAssassin for SPAM filtering.
> On the newly installed machine, there is 
> SpamAssassin 3.4.0-6 = the current version for Jessie.
> 
> And within SpamAssassin, my previous server (based on
> Debian Squeeze) was using the Bayesian filter.
> Using 
>   sa-learn --backup 
>   sa-learn --restore=...
> I have migrated the Bayes database to the new machine,
> and after a few path tweaks and privilege adjustments,
> I got sa-learn-cyrus to do its job.
> 
> Curiously, I don't see any BAYES scores
> in the X-Spam-Status header. I suspect that the Bayes
> plugin does not actually get called to evaluate
> the messages passing through my server.
> 
> In /etc/spamassassin/local.cf, I have the following:
> use_bayes 1
> bayes_auto_learn 1
> bayes_path /var/lib/spamassassin/.spamassassin/bayes
> ...a couple of whitelist_from rules, and 
> add_header all Report _REPORT_
> 
> 
> In /etc/amavis/conf.d/15-content_filter_mode, I have UNcommented this:
> 
> @bypass_spam_checks_maps = (
>    \%bypass_spam_checks, \@bypass_spam_checks_acl,
>    \$bypass_spam_checks_re);
> 
> 
> In /etc/amavis/conf.d/50-user , I have the following:
> 
> $DO_SYSLOG = 0;
> $LOGFILE = "/var/log/amavis.log";
> $sa_tag_level_deflt = -9999; # always add spam info headers
> 
> $log_level = 1;
> $sa_debug = 1;
> 
> I've also tried $log_level = 2, which revealed a privilege problem:
> SA's Bayes plugin couldn't create a lock file. That is now fixed
> as well. I'm getting *some* notes about the Bayes plugin in the
> amavis log:
> 
> Jul  9 21:25:54 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: bayes: tie-ing to DB file R/O /var/lib/spamassassin/.spamassassin/bayes_toks
> Jul  9 21:25:54 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: bayes: tie-ing to DB file R/O /var/lib/spamassassin/.spamassassin/bayes_seen
> Jul  9 21:25:54 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: bayes: found bayes db version 3
> Jul  9 21:25:55 mail /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x6bc65b0) implements 'learn_message', priority 0
> Jul  9 21:25:55 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: locker: safe_lock: created /var/lib/spamassassin/.spamassassin/bayes.lock.mail.fccps.cz.8868
> Jul  9 21:25:55 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: locker: safe_lock: trying to get lock on /var/lib/spamassassin/.spamassassin/bayes with 0 retries
> Jul  9 21:25:55 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: locker: safe_lock: link to /var/lib/spamassassin/.spamassassin/bayes.lock: link ok
> Jul  9 21:25:55 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: bayes: tie-ing to DB file R/W /var/lib/spamassassin/.spamassassin/bayes_toks
> Jul  9 21:25:55 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: bayes: tie-ing to DB file R/W /var/lib/spamassassin/.spamassassin/bayes_seen
> Jul  9 21:25:55 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: bayes: found bayes db version 3
> Jul  9 21:25:55 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: bayes: learned 'd963c4a7f11e91c3bd3317ea92408c2013c99dad@sa_generated', atime: 1499628354
> Jul  9 21:25:55 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: bayes: untie-ing
> Jul  9 21:25:55 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: bayes: files locked, now unlocking lock
> Jul  9 21:25:55 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: locker: safe_unlock: unlink /var/lib/spamassassin/.spamassassin/bayes.lock
> 
> 
> Makes me wonder if the "implements" messages can mean something (no
> "scan" operation?):
> 
> 
> Jul  9 21:25:21 mail.x.y.z /usr/sbin/amavisd-new[8850]: SA dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x6bc65b0) implements 'learner_new', priority 0
> Jul  9 21:25:21 mail.x.y.z /usr/sbin/amavisd-new[8850]: SA dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x6bc65b0) implements 'learner_is_scan_available', priority 0
> Jul  9 21:25:22 mail.x.y.z /usr/sbin/amavisd-new[8850]: SA dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x6bc65b0) implements 'learner_close', priority 0
> Jul  9 21:25:22 mail.x.y.z /usr/sbin/amavisd-new[8850]: SA dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x6bc65b0) implements 'prefork_init', priority 0
> Jul  9 21:25:22 mail.x.y.z /usr/sbin/amavisd-new[8868]: SA dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x6bc65b0) implements 'spamd_child_init', priority 0
> Jul  9 21:25:22 mail.x.y.z /usr/sbin/amavisd-new[8869]: SA dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x6bc65b0) implements 'spamd_child_init', priority 0
> Jul  9 21:25:55 mail.x.y.z /usr/sbin/amavisd-new[8868]: (08868-01) SA dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x6bc65b0) implements 'learn_message', priority 0
> 
> 
> But looking into the PluginHandler.pm, these messages possibly 
> point to some "unexpected" sub names. Perhaps the "check"
> sub is just "too common to be worth mentioning"...
> 
> 
> In /usr/share/perl5/Mail/SpamAssassin/Plugin/Bayes.pm, 
> in the check_bayes() subroutine, I have added a debug message,
> to see if that sub gets called at all:
> 
> sub check_bayes {
>   my ($self, $pms, $fulltext, $min, $max) = @_;
>   dbg("bayes: check_bayes() called");
> 
> And the result is... no it doesn't get called.
> The message doesn't get logged.
> Nor do I see messages from the scan() sub,
> which should report a score into the log,
> with $sa_debug = 1;
> 
> 
> Unfortunately, I don't have the ... grey matter to
> follow the "call stack" up towards Amavis, to see 
> exactly where the Bayes check gets avoided.
> Too many indirections for my lay brain :-)
> 
> Any help would be much appreciated.
> 
> Frank Rysanek
> 
> 
> 


   ---- File information -----------
     File:  stacktrace.txt
     Date:  10 Jul 2017, 14:31
     Size:  9847 bytes.
     Type:  Text

Attachment: stacktrace.txt
Description: Binary data

