
Re: SPAM filtering (was Re: SPAM fiiltering)



On Fri, 2002-11-01 at 12:19, Bob George wrote:
> Mark L. Kahnt wrote:
> 
> > [...]
> > SpamAssassin 2.43-1 - tried upgrading it tied to unstable and it says
> > that I'm at the newest version. Basically, it was clobbering anything in
> > html on a commercial mailing list, such as personalised horoscopes.
> > These usually have some promotion of features on the originating website
> > - enough to trigger spam filtering. Unfortunately, my attempts to rescue
> > these with whitelisted addresses have proven useless, as I'd said, because
> > in my experience, SpamAssassin doesn't always read or apply that list.
> 
> My situation may be similar to yours. I'm on about 40 commercial mailing 
> lists of various types, including yahoogroups which include all manner 
> of advertising, web bugs and other 'spammish' content.
> 
> It's worth noting that matching the whitelist only DECREMENTS the score 
> applied against a particular message. It doesn't automatically make the 
> message "good". There are various settings to further tweak whitelist 
> entries (see the manpage for Mail::SpamAssassin::Conf and the 
> whitelist_to, more_spam_to and all_spam_to entries). Depending on how 
> your mailing list works, this may be all the correction required. The 
> defaults are:
> 
> score USER_IN_WHITELIST_TO           -6
> score USER_IN_MORE_SPAM_TO           -20
> score USER_IN_ALL_SPAM_TO            -100
> 
> These only match on messages sent TO you, however. There's also a 
> "whitelist_from_rcvd" option that may be needed to match for mailing 
> lists if the To: line isn't meaningful.
> 
> It's also worth spending a few minutes debugging the rules spamassassin 
> applies to a particular list. If a mailing list changes sending address 
> frequently, or doesn't address message To: you, the whitelist entries 
> can easily be missed altogether. If you check and see that a particular 
> rule is catching most of the list mailings you want to see, you can 
> override the scoring for that rule. For example, I have a separate 
> account that I use for mailing lists and other "risky" exchanges. In 
> it, I've tweaked a few rules in ~/.spamassassin/user_prefs to avoid false 
> positives (or rather allow what would otherwise be spam) to this 
> account (defaults shown in parens):
> 
> score CTYPE_JUST_HTML 2.459 (0.407)
> score LOTS_OF_CC_LINE 3.060 (0.817)
> score SUSPICIOUS_RECIPS 5 (0.931)
> score VERY_SUSP_RECIPS 5.262 (0.326)
> score EGP_HTML_BANNER     -6.039 (-3.052)
> 
> (Rule names can be derived from the report.) And of course, if 
> everything seems to be JUST BARELY crossing the spam threshold (default 
> 5), you can tweak that as well:
> 
> required_hits n.nn
> 
> With these tweaks, and a couple of whitelist & blacklist entries, things 
> have been working very well for me with only minimal maintenance. I do 
> still file spam in a folder and review it before final deletion, as 
> there are still occasional "spams I want" coming from lists.
> 
> I've come to the conclusion that the high number of false positives I 
> was getting wasn't due to spamassassin, but rather to the fact that I 
> do like SOME spam. :)
> 
> Good luck with it!
> 
> - Bob
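
To make the quoted point concrete (a whitelist match only lowers the score, it does not exempt the message), here is a minimal Python sketch. The two "HYPOTHETICAL_" rule names and their scores are invented for illustration, and the "total >= required_hits" comparison is my assumption about how the threshold is applied; only the -6 whitelist_to default and the threshold of 5 come from the message above.

```python
# Sketch only: the threshold comparison and the two "HYPOTHETICAL_" rules are
# assumptions made for illustration, not SpamAssassin's actual code or rules.
SCORES = {
    "HYPOTHETICAL_HTML_RULE": 7.5,
    "HYPOTHETICAL_PROMO_RULE": 4.0,
    "USER_IN_WHITELIST_TO": -6.0,  # default quoted in the message above
}


def evaluate(hits, scores=SCORES, required_hits=5.0):
    """Return (total score, flagged?) for the rule names listed in `hits`."""
    total = sum(scores[name] for name in hits)
    return total, total >= required_hits


content_hits = ["HYPOTHETICAL_HTML_RULE", "HYPOTHETICAL_PROMO_RULE"]
without_whitelist = evaluate(content_hits)
with_whitelist = evaluate(content_hits + ["USER_IN_WHITELIST_TO"])

print(without_whitelist)  # (11.5, True)
print(with_whitelist)     # (5.5, True): the whitelist helped, but not enough
```

In this made-up case the message is still flagged after the whitelist match, which is why overriding an individual rule score or raising required_hits can be needed as well.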

I don't contend that SpamAssassin is a bad program - I see that it can
be refined and should be quite good when working, and I'm the first to
admit that the horoscopes and the like would smack of spamminess if
they hadn't been solicited.

That said, the problem is that the whitelist isn't always being honoured
- I have the charts of infractions in the flagged messages, and while
the "From" header is identical amongst them and the address was
whitelisted, the whitelist entry was not applied to all of them.

I'll work around it over the next few weeks - I have a few too many
things on the go at the moment to bury my nose in this while I'm trying
to develop an emergency reference manual for a suicide and distress
helpline (that is a tad higher priority than grumbling about a "Slutty
Suzy is Spread for Sex" email.)
-- 
Mark L. Kahnt, FLMI/M, ALHC, HIA, AIAA, ACS, MHP
ML Kahnt New Markets Consulting
Tel: (613) 531-8684 / (613) 539-0935
Email: kahnt@hosehead.dyndns.org
