Re: IMPORTANT: your message to html-tidy

On Wed, Sep 10, 2003 at 12:15:38AM -0700, Steve Lamb wrote:
> On Wed, 10 Sep 2003 15:46:28 +1000
> Craig Sanders <cas@taz.net.au> wrote:
> > my system rejected them as spam, so they were spam (or so likely to be spam
> > as makes no difference).
>     There is a difference between it being spam because of it coming from an
> IP block and it actually being spam.  

you mean crap like this:

Sep  7 07:14:55 taz postfix/smtpd[27064]: BD8DD14067D: reject: RCPT from unknown[]: 550 Service unavailable; Client host [] blocked using cn-kr.blackholes.us; Korea blocked by cn-kr.blackholes.us; from=<darke@twilight.vtc.vsc.edu> to=<rch@taz.net.au> proto=SMTP helo=<compuserve.com>
Sep  7 07:15:12 taz postfix/smtpd[27064]: AB84214067D: reject: RCPT from unknown[]: 550 Service unavailable; Client host [] blocked using cn-kr.blackholes.us; Korea blocked by cn-kr.blackholes.us; from=<rantapaa@lafcol.lafayette.edu> to=<rch@taz.net.au> proto=SMTP helo=<compuserve.com>
Sep  7 07:15:27 taz postfix/smtpd[27064]: 2AD5014067D: reject: RCPT from c213-89-7-109.cm-upc.chello.se[]: 550 Service unavailable; Client host [] blocked using list.dsbl.org; http://dsbl.org/listing?ip=; from=<cflatter@research.canon.oz.au> to=<rch@taz.net.au> proto=SMTP helo=<compuserve.com>
Sep  7 07:16:01 taz postfix/smtpd[27064]: 75A26140680: reject: RCPT from unknown[]: 550 Service unavailable; Client host [] blocked using list.dsbl.org; http://dsbl.org/listing?ip=; from=<cypher@fil.org> to=<rch@taz.net.au> proto=SMTP helo=<compuserve.com>

and many thousands more?

i know for a fact that these are spam.  i know because the sender address is
forged, and i know because the HELO hostname is forged.  i also know because
rch@taz.net.au does not exist and never has existed.

if these spams weren't caught by my use of cn-kr.blackholes.us and other RBLs,
they would be caught by my other anti-spam rules.  most spams are, that's why
so few make it through to be tagged by SA.  i use multiple rules because that
has the greatest chance of catching the greatest amount of spam.

> Or are you saying that it is inconceivable that someone in Taiwan using
> Debian might want to contact you?

if someone really needs to contact me, they could send it from another IP
address (e.g. a webmail service) or send it to postmaster@ or abuse@ - which
are both in my spamlovers map which overrides all anti-spam rules (except for
body & header checks)
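
for illustration only: a toy python model of why a recipient map checked early
can override later rules like RBL lookups.  it assumes the map is consulted
before the RBL checks and that the first check to return a decision wins.  the
names (SPAMLOVERS, rcpt_decision, etc.) are made up for the example and are not
taken from any real config.

```python
# toy model of ordered SMTP restrictions: the first check that returns a
# decision wins, and later checks are never consulted.
# SPAMLOVERS and all function names are hypothetical, for illustration only.

SPAMLOVERS = {"postmaster", "abuse"}  # local parts exempt from later checks


def spamlover_check(rcpt, client_listed):
    """Return "OK" for exempt recipients, None to defer to the next check."""
    local_part = rcpt.split("@", 1)[0].lower()
    return "OK" if local_part in SPAMLOVERS else None


def rbl_check(rcpt, client_listed):
    """Return "REJECT" if the client IP is on a blocklist, else None."""
    return "REJECT" if client_listed else None


def rcpt_decision(rcpt, client_listed):
    """Run the checks in order and return the first decision reached."""
    for check in (spamlover_check, rbl_check):
        result = check(rcpt, client_listed)
        if result is not None:
            return result
    return "OK"  # no check objected
```

so rcpt_decision("postmaster@taz.net.au", True) is accepted even though the
client is listed, while the same client sending to any other address is
rejected.  body & header checks aren't modelled here: in postfix they run
later, on the message content, so a recipient-level exemption doesn't skip
them.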

> > i choose not to, because there is a fairly high risk of SMTP session
> > timeouts when the system is under heavy load, resulting in a) repeat
> > attempts to deliver the same mail to my server (wasting more bandwidth and
> > CPU power to scan it), and b) the small possibility of an undesirable
> > feedback loop of ever-increasing loadavg.
>     Given the stats here of people reporting the time per second of SA
> scanning at SMTP time and your reported load I can assure you that the chances
> of you getting SMTP timeouts because of SA would be remote at best.  You're
> not AOL, you're not Earthlink/Mindspring.  You've got, what, 40 users total?

sorry, a system that only works sometimes (or even most of the time) is a
broken system.

i prefer to know that my system's behaviour will be consistent and correct.

> > 382 tagged in a week, about half of which go to my SPAM.incoming folder (the
> > rest go to other users, so don't concern me directly).  most are nigerian
> > 419 scams and can be ignored, a handful have extra domains/ip
> > addresses/phrases to add to my lists.
> http://lists.debian.org/debian-user/2003/debian-user-200308/msg00154.html
>     4 a week here.  Granted my volume seems to be lower than yours so lets
> scale up.  You get 25k/week, I get 2324 a week.  Round to 2300 for easier
> math.  2300/25k = 9.2%  So, 4 * 10 (round up since its easier and adds, not
> subtracts, to my total) = 40. At your mail load I'd get 40/week delivered,
> tagged as spam by SA. The spam that gets through undetected completely is a
> magnitude less. 4/month at your load would be about right. Of course that is
> for my personal account.  I did say it was 22/week total for all account. 

and there's the catch.  like you, i have multiple accounts and multiple aliases
pointing at my accounts.  postmaster@taz.net.au for instance, which (in order
to be RFC-compliant) isn't subject to most of my postfix access rules (although
it still gets processed by spamassassin).

> Even so, 220 is 160 less than what you're pawing through.  

no, it's more.  of the 382 that made it through last week, about half ended up
in one of my mailboxes.  the rest were my users' problem.  half of 382 is 191.

> Mind you that 220 figure also comes from a presumed linear progression of
> spam at the current scoring levels.  I do not believe that to be the case.
> If I were getting more spam I believe SA would reject a larger percentage
> outright.  But hey, that would help my case so I'll be nice and let that
> slide.

if you're talking percentages, then i know what percentage of spam i reject
(like you, i'm not counting false-negatives, there's not enough to count):
98.5% last week.  the figure varies from week to week, but 98.5% is not at all
unusual.

> So what miraculous things am I doing.  SA at SMTP, reject anything over 8,
> tag and deliver anything between 5-8, autolearn over 12 and just to be nasty,
> teergrube anything over 15.  Back then I had exactly 0 custom SA rules.
> Today I have 6 rules which I could consolidate down to 1 if I felt like
> making a more complex regex; systems accounts for the machine I secondary
> for, wonderful honeypot for me.  Once a day I fire up a tool I wrote to help
> me sort through the narrow range thats let through and I hit one of 3
> buttons:

i'm glad that your system works for you.

similarly, i'm glad that my system works for me.

we both have systems we're happy with - now, isn't that nice?

> > not exactly what i'd call an excessive or obsessive work-load.
> > it's not even enough of a workload to bother using procmail rules to drop
> > extremely high scoring spams into /dev/null
>     Really.  How many postfix rules do you have that you hand crafted?  

i have thousands of custom rules, however "hand-crafted" isn't exactly the
right word.  "scripted" is more accurate.

adding a new domain is as simple as

cd /etc/postfix
./add-dom -b spamdom1.com spamdom2.com spamdom3.com .... spamdomN.com

then i cut and paste (and possibly change - like i said, i'm far less tolerant
on my home server than i am at work) the same ./add-dom command into my main
mail server at work and run make....and have the SA rules, access maps, etc
automatically generated and copied to all the mail servers with scp.

the domains themselves i find by pressing ^B in mutt.  that pipes the message
through a trivial Q&D hack url-decode script and then into urlview.

---cut here---
#! /usr/bin/perl

use strict;
use warnings;

use URI::Escape;
use MIME::QuotedPrint;
use HTML::Entities;

# works from the command line or as a filter
my $str = join(" ", @ARGV);
if ($str eq "") {
    # no arguments given: read the whole message from stdin
    while (<>) { $str .= $_ }
}

# undo %XX escapes, then quoted-printable, then HTML entities
my $decoded = uri_unescape($str);
$decoded = decode_qp($decoded);
$decoded = decode_entities($decoded);

print "$decoded\n";
---cut here---

i do this more because it gives me a sense of satisfaction to explicitly block
the bastards than because it's strictly necessary.  it does tend to increase
the SA score a lot, though, which is very satisfying to see.

where it is extremely useful is that my /etc/postfix/add-dom script adds the
domain to both the SA rules AND to postfix access maps....so if a spammer is
stupid enough to use one of their domains in the smtp envelope (many of them
are, even today), the message is rejected outright.
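
for illustration only: a minimal python sketch of the idea, i.e. generating
both a postfix access map entry and a spamassassin uri rule from the same list
of domains.  this is not the add-dom script itself (its contents aren't shown
here).  the rule naming scheme, the default score and the function names are
all assumptions made up for the example.

```python
import re


def rule_name(domain):
    """Hypothetical naming scheme: LOCAL_URI_ plus the upper-cased domain,
    with every character that is not A-Z or 0-9 replaced by "_"."""
    return "LOCAL_URI_" + re.sub(r"[^A-Z0-9]", "_", domain.upper())


def access_map_lines(domains):
    """One "pattern action" line per domain, as used in a postfix access map."""
    return [f"{d.lower()} REJECT" for d in domains]


def sa_rules(domains, score=3.0):
    """A spamassassin "uri" rule plus a "score" line for each domain.
    The score of 3.0 is an arbitrary placeholder, not a recommendation."""
    lines = []
    for d in domains:
        d = d.lower()
        name = rule_name(d)
        pattern = d.replace(".", r"\.")  # dots must be literal in the regex
        lines.append(f"uri {name} /\\b{pattern}\\b/i")
        lines.append(f"score {name} {score}")
    return lines
```

access_map_lines(["spamdom1.com"]) gives the line "spamdom1.com REJECT", and
sa_rules(["spamdom1.com"]) gives a uri rule matching spamdom1.com followed by
its score line.  a real setup would still need to write these to the right
files, rebuild the map (e.g. with postmap for hash: maps) and reload things.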

>     22 of 2324/week.  .009% by my math.  That's 1.41% less than you for a lot
> less work.  Learn to use your tools, man.

i think i know how to use my tools a lot better than you do.

now run off and play.  when you've worked on real mail servers under real
loads, you'll be qualified to comment on my methods.


