Re: smtp time spam filtering

To: debian-user@lists.debian.org
Subject: Re: smtp time spam filtering
From: David Hart <debian@tonix.org>
Date: Wed, 28 Feb 2007 14:07:00 +0000
Message-id: <20070228140700.GD13371@yan.jynn.tonix.org>
Mail-followup-to: debian-user@lists.debian.org
In-reply-to: <20070226154140.GB23293@fantomas.sk>
References: <erk5ep$8p0$1$8302bc10@news.demon.co.uk> <1172158414.28933.27.camel@princess.gregfolkert.net> <20070222163808.GD4332@yan.jynn.tonix.org> <1172167054.25645.65.camel@princess.gregfolkert.net> <20070223153259.GB20882@yan.jynn.tonix.org> <20070223161647.GA16163@localhost.localdomain> <20070223191548.GA25122@yan.jynn.tonix.org> <20070224222415.GB14299@fantomas.sk> <20070226140443.GA26151@yan.jynn.tonix.org> <20070226154140.GB23293@fantomas.sk>

On Mon 2007-02-26 16:41:40 +0100 Matus UHLAR - fantomas wrote:
> > > On 23.02.07 19:15, David Hart wrote:
> > > > AFAIU no, but that's the way I do it with postfix.  Both my primary
> > > > and secondary MXs do RBL checks and stuff like recipient validation
> > > > and then make the accept/reject decision after the RCPT TO: but before
> > > > the DATA.
> > > > 
> > > > Greg Folkert said that he uses SA-Exim (which calls spamassassin)
> > > > to do scans at smtp time but without any online checks.  I don't see
> > > > how you can do this without receiving the bulk of the email.
> 
> > On Sat 2007-02-24 23:24:15 +0100 Matus UHLAR - fantomas wrote:
> > > the advantage of smtp time rejection is, you will just reject the data with
> > > error and you don't have to do anything with it - the rest is up to sender.
> 
> On 26.02.07 14:04, David Hart wrote:
> > Re-read the three paragraphs of mine that you quoted above (the first
> > of which you copied and pasted from an earlier email).
> 
> No, I didn't copy/paste anything - I only quoted the email as it came to me
[snip]

My mistake.  You didn't copy/paste from an earlier email but you did
move some attribution lines from their usual place.  Why not leave
them where (most?) people would expect to find them?

> (and hopefully removed all useless sections and kept only those useful).

Everything that I write is useful ;-) but, I would agree, not
necessarily _relevant_ to the point that you or others wish to make.

> > I am NOT asking about the advantages of smtp time rejection.
> > The second of my paragraphs above makes it quite clear that I do that
> > myself on my own MXs but that I do it BEFORE receiving the message DATA.
> 
> actually, there is a small difference between "receiving the message" and
> "receiving the message data", and maybe this is the reason why most people
> don't understand each other when talking about this issue.

I assumed in my previous mail that you know about the main parts in an
ordinary smtp conversation: HELO ... MAIL FROM: ... RCPT TO: ... DATA.
Am I wrong to assume this?

If you reject a mail after the RCPT TO: but before the DATA you have
only received the helo name and the message _envelope_.  At this
point you have not yet received even one message header.
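
To make the stages concrete, here is a minimal Python sketch of what
a receiving server has been told once each command completes.  The
function name known_after and the stage labels are invented for
illustration; this is not how Postfix or any other MTA models a
session internally.

```python
# Hypothetical illustration: what the receiving MTA has learned after
# each step of an ordinary SMTP conversation, in arrival order.
STAGES = ["CONNECT", "HELO", "MAIL FROM", "RCPT TO", "DATA"]

LEARNED_AT = {
    "CONNECT": "client IP address",
    "HELO": "helo name",
    "MAIL FROM": "envelope sender",
    "RCPT TO": "envelope recipient",
    "DATA": "message headers and body",
}


def known_after(stage):
    """Return everything the server has received once `stage` completes."""
    idx = STAGES.index(stage)
    return [LEARNED_AT[s] for s in STAGES[: idx + 1]]


if __name__ == "__main__":
    # Rejecting here means no header or body has crossed the wire yet.
    print(known_after("RCPT TO"))
```

Anything that needs the headers or body, such as a spamassassin
scan, can only run after the DATA stage has delivered them.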

My MTAs do RBL checks, recipient validation etc. and make the
accept/reject decision based on those checks _after_ the RCPT TO:
but _before_ the DATA.  _Please_ re-read the paragraph I wrote
that's quoted at the very top of this mail.  I thought it made this
point clear.

If the message passes those checks it will go through as normal and
should only be rejected if there is a serious problem such as a full
disk partition.

And lest there still be any room for misunderstanding, here's a copy
of an actual session from my secondary MX:

  Transcript of session follows.
    Out: 220 crest.tonix.org ESMTP Postfix
    In:  HELO 82-36-220-114.cable.ubr05.king.blueyonder.co.uk
    Out: 250 crest.tonix.org
    In:  MAIL FROM:<bbandesha@communicate.com>
    Out: 250 Ok
    In:  RCPT TO:<david@tonix.org>
    Out: 554 Service unavailable; Client host [82.36.220.114] blocked using
            list.dsbl.org; http://dsbl.org/listing?82.36.220.114
  Session aborted, reason: lost connection
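
For readers unfamiliar with how such an RBL check works, the sketch
below shows the usual DNSBL convention: reverse the octets of the
client's IPv4 address, append the list's zone, and look the name up
in DNS.  It needs only the client IP, which is why the check can run
before any DATA is sent.  The function names are made up for
illustration, the zone is simply the one shown in the transcript (it
may no longer be in service), and treating any successful lookup as
"listed" is a simplifying assumption.

```python
import ipaddress
import socket


def dnsbl_query_name(client_ip, zone):
    """Build the DNS name an RBL check looks up: reversed octets + zone."""
    # IPv4Address() validates the input and raises ValueError if it is bad.
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(client_ip, zone):
    """True if the zone answers with an address record (needs network)."""
    try:
        socket.gethostbyname(dnsbl_query_name(client_ip, zone))
        return True
    except socket.gaierror:
        # No such name: the client is not on this list.
        return False


if __name__ == "__main__":
    # Values taken from the transcript above.
    print(dnsbl_query_name("82.36.220.114", "list.dsbl.org"))
    # prints 114.220.36.82.list.dsbl.org
```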

FWIW, I could configure postfix to do the RBL checks between the HELO
and the MAIL FROM: but, particularly as I relay some mail on to other
users, I do it later so that I can log whose mail I'm rejecting.

And to answer the point that you made in your last paragraph: yes,
there _is_ a small but important difference between "receiving the
message data" and "receiving the message": a 250 response from the
receiving MTA, sent after the end of the DATA, which acknowledges
receipt.
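
That distinction can be sketched in a few lines of Python.  The
function name responsible_party and the "END OF DATA" label (the
reply to the final "." line) are invented for illustration; the reply
codes are the standard SMTP ones.  Only a 2xx reply to the end of the
data hands the message over to the receiver.  A rejection at any
earlier point, or after the data has been sent, leaves it with the
sender.

```python
def responsible_party(replies):
    """Given (command, reply code) pairs in session order, return who
    is left holding the message: "receiver" or "sender"."""
    for command, code in replies:
        # Only a 2xx reply to the end of the data acknowledges receipt.
        if command == "END OF DATA" and 200 <= code < 300:
            return "receiver"
    return "sender"


if __name__ == "__main__":
    # Rejected after RCPT TO, as in the session above: nothing accepted.
    early = [("HELO", 250), ("MAIL FROM", 250), ("RCPT TO", 554)]
    # Data transferred but refused: bandwidth used, still the sender's mail.
    late = early[:2] + [("RCPT TO", 250), ("DATA", 354), ("END OF DATA", 554)]
    # Data transferred and acknowledged: now the receiver's problem.
    accepted = late[:4] + [("END OF DATA", 250)]
    for session in (early, late, accepted):
        print(responsible_party(session))
```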

> > I AM asking how you can scan an email through spamassassin without
> > receiving the bulk of the email and how, when the scanning is turned
> > off, it leads to a quadrupling of bandwidth used.
> 
> I don't think that's quadrupling, but Yes, I agree this will cause MORE data
> sent across the net, and Yes, I agree that you must accept at least a part
> of the message data to verify if it's spam (the same about viruses).

To be clear, it wasn't me that made the assertion about the
quadrupling of bandwidth but I _was_ questioning the person who
made it.  That person hasn't seen fit to answer my questions so I'm
starting to lean to the conclusion that the figures were plucked from
the part of his anatomy that he sits on.

> > > Especially if you would bounce the e-mail, you'll win this way...
> > 
> > You should not bounce SPAM once you have accepted it for delivery.
> 
> And that is, why you win when you'll reject the spam instead of bouncing it.

Bouncing the spam after you've accepted it is not an option.
Essentially, the choice you have is either to reject at smtp time or
accept the mail and send it to /dev/null.  The difference is of no
practical significance and especially so when weighed against other
(better?) options for dealing with spam that have been mentioned
elsewhere in this thread.

> > SPAMMERS USE BOGUS RETURN ADDRESSES.  If you do, YOU become part of
> > the problem and the likely outcome is that either some innocent third
> > party finds it in her inbox (which may well have been flooded with
> > bounces from elsewhere) or your mail queue fills up with MAILER-DAEMON
> > messages that keep retrying until they time out several days later.
> > You may even end up bouncing the spam to yourself, but then, that IS
> > entirely your own problem.
> 
> You accused me of not reading your message, but I have the feeling now you
> didn't read mine...

I did _not_ accuse you of not reading my message.  I _did_ ask you
to _re-read_ three of my paragraphs because it seemed to me that you
had overlooked or misunderstood what I had written.

I'm sorry if you have taken offence at anything I've written but I
can assure you that none was intentionally directed at you.

-- 
David Hart <debian@tonix.org>
