
Re: More on spam

on Fri, Oct 17, 2003 at 02:23:39PM +0100, Colin Watson (cjwatson@debian.org) wrote:
> On Fri, Oct 17, 2003 at 05:56:26AM -0700, Paul Johnson wrote:
> > On Fri, Oct 17, 2003 at 05:36:40AM -0700, Tom wrote:
> > > What does this have to do with spam?  It bemuses and befuddles me to
> > > observe extremely intelligent people swatting the air with tools
> > > like spamassassin, when the correct solution lies elsewhere.  The
> > > correct solution is to merely enlighten all of humanity not to send
> > > spam.
> > 
> > Spamassassin is one of many tools to do this.  Simply using
> > spamassassin to delete your email is not going to get the job done.
> > You have to follow through with other means to get the spammer's
> > webhosts and email providers involved to cut them off.
> If only this were practical for the volume of spam bugs.debian.org gets
> (2Gb caught by spamassassin in the last two weeks). We just don't have
> the manpower even to make a dent here. :-(

My response then would be that throwing manpower at the problem is the
wrong thing for a number of reasons:

  - Debian is a volunteer project.  Manpower is always in short supply,
    and throwing it at this pulls it from other tasks.

  - Responding to spam isn't particularly fruitful.  The effort doesn't
    scale: each response deals with one message and gains nothing
    against the next.

  - There are other ways to rein in the problem and/or raise costs.

I'd recommend the following approaches:

  - Keep stacking on the filters.  Automated measures do seem to work,
    and can be leveraged -- *everyone* has a spam problem.

  - Run and keep stats on spam.  There are several dimensions which are
    interesting, among them:

    - Relative amounts of spam vs. ham.

    - Origins by nation.

    - Origins by network.

    - Origins by service classification (fixed IP, dynamic, DUL).

    - SA (or other classifier) scores on ham.

    - SA (or other classifier) scores on spam.

    - Top originating IPs for ham.

    - Top originating IPs for spam.

    - Frequency of occurrence trends for ham.  For a given mailserver,
      how many messages are received, say, weekly, classed as ham.  I
      expect that a reasonably small number of servers will originate a
      large amount of mail, and a larger number will originate a smaller
      amount.

    - Frequency of occurrence trends for spam.  For a given mailserver,
      how many messages are received, say, weekly, classed as spam.  I
      expect that a small number (smaller than the first group above)
      will originate a moderate amount of spam, and that most spam will
      originate from previously unknown servers.

    - Spam/Ham mix by server.  I suspect you can pretty much classify
      hosts as hammy or spammy, with some being moderately grey.

  - I see the future of spam management in making mailservers much more
    intelligent about the hosts they receive mail from.
    Typically good hosts will get preferential treatment.  Bad hosts
    will be dropped.  Previously unknown hosts will get serviced but
    only after some razzing.  Advertising different MX hosts to known
    and unknown query origins, and hosting these on different nets with
    different service levels is also likely (this is a modification of
    Brad Templeton's current "best plan" for spam).  The net result is
    that good transmitting MTAs get priority access, bad MTAs don't
    steal resources, and are themselves forced to pay through time or
    other resource costs to send mail -- but all in a way that's
    compatible with current SMTP protocols.
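
To make the stats-keeping and host-classification ideas above a bit more
concrete, here's a rough sketch in Python.  Everything specific in it is
my assumption for illustration: the input format (pairs of originating
host and classifier verdict), the function names, the thresholds, and the
treatment given to grey hosts.  It isn't anything bugs.debian.org
actually runs.

```python
from collections import Counter, defaultdict

# Assumed input: (originating_host, verdict) pairs, where verdict is
# "ham" or "spam" as decided by SpamAssassin or another classifier.


def tally(records):
    """Count ham and spam messages per originating host."""
    counts = defaultdict(Counter)
    for host, verdict in records:
        counts[host][verdict] += 1
    return counts


def spam_share(counts):
    """Overall fraction of mail classed as spam (0.0 if there is no mail)."""
    spam = sum(c["spam"] for c in counts.values())
    total = sum(c["ham"] + c["spam"] for c in counts.values())
    return spam / total if total else 0.0


def classify(counts, host, min_msgs=5, hammy_below=0.1, spammy_above=0.9):
    """Label a host 'unknown', 'hammy', 'spammy', or 'grey'.

    The thresholds are illustrative guesses, not measured values.
    """
    c = counts.get(host)
    total = c["ham"] + c["spam"] if c else 0
    if total < min_msgs:
        return "unknown"  # too little history to judge
    ratio = c["spam"] / total
    if ratio <= hammy_below:
        return "hammy"
    if ratio >= spammy_above:
        return "spammy"
    return "grey"


# Treatment per class.  The first three follow the text above; the grey
# case is an assumption, since the text doesn't say what to do with it.
TREATMENT = {
    "hammy": "accept with priority",
    "spammy": "drop",
    "unknown": "delay, then accept",
    "grey": "accept, then filter",
}


def treatment(counts, host):
    """Pick the handling for a connecting host from its mail history."""
    return TREATMENT[classify(counts, host)]
```

Run weekly over the mail logs, the same tallies would also give the
top-originators and trend figures listed above.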


Karsten M. Self <kmself@ix.netcom.com>        http://kmself.home.netcom.com/
 What Part of "Gestalt" don't you understand?
    Bush/Cheney '04: Leave no billionaire behind
