
Re: Automated spam reporting?



on Sun, Apr 21, 2002, ben (benfoley@rcn.com) wrote:
> On Sunday 21 April 2002 03:16 am, Paul 'Baloo' Johnson wrote:
> > Are there any debianified tools to look up proper abuse contacts and
> > fire off spam reports automagically a-la spamcop?
> 
> i don't know of any debian specific tool for the job, but it sounds like 
> spamassassin might be the cure you're in need of. it used to be available, 
> allegedly, as a debian package but whenever i've tried to get it, i get 
> package unavailable messages. i haven't ascertained why. karsten is the best 
> resource for whippin' the spam vendors--next after dman, at least. hold out 
> for their response on spam vengeance methods.

With a few very strong cautionary notes, ricochet seems to be a useful
tool.  It's what I use.  I've only pissed off half my best friends with
it so far (only half kidding).

While ricochet doesn't _identify_ spam, it does a reasonably good job of
tattling about it in conjunction with other tools.  I use spamassassin
and a triggering level of 10 to autoreport, and manually report anything
at lower levels with a mutt hotkey (see the Ricochet README file).
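
A rough sketch of that auto/manual split, in Python rather than
ricochet's Perl.  It assumes the score is read from SpamAssassin's
X-Spam-Status header ("hits=" in older releases, "score=" in newer
ones); the function names and the "manual-review" label are made up for
illustration and are not part of ricochet or spamassassin.

```python
import re

AUTO_REPORT_SCORE = 10.0  # the triggering level of 10 mentioned above

# Accept both spellings SpamAssassin has used for the score field.
SCORE_RE = re.compile(r"\b(?:hits|score)=(-?\d+(?:\.\d+)?)")


def spam_score(status_header):
    """Return the numeric score from an X-Spam-Status value, or None."""
    match = SCORE_RE.search(status_header)
    return float(match.group(1)) if match else None


def disposition(status_header, threshold=AUTO_REPORT_SCORE):
    """Decide what to do with a message based on its spam score."""
    score = spam_score(status_header)
    if score is None:
        return "unscored"        # no score found; never auto-report
    if score >= threshold:
        return "autoreport"      # confident enough to hand to a reporter
    return "manual-review"       # left for a human (e.g. a mutt hotkey)
```

For example, disposition("Yes, hits=12.3 required=5.0") gives
"autoreport", while a score of 6.1 is left for manual review.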

The scoop:

  - It analyzes headers, and generates reports based on the hosts
    reported as relays and originators.  This means you *will* send
    reports to addresses associated with spoofed headers.  Responses to
    this can vary from grateful to robotic to "I'm going to file a
    police report on you if you send me any more mail" and repeated
    calls to several other office phones [1].  Owners of mailing lists
    you're associated with are likely to be less than amused at spam
    reports sent their way.

  - It has several sources to query for abuse reporting information,
    including a local "abuse-contacts" file, whois queries (which are
    cached locally), and abuse.net lookups.  The abuse-contacts file
    lets you adjust where reports are sent for specified domains.

  - It has a "skip-list".  Curiously enough it's a list of domains to
    which reports won't be sent.  Useful for keeping yourself from
    spamming sources which will likely touch spam mail, but aren't
    directly responsible for originating it.
    
        ***   USE THIS LIBERALLY ***   

    You absolutely want to put in this file:

      - Your ISP.
      - *** ALL *** mailing lists you're on.
      - (Possibly) your employer.
      - (Possibly) major clients / accounts.
      - (Possibly) friends who're apt to send spammish mail, joke lists,
        etc.

    I can't overstate the importance of adding mailing lists to the skip
    list.  Few lists can accommodate filtering spam.  List admins would
    be overwhelmed by responses from even a small fraction of list
    members to the spam that does come through.  Most will respond by
    dropping subscribers who do this (or threatening same).  Even a
    brief lapse or a small number of slips can be _really_ annoying.  So
    get this right, and keep the skip-list updated.

  - There's an options file.  My own settings are as follows:

	AC: 1
	NOSEC: 1
	ABUSENET: 1
	DONT_SEND: 0
	DEBUG_ON: 1
	GUESS: 1
	INTERACTIVE: 0
	BACKGROUND: 1

    ...which is to say -- consult the abuse-contacts list, *don't* do
    secondary lookups, query abuse.net, don't don't send (i.e.:  do
    send [2]), and turn on debug logging.  The debug log doesn't
    timestamp its entries, so I manually added one to ricochet, at
    lines 602-604:

       # KMSelf Mon Apr  1 11:50:22 PST 2002:  Add a timestamp
       $now = localtime;
       $self->debug (1, $now);    

    The remaining settings have ricochet guess at response addresses
    (abuse@, postmaster@), not run interactively, and background the
    process (makes it somewhat less painful when run from a mailer).
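
To make the double negative in those settings concrete, here is a small
Python sketch that reads "KEY: value" lines like the ones listed above.
The file format is assumed from that listing only, and parse_options and
will_send are hypothetical helpers, not anything ricochet provides.

```python
def parse_options(text):
    """Parse 'KEY: 0|1' lines into a dict of booleans."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue                      # not a KEY: value line
        options[key.strip()] = value.strip() == "1"
    return options


def will_send(options):
    """Translate the negated DONT_SEND flag into a positive answer.

    A missing key is treated as "do not send"; that cautious default is
    a choice of this sketch, not documented ricochet behaviour.
    """
    return not options.get("DONT_SEND", True)
```

With the settings shown above, DONT_SEND is 0, so will_send() comes out
true and reports really do go out.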




Disadvantages
-------------

There are several.  

First, ricochet is pretty naive.  It would be very helpful to specify
that _only_ non-spoofed headers be responded to.  This is difficult to
do, but a good first approximation would seem to be using the IP of the
first mailhost not listed in the 'skip-list' file.  This would be a host
which (A) isn't generally known or trusted, and (B) has its IP being
reported by a host which likely is -- so this information is likely to
be valid.  While not necessarily the source of the problem, it is
certainly contributing.  And if it's not your ISP's mailserver or a
mailing list server, it's got no business relaying mail to you.
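
A minimal Python sketch of that first approximation, which ricochet
itself does not do.  It walks the Received headers from the newest
(topmost) down and stops at the first "from" host outside the skip
list.  The regular expressions assume the common
"from NAME (... [IP]) by ..." layout, and both function names are
invented for illustration.  One known weakness: it matches on the name
as recorded, so a host that lies about its name with a skip-listed
domain would slip past.

```python
import re
from email import message_from_string

FROM_RE = re.compile(r"from\s+(\S+)", re.IGNORECASE)
IP_RE = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def in_skip_list(host, skip_domains):
    """True if host equals, or is a subdomain of, a skip-listed domain."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in skip_domains)


def first_untrusted_relay(raw_message, skip_domains):
    """Return (host, ip) for the first relay not in the skip list.

    ip is None when no bracketed address was recorded; the result is
    None when every relay is skip-listed or no header could be parsed.
    """
    msg = message_from_string(raw_message)
    for received in msg.get_all("Received", []):
        # Only look at the "from ..." clause, written by the receiver.
        from_part = re.split(r"\sby\s", received.strip(), maxsplit=1)[0]
        host = FROM_RE.match(from_part)
        if not host:
            continue
        name = host.group(1)
        if in_skip_list(name, skip_domains):
            continue                      # trusted hop, keep walking
        ip = IP_RE.search(from_part)
        return name, (ip.group(1) if ip else None)
    return None
```

Everything below the returned hop was written by hosts nobody vouches
for, so under this heuristic it would simply be ignored.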

I'd also like to see some level of memory incorporated into tools like
this, preferably at the netblock level.  This is information that's hard
to get given heterogeneity of WHOIS records, but could be useful in
identifying netblocks with notable spam problems.  Also identifying
the level of traffic from netblocks that is or isn't spam would help
guard against blocking a netblock simply because it's so large that it
will be responsible for an appreciable amount of spam.  One could argue
various ways on this, but it's simply impractical to blackhole *any*
arbitrary domain for *any* spam.
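
As a sketch of what such memory might look like (Python, purely
illustrative): tally spam and non-spam per netblock so that a large
block is judged on its ratio rather than its raw spam count.  The fixed
/24 grouping is an assumption made to keep the example short; real
allocation boundaries would have to come from WHOIS, with all the
trouble noted above.  NetblockMemory is a made-up name.

```python
import ipaddress
from collections import defaultdict


class NetblockMemory:
    """Count spam and non-spam messages per netblock."""

    def __init__(self, prefix_len=24):
        self.prefix_len = prefix_len      # assumed block size
        self.counts = defaultdict(lambda: {"spam": 0, "ham": 0})

    def netblock(self, ip):
        """Map an address to its enclosing network."""
        return ipaddress.ip_network(f"{ip}/{self.prefix_len}", strict=False)

    def record(self, ip, is_spam):
        """Remember one message seen from this address."""
        self.counts[self.netblock(ip)]["spam" if is_spam else "ham"] += 1

    def spam_ratio(self, ip):
        """Fraction of mail from this address's netblock that was spam.

        Returns None when nothing has been seen from the netblock.
        """
        seen = self.counts.get(self.netblock(ip))
        if seen is None:
            return None
        return seen["spam"] / (seen["spam"] + seen["ham"])
```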

The third point is that there are some decidedly unresponsive domains
out there.  Some means of tying together:

  - Spam detection -- filtering messages detected as spam.
  - Spam reporting -- sending notifications to responsible networks.
  - Spam response -- disposition of complaints from a given netblock.

...would be most cool.  A response might or might not be acceptable.
For sites which are consistently nonresponsive and high-volume spam
sources, a way of building data into server-level blocking rules would
be very useful.  This means capturing the spam, reporting it, keeping
track of responses (which may or may not include any of the report's
subject or body), and tracking additional mail from the same source.
From this, a set of rules applicable to a given MTA (or procmail rules)
is formulated.  An integrated system, nontrivial.  But the component
pieces are there.
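
The last step of such a system might look roughly like this Python
sketch: given per-netblock bookkeeping, pick out the blocks that have
been reported repeatedly, never answered, and kept sending spam anyway.
The record fields and both thresholds are arbitrary placeholders, and
the output is just a list of netblocks; turning that into rules for a
particular MTA or for procmail is left out.

```python
from dataclasses import dataclass


@dataclass
class NetblockRecord:
    reports_sent: int = 0        # abuse reports sent to this netblock
    responses: int = 0           # replies received, of any kind
    spam_after_report: int = 0   # spam seen after the first report


def block_candidates(records, min_reports=5, min_repeat_spam=3):
    """Return netblocks that are both unresponsive and persistent.

    records maps a netblock string to its NetblockRecord.
    """
    candidates = []
    for netblock, rec in records.items():
        unresponsive = rec.reports_sent >= min_reports and rec.responses == 0
        persistent = rec.spam_after_report >= min_repeat_spam
        if unresponsive and persistent:
            candidates.append(netblock)
    return sorted(candidates)
```

Requiring both conditions is deliberate: a netblock that answers
complaints, or one that went quiet after being reported, stays off the
list.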

Peace.

----------------------------------------
Notes:

1.  Ghod's truth.

2.  Memo to utility writers.  Don't use negation options.  Specify
    positives (either "send" or "suppress" might be a better option
    here).  The "don't not send" syntax is unwieldy at best.

-- 
Karsten M. Self <kmself@ix.netcom.com>        http://kmself.home.netcom.com/
 What Part of "Gestalt" don't you understand?
   Reading is a right, not a feature
     -- Kathryn Myronuk                           http://www.freesklyarov.org
