
Re: More on spam

on Sun, Oct 19, 2003 at 11:37:05AM -0600, Paul E Condon (pecondon@peakpeak.com) wrote:

> I like this suggestion. I know I don't know a lot about what spam
> really is.  I sense from reading this thread that others also don't
> know a lot. Some do, but many don't. So research that results in firm
> numbers about the nature of the problem is clearly a good thing. 


> One addition to Karsten's questions/issues: 
> It has been claimed that one person's spam is another person's ham. To
> what extent is this actually true? 

In my experience, not much, though there's some disagreement around the
margins.  Given the loads of viral email (which I consider spam) this
summer, UCE spam itself is only a fraction of all the junk mail I
receive -- maybe half of it now, and less than 1/10 during peak Swen and
SoBig attacks.

Brad Templeton's definition of spam is useful, though not universally
accepted:

    I define E-mail abuse to be mail that meets all three of these criteria:

    1. It is unsolicited
    2. It is part of a "mass mailing." (bulk mail)
    3. The sender is a stranger to the recipient. (The recipient has
       never had wilful personal contact with the sender.) 

I would add the one additional condition:  it is undesired.
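
As a rough sketch, the three criteria plus the added "undesired"
condition can be read as a simple predicate.  The Message class and its
field names are made up for illustration; deciding each flag for a real
message is the hard part and is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical flags; each one has to be judged by the recipient.
    unsolicited: bool
    bulk: bool
    sender_is_stranger: bool
    undesired: bool


def is_abuse_templeton(msg: Message) -> bool:
    """All three of Templeton's criteria must hold."""
    return msg.unsolicited and msg.bulk and msg.sender_is_stranger


def is_spam_extended(msg: Message) -> bool:
    """Templeton's criteria plus the extra 'undesired' condition."""
    return is_abuse_templeton(msg) and msg.undesired
```

Under this reading, a bulk mailing from a stranger that the recipient
happens to want fails the last test and is not counted as spam.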

My own list of mail considered spam includes:

  - UCE
  - Autoresponse messages targeted on the basis of spoofed headers.
  - Viral mail
  - Mail received from sources after explicitly requesting no further
    mail.

> Or is this just obfuscation by the advocates of spam? If we had
> collections of ham and spam that have been accumulated by different
> users with different filter set ups, we could look for overlap and
> disjointness of sets.  Or just run one person's spam thru another
> person's filter. Lots of opportunities for useful statistical studies.
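
The overlap comparison suggested above could be sketched along these
lines, assuming each user's spam collection is reduced to a set of
message identifiers (Message-IDs or body hashes, say).  The function
name and the choice of identifier are assumptions for illustration only.

```python
def overlap_report(spam_a, spam_b):
    """Compare two users' spam collections, given as message identifiers."""
    a, b = set(spam_a), set(spam_b)
    union = a | b
    return {
        "both": len(a & b),      # flagged as spam by both users
        "only_a": len(a - b),    # flagged only by user A
        "only_b": len(b - a),    # flagged only by user B
        # Jaccard similarity: shared items over all items seen.
        "jaccard": len(a & b) / len(union) if union else 0.0,
    }
```

This only says something about disagreement if both users actually
received the same messages; identifiers seen by one user alone would
need to be excluded first.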

One of the advantages of personally trainable Bayesian classifiers is
that spam definitions become highly tailored to the recipient.  You
can "train" your classifier with a corpus of spam and ham mail.  It is
also practical to use such filters at the ISP level.
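
To make "training" concrete, here is a minimal naive Bayes sketch.  It
is not how any particular filter is implemented; the class name, the
tokenizer, and the add-one smoothing are all assumptions chosen to keep
the example short.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesFilter:
    """Toy per-recipient classifier (illustrative, not a real filter)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.msg_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one message to the recipient's own 'spam' or 'ham' corpus."""
        self.word_counts[label].update(tokenize(text))
        self.msg_counts[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | text) from the trained word counts."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_msgs = sum(self.msg_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Class prior, with add-one smoothing.
            log_p = math.log((self.msg_counts[label] + 1) / (total_msgs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing so unseen words never give log(0).
                log_p += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = log_p
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

Because each recipient feeds in their own spam and ham, two people
training the same code on different corpora end up with different
verdicts for the same message.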

> But, a question: To what extent is it possible to trace a spam message
> back to its human originator? Is the 'envelope from' really reliable?
> What sort of data can/should be used to convict a 'perp'?

There's too much reliance IMO on technical measures here.  It's similar
to the fixation on clickthrough rates in online advertising (and shares
its weaknesses).  Just because there is an available technical method
doesn't mean it's useful or even the best solution.

The one relationship which _can_ be established is the originating IP.
Consistently spamming IPs can and likely should be blocked at individual
and aggregate levels.  ISPs should take action to isolate such boxes.
Which is all well and good.
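
A minimal sketch of the "consistently spamming" test, assuming you
already have the originating IP of every message you classified as
spam.  The function name and the threshold of three messages are
arbitrary choices for illustration.

```python
from collections import Counter


def ips_to_block(spam_source_ips, min_messages=3):
    """Return the IPs that sent at least min_messages spam messages."""
    counts = Counter(spam_source_ips)
    return {ip for ip, n in counts.items() if n >= min_messages}
```

For example, ips_to_block(["192.0.2.1"] * 3 + ["198.51.100.7"]) yields
only 192.0.2.1; the single message from the other address is not enough
to call it a consistent source.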

If you want to trace spam to products, the best strategy would be
"sting" accounts established to receive spam, purchase products or
otherwise respond, and then prosecute the vendor(s) who are benefitting
from spam, as well as the advertising services they use.

Most of which is OT for d-u.


Karsten M. Self <kmself@ix.netcom.com>        http://kmself.home.netcom.com/
 What Part of "Gestalt" don't you understand?
   GNU/Linux web browsing mini review:  Galeon.  Kicks ass.
