Re: Anti-Spam ideas for usenet/list harvested email addresses
(my reply is a bit disjointed, since I put things inline, and jumped
around while crafting my response...sorry for the nonlinear thinking
pattern)
Jacob Anawalt wrote:
To me the big question is how do I avoid the spam in the first place,
besides avoiding email altogether? I want to participate on the web, I
just don't want so much junk email nor do I want to have my mailbox or ISP
suffering from gigabytes of worm attachments or advertising data.
Your ISP should be filtering worms. It's fairly easy to do. If they
don't want to bother with setting up a virus filter, hard drive space is
fairly cheap. In addition, it would be nice if more ISPs filtered
outgoing email as well. That's not always practical, and it won't stop
the latest worms which sprechen SMTP, but it could help.
We've all done or seen people do this: jacob at cachevalley dot com,
jacob.nospam@cachevalley.com, jacob@cachevalley.nospam.com, etc.
Are we kidding ourselves thinking that if we can write a filter rule that
just catches SoBig.[A-Z], that someone else can't turn all of those 'safe'
addresses back into the real email address?
Spammers don't really care either way... look to the dictionary attack
type of spammers for an example ("well, I've seen a
jacob@some.company.com, so let's try jacob@cachevalley.com as well").
That a "safe" email address can be turned back into a real one isn't a
big deal; the obfuscation just protects against the "dumb" harvesters. It's like
using The Club on the steering wheel of your car... it won't defeat an
experienced car thief, but it may convince him to skip your vehicle.
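
To show how little work a non-"dumb" harvester needs, here is a minimal
Python sketch that reverses the three spellings quoted above. The
function name and the exact patterns are illustrative assumptions, not
taken from any real harvester.

```python
import re

def deobfuscate(text):
    """Undo a few common 'safe' spellings of an email address."""
    s = text.strip().lower()
    # "jacob at cachevalley dot com" -> "jacob@cachevalley.com"
    s = re.sub(r"\s+at\s+", "@", s)
    s = re.sub(r"\s+dot\s+", ".", s)
    # Drop a ".nospam" label on either side of the "@"
    s = re.sub(r"\.nospam(?=[.@]|$)", "", s)
    return s

for spelling in ("jacob at cachevalley dot com",
                 "jacob.nospam@cachevalley.com",
                 "jacob@cachevalley.nospam.com"):
    print(deobfuscate(spelling))
```

Three regular expressions cover all three variants, which is roughly
the effort The Club costs an experienced thief.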
In the case of a mailing list, I fail to see any advantage in the
obfuscation of your email address, since it's present in the header. The
exception would be private versus post-only addresses, as you mention below.
I've already mentioned the web authorization idea and the rotate your
email address on some schedule ideas in another thread. I've even seen a
web site go so far as to use a .js file function to put together the email
address from a bunch of fragments when you click the mailto link. That
would take more work to parse, but it is still possible by having an email
grabbing webbot that can run javascript.
So it's just a slightly more complicated version of "jacob at cache
valley dot com".
What's more fun is making a website that creates endless lists of bogus
email addresses (like http://www.all-yours.net/scripts/killspam.htm) to
get address harvesters to puke.
Another thought I've had on the mailing list issues (besides wondering why
I'm trying to make mail act like a news client with threads and looking
for a 'watch thread' capable client) is if I had an email address to use
on mailing lists that only accepted email from the list servers I was on
and reject all others I should only get the spam that relayed through the
list.
The mail server would need to have access to my personal list of
acceptable email addresses so it could give a 550 with the appropriate
extended SMTP code for unauthorized/security and an appropriate error
message after the HELO and MAIL FROM and RCPT TO: have been given. It
should only do this for mail accounts that have entries in the safe list.
If your list is empty, all email is valid. If you have one or more
entries, only those ones can send you email.
So in practice, the idea would work something like the following?
1) Create a "Debian-user only" address, which you'd use for posting to
debian-user.
2) Email to the debian-user only address must come from the debian
mailing list, or I'm going to SMTP-reject it, since it's probably from a
spammer.
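
The safe-list decision described above could be sketched roughly like
this in Python. The function name, the dictionary layout and the example
address are assumptions for illustration; a real MTA would make this
check in its own access-control layer at RCPT TO time.

```python
def check_rcpt(rcpt, client_host, safe_lists):
    """Return (SMTP code, enhanced status code, text) for one RCPT TO.

    safe_lists maps a recipient address to the set of host names that
    may send to it. A missing or empty entry means "accept everyone".
    """
    allowed = safe_lists.get(rcpt.lower(), set())
    if not allowed:
        # No entries for this recipient: all email is valid
        return (250, "2.1.5", "Ok")
    if client_host.lower() in allowed:
        return (250, "2.1.5", "Ok")
    # 5.7.1: delivery not authorized, message refused
    return (550, "5.7.1", "This address only accepts authorized list traffic")

# Hypothetical per-user safe list
safe_lists = {"debian-user-jacob@cachevalley.com": {"murphy.debian.org"}}
print(check_rcpt("debian-user-jacob@cachevalley.com", "murphy.debian.org", safe_lists))
print(check_rcpt("debian-user-jacob@cachevalley.com", "mail.example.net", safe_lists))
```

Note that client_host is only as trustworthy as the way it was
obtained, which is what the rules below try to pin down.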
Some ideas for rules to accept or reject the email may include:
If HELO does not match a reverse DNS lookup and doesn't match the domain
of RCPT TO: or to a user specified value then the mail is rejected.
In general, this will reject legit mail. In particular, sites that host
for more than one domain will not have a reverse DNS matching what you
might expect.
If only applied to a particular mailing-list, it might work, though.
Perhaps even IP address would be fine (debian-user-jacob emails must
come from a server with reverse DNS of murphy.debian.org). Note that you
cannot trust reverse DNS, though, so a forward lookup would also have to
be done.
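
A rough Python sketch of that reverse-then-forward check, using only
the standard socket module. The function names are made up for this
example, and gethostbyname_ex only returns IPv4 addresses, so treat it
as an illustration rather than something to drop into an MTA.

```python
import socket

def forward_confirmed_name(ip, reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """Return the PTR name for ip only if that name resolves back to ip."""
    try:
        name = reverse(ip)[0]          # reverse (PTR) lookup
        addresses = forward(name)[2]   # forward (A) lookup of that name
    except OSError:                    # covers socket.herror and socket.gaierror
        return None
    return name.lower() if ip in addresses else None

def from_expected_host(ip, expected="murphy.debian.org", **resolvers):
    """True if ip has a forward-confirmed reverse DNS name equal to expected."""
    return forward_confirmed_name(ip, **resolvers) == expected
```

The resolver arguments exist only so the check can be exercised
without touching the network; by default the system resolver is used.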
A looser match would be just on the HELO <name> where the name given is
some md5hash of the user's email address and some value noted on the
mailing list. People start getting spammed, the list admin changes the key
used to generate the name value and people go to the web to see what it
has been changed to.
So the MTA on the Debian mail server, for instance, would have to be
modified to generate a custom HELO for every message? This would really
hurt for larger sites which have more than one recipient to a mailing
list message...
A tighter setup might be to have the hash in the MAIL FROM: <value> and
have it be a hash of the subscriber's list password and their email
address. That way the subscriber can change their list password at any
time they see spam coming “from” the list.
But for most mailing lists, MAIL FROM: is the sender's email address. To
change that would require modifying the mailing list software to break
the header, or modifying everyone's mail client. Again, this could get
ugly for sites with multiple subscribers to popular mailing lists.
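
For reference, the hash itself is the easy part. A minimal Python
sketch, assuming the token is simply the MD5 of the subscriber's
address joined to a secret (the list key or the list password); the
separator and the function names are invented for this example:

```python
import hashlib
import hmac

def list_token(email, secret):
    """MD5 hex digest of the subscriber address plus a list secret."""
    data = (email.strip().lower() + ":" + secret).encode("utf-8")
    return hashlib.md5(data).hexdigest()

def token_matches(presented, email, secret):
    """Compare a presented token with the expected one in constant time."""
    expected = list_token(email, secret)
    return hmac.compare_digest(presented.encode("utf-8"),
                               expected.encode("utf-8"))

old = list_token("jacob@cachevalley.com", "first-password")
print(token_matches(old, "jacob@cachevalley.com", "first-password"))
print(token_matches(old, "jacob@cachevalley.com", "second-password"))
```

Changing the secret invalidates every old token, which is the "change
your list password when spam shows up" step. The hard part remains
getting the list server to send a per-subscriber value at all.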
I'm sure there are other better ideas to be had along the lines of how to
quickly identify that the sending server is who they say they are and look
up a safe list to see if the user accepts email from that server.
For a dead simple solution, set up a subdomain like
@lists.cachevalley.com, and run an MTA dedicated to list traffic. Using
existing SMTP access control, deny all access except for the IP
addresses of servers you communicate with, and internal servers.
You could even whitelist additional entries, perhaps by automatically
scanning the mailing lists and (temporarily?) adding IP addresses of
recent posters.
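
As a sketch of that whitelist logic (permanent list servers plus
temporary entries for recent posters), something like the following
Python would do. The class and method names are hypothetical, and in
practice the result would be exported into the MTA's own access table.

```python
import ipaddress
import time

class ListAllowlist:
    """Permanent networks plus temporary single-address entries."""

    def __init__(self, permanent):
        self.permanent = [ipaddress.ip_network(n) for n in permanent]
        self.temporary = {}  # ip address -> expiry time (seconds since epoch)

    def add_temporary(self, ip, ttl_seconds, now=None):
        now = time.time() if now is None else now
        self.temporary[ipaddress.ip_address(ip)] = now + ttl_seconds

    def allows(self, ip, now=None):
        now = time.time() if now is None else now
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in self.permanent):
            return True
        expiry = self.temporary.get(addr)
        return expiry is not None and now < expiry

# Example with documentation-only address ranges
acl = ListAllowlist(["192.0.2.0/24"])
acl.add_temporary("198.51.100.7", ttl_seconds=7 * 24 * 3600)
print(acl.allows("192.0.2.25"), acl.allows("198.51.100.7"), acl.allows("203.0.113.9"))
```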
A side benefit of using an email address that only accepts list traffic
for some would be that it would reject the second email if someone replies
to you and the list. People using this setup could have their .sig say
"This email address only accepts authorized list traffic, please reply to
the list."
A simpler way is to just make up something like
"jacob-debian-list@cachevalley.com" as an email alias for yourself.
Then, have procmail dump messages ^TO: that address into a folder,
unless they do not come from murphy.debian.org, or something like that.
You probably don't want to automatically delete them. You also probably
don't want to tie it into the MTA, just in case something breaks down
the line.
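
The sorting rule can be written down in a few lines. This is a Python
stand-in for the logic, not procmail syntax, and it assumes the
delivery agent passes in the envelope recipient; the folder names are
made up. Received headers can be forged, so this sorts mail rather
than authenticating it.

```python
from email import message_from_string

def choose_folder(raw_message, delivered_to,
                  alias="jacob-debian-list@cachevalley.com",
                  list_host="murphy.debian.org"):
    """Pick a folder; never delete, only set suspicious mail aside."""
    if delivered_to.lower() != alias:
        return "inbox"
    msg = message_from_string(raw_message)
    # Join every Received header and look for the list server's name
    received = " ".join(msg.get_all("Received", [])).lower()
    return "debian-user" if list_host in received else "list-suspect"

sample = (
    "Received: from murphy.debian.org (murphy.debian.org [192.0.2.1])\n"
    "\tby mail.cachevalley.com\n"
    "To: debian-user@lists.debian.org\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(choose_folder(sample, "jacob-debian-list@cachevalley.com"))
```

Mail to the alias that did not pass through the list server lands in
a separate folder for a quick look instead of being thrown away.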
Since we have seen that a greater volume of worm mail is possible with
email addresses harvested from usenet and mailing lists, it seems a setup
based on this system could help cut down the cost of fighting spam
generated from those sources. The rules would be based on simple lists,
with each user
responsible for maintaining their list. Much less CPU power, bandwidth and
storage space would be required to match those rules because the matching
is done before delivery is accepted. Mailing lists could publish to their
subscribe page the values they use for HELO and MAIL FROM when sending the
messages to all subscribers.
I'd differentiate between worms and spam more clearly. Worms/viruses are
fairly easy to keep up with, in that daily updates of your anti-virus
program will result in capturing virtually all viruses/worms with
virtually no false positives. Plus, you'll catch direct client to client
mail, instead of just mail to addresses harvested from mailing lists.
Compare this to the "dog chasing cars" method of inventing a new filter
rule that looks through the MIME data to decide if this is the latest worm
you don't want or the kissing picture that you do. Sure it's cool to be a
geek and figure out the rules. If you like doing this, do it. Maybe spam
isn't a cost to you but a benefit if you consider your enjoyment at
solving each filter puzzle. I think that's why I like finding bugs, to
help find and solve puzzles. On the other hand this method of filtering is
more expensive in every measure I can think of except the freedom of
allowing anyone to email you anytime. You spend time thinking up rules,
writing rules and testing rules. The rules are applied after you have
accepted the bandwidth of the transfer. Running the rules takes CPU time
and possibly more bandwidth as you do RBL DNS or Razor and storing the
email takes disk space.
Again, there's a big difference between catching worms and catching
spam. clamav's auto update ensures that my Amavis will catch just about
everything worm related.
If you're sick of getting swamped (as a user or admin) wouldn't this setup
be useful? An ISP could encourage users to use username.lists@isp.com for
email addresses that are going to be used on usenet or public mailing
lists. The new email address could just dump into the real address after
the mailing list rules were validated, or it could be its own account and
mailbox.
Of course some will say "but I only have my ISP available and it doesn't
do that" and others will say "I don't like that idea because it isn't easy
or flexible enough. I want email from everybody as long as it isn't
UCE/UBE/A worm or virus". That's why there isn't just one way to do
things, we all have different ideas on what is best.
Good point. Important to keep the differing needs in mind when dealing
with end users.
One major concern that I've lightly touched on and will bring up again is
“What if I want to have other people contact me off list?” You wouldn't
want to post your non-list-only email to the list, that would be
counter-productive. There's got to be a convenient way of providing a
source for people to look up your email address that is very resistant to
scripting its harvest for the UCE/worms/etc. One idea that comes to mind
is putting an image of your email address on your web site. I keep
thinking that PGP/GPG should be able to help in some way, either by adding
to the EHLO command set or something on the user's web site. There have to
be better and still simple ways of doing this that make it cost much more
to find our email addresses than it costs us to filter the junk.
True. But you still don't solve the problem of having someone easily
contact you off list. In the case of this email, I've decided I have
something worthwhile to say on the topic at hand (or I'm bored, and want
to babble about email filters...) so I hit "reply to all". If I had to
break my train of thought to sift through your website to find your
email address, I'm probably not going to bother. Also consider the fact
that some people do have to read email offline, and rely on the
assumption that all necessary contact info is contained in the email itself.
Enhancing EHLO would probably not be realistic, given that virtually all
email clients would have to implement it. It's like saying "oh, just
turn on SMTP authentication, and we can be sure that the sender isn't a
spammer, or at least can track them down".
Images of your email address are fine, but again, it's just
a slightly more difficult form of "jacob at cachevalley dot com"...
eventually wouldn't the spammers just create OCR software that looks for
email addresses in images on websites linked from your website?
The sad part is that I've already squandered my username at this email
address by putting it where it can be harvested en masse by worm/virus and
UCE/UBE collection scripts, and I had already read an article cautioning
me against this. Oh well, live and learn (someday I'll learn, anyway).
I'm going to look into setting up a new email address with mail server
rules for delivery driven by a user supplied whitelist after waiting a few
days for comments and flames on this idea. If you know of links to pages
already discussing how to do this with postfix, please share them.
Look to SpamAssassin. That will make a huge dent in your spam problem.
Tack on Amavis for the latest in MS malware, and you're in business. I
believe both integrate fairly well with Postfix.
Amavis is also able to reject viruses during the SMTP transaction. This
I would agree with, if your configuration allows it.
Some good thoughts there... but I wonder just how many mailing lists
would need to apply such a solution to make an impact, and how difficult
it would be to apply. OTOH, you might find better results with simpler
methods...
--Rich
_________________________________________________________
Rich Puhek
ETN Systems Inc.
2125 1st Ave East
Hibbing MN 55746
tel: 218.262.1130
email: rpuhek@etnsystems.com
_________________________________________________________