Bits from the listmaster team



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi everyone,

the listmaster team is constantly trying to improve the setup of our
list server, and quite a few things have happened since our last update
in September of last year. Here are some highlights:


lists.debian.org moved to a new hosting location
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
lists.debian.org has been moved to man-da[2] to avoid several problems
with DNSBLs listing our list server. We'd like to thank Brainfood[1]
for hosting lists.debian.org and dealing with the insanity of spam
reporters for so long. After moving the list service to the new
machine, we also decided to move the list archives to that machine
(which means the list archives are on the same machine as the MX, and
consequently suffer fewer delays).

If you haven't already, please add lists.d.o's new IP address,
82.195.75.100, to your whitelists.


New list archive search engine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With the move of lists.debian.org to the new hosting, we took the
opportunity to deploy a new search[3] based on Xapian Omega. The
index comprises some 3.5 million messages, approximately 100k of which
are estimated to be spam. To improve your search experience, we have
prepared the indexing software to benefit from our spam removal plan
(see below).

As before, searching by author and list is supported, but the new
search should be more language- and encoding-aware. Work is in
progress to send our adaptations upstream and to implement
improvements based on our experience and the hints we got from the
friendly people at Xapian.


Config cleanup
~~~~~~~~~~~~~~
The config cleanup is another big project which seems to be turning
into an ongoing task. Since the last update we decided to unify some
global files for all lists, and to move all list-specific config to
extra files. (This follows the layout the inventors of smartlist, our
mailing list software, had in mind.)

We also want to move some information, like moderation status or
maximum mail size per message, to a global file, which is also used by
the list archive and some other informational or statistical tools.

To check whether lists are configured correctly, we subscribed an
address to all 182 mailing lists and checked back a month later for the
ham/spam ratio and other anomalies. We found some wrong spam rules,
which led to false positives, and some 'backdoors' which bypassed some
of our spam rules, which led to false negatives. We also found lists
which are supposed to carry only informational mails from an automatic
system. For those lists we could tighten the rules and, on the other
hand, drop the usual spam filters, so distribution gets faster and we
need less CPU/memory resources to get one mail through.
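
As an illustration of the kind of check described above, here is a
minimal sketch of computing a per-list spam fraction from a collected
set of messages. It is not the script we used. It assumes each message
carries a 'List-Id' header and that spam is marked with SpamAssassin's
'X-Spam-Flag: YES' header; the function name spam_ratio_by_list is
made up for this example.

```python
from collections import Counter
from email import message_from_string


def spam_ratio_by_list(raw_messages):
    """Return {list id: fraction of messages flagged as spam}.

    Assumptions (not taken from the real setup): messages are given as
    raw RFC 2822 strings, carry a List-Id header, and spam is marked
    with 'X-Spam-Flag: YES'.
    """
    ham = Counter()
    spam = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # Messages without a List-Id are grouped under "unknown".
        list_id = (msg["List-Id"] or "unknown").strip()
        flagged = (msg["X-Spam-Flag"] or "").strip().upper() == "YES"
        if flagged:
            spam[list_id] += 1
        else:
            ham[list_id] += 1
    return {
        list_id: spam[list_id] / (spam[list_id] + ham[list_id])
        for list_id in set(ham) | set(spam)
    }
```

A list whose fraction stands out from the rest is a candidate for a
closer look at its rules.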

We also added the usual 'Precedence' and 'List-*' headers to all lists
(some lists were missing them) and to automatic responses, so our
service is now a little more net-friendly.

While reviewing things we found that our bounce handling had some
issues; see the next section for information about that.

Better bounce handling
~~~~~~~~~~~~~~~~~~~~~~
We checked our bounce handling because we have more than 500 bounces
for some lists, and in the process found that we didn't have working
bounce handling for other lists (other-*, deity, *-digest,
debian-private). There were also problems in handling and recognizing
mail addresses containing = or ! characters.

Bounces of debian-private subscriptions are still handled manually by
the listmasters, but we now address these issues and forward such
addresses to da-manager@debian.org.

To address the other mailing lists we rewrote some parts of our
bounce handler.

While analysing the bounces streaming in, we found that a lot of bounces
are caused by content filters which reject list mail back to us (which
violates the RFC). Even worse: the majority of those are false
positives.

To let those people know, we'll implement a notification system which
will notify users about bounces and remind forcibly removed users about
their unsubscription.

This is a service for people with a temporarily unavailable or broken
mailbox, so that they can see that they (or their provider) have a
broken mail setup, or resubscribe to all lists once their mail address
is functional again. These notifications will be sent at most once a
week, for up to a month after the last unsubscription happened.
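
To make the schedule concrete, here is a minimal sketch of the rule "at
most once a week, for up to a month after the last unsubscription". It
is only an illustration, not the code of our notification system. The
function name should_notify is made up, and treating "a month" as 30
days is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional

NOTIFY_INTERVAL = timedelta(days=7)   # at most once a week
NOTIFY_WINDOW = timedelta(days=30)    # "a month"; 30 days is an assumption


def should_notify(now: datetime,
                  last_unsubscribed: datetime,
                  last_notified: Optional[datetime]) -> bool:
    """Decide whether a removed subscriber should get a reminder now."""
    # Stop reminding once the window after the last unsubscription is over.
    if now - last_unsubscribed > NOTIFY_WINDOW:
        return False
    # No reminder sent so far: send the first one.
    if last_notified is None:
        return True
    # Otherwise wait at least a week between reminders.
    return now - last_notified >= NOTIFY_INTERVAL
```

With these values a removed user would receive only a handful of
reminders before the notifications stop.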

Both notification systems are in testing now and will be activated
shortly after this mail.


List archive spam
~~~~~~~~~~~~~~~~~
As avid followers of debian-project will know, we have implemented
support to weed spam out of the www list archives. While we want to
get rid of as much spam as possible, our paramount objective in this
effort is to preserve the integrity of the archive (e.g. keeping URLs
constant for past messages and avoiding removal of non-spam mail). This
means that the submissions we receive from users clicking on the
spam-report button of the list archive must be verified manually, and
each nomination has to be checked independently by several people.
Some 1000 spam messages have been deleted from the archives of
debian-java, debian-project, debian-python, and debian-vote.
To help out or learn more, please visit our wiki page[4].

How to help
~~~~~~~~~~~
You can help us in a few important areas:

 * Spam rules -- If you notice spam getting through the spam filters,
   and have ideas for improving our filters, we accept patches to our
   rule sets, which are publicly available via svn.[5]

 * Encoding issues -- If you notice encoding problems of messages sent
   after November 2007 in the archive, please contact
   listmaster@lists.debian.org with a link to the problematic message
   and an explanation of the problem.

 * Avoid bouncing spam -- If you don't want your MTA to accept spam,
   please just discard it instead of 550'ing, at least when a message
   comes from liszt.debian.org.

 * Troubleshooting -- If a message that you've sent to a mailing list
   hasn't arrived, please provide us with as much information as
   possible, including Date/Time (UTC), From, To, Message-Id,
   delivering IP, and the logfile entries from the delivering host.
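
For the troubleshooting case, a small sketch like the following can
pull the header fields listed above out of a saved copy of the message,
with the date converted to UTC. It is only a convenience example; the
function name report_fields is made up. The delivering IP and the
logfile entries are not in the message itself and still have to come
from the delivering host.

```python
from datetime import timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def report_fields(raw_message: str) -> dict:
    """Collect the header fields useful for a report to the listmasters."""
    msg = message_from_string(raw_message)
    date_utc = None
    if msg["Date"]:
        sent = parsedate_to_datetime(msg["Date"])
        # A date without a usable zone is treated as UTC (an assumption).
        if sent.tzinfo is None:
            sent = sent.replace(tzinfo=timezone.utc)
        date_utc = sent.astimezone(timezone.utc).isoformat()
    return {
        "Date (UTC)": date_utc,
        "From": msg["From"],
        "To": msg["To"],
        "Message-Id": msg["Message-Id"],
    }
```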

[1] http://www.brainfood.com
[2] http://www.man-da.de
[3] http://lists.debian.org/search.html
[4] http://wiki.debian.org/Teams/ListMaster/ListArchiveSpam
[5] svn://svn.debian.org/svn/pkg-listmaster/trunk/spamassassin_config
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHzW3RriZpaaIa1PkRAtaAAKC6kn/X3tsmK/b6dw12VldHp2LlBgCfQKyE
bJatArEUPXyHQ+Tt9TZlrNI=
=Z508
-----END PGP SIGNATURE-----

