Re: MailMan VERP
On Thu, 2003-10-23 at 23:52, Graham Wilson wrote:
> On Thu, Oct 23, 2003 at 02:01:28PM -0600, Joel Baker wrote:
> > Of course, the next round is likely to be based on Postfix + an
> > ezmlm-workalike (assuming I can find a satisfactory one), so we'll see
> > what things look like then. :)
> 
> I made the switch about a year ago, and I use enemies-of-carlotta. It
> supports bounce-handling similar to VERP, but I can't find the details
> at the moment.
I should probably document this somewhere, if I haven't already. In
short, the Enemies of Carlotta algorithm is as follows:
1. Subscribers are kept in groups. A group can be of any size. When
sending out mail, each group gets a separate copy, using VERP. This way,
we don't send out multiple copies of mail needlessly.
2. If any bounce comes in and it refers to a group with more than one
address, the group is split. A group with addresses from more than one
domain is split into smaller groups with one domain each. A group with
only one domain is split into individual addresses. This way, if an
address is permanently broken, we eventually learn which one it is.
3. If any bounce comes in and it refers to a single address, that
address is marked as bouncing.
4. Periodically (using cron), Enemies of Carlotta checks which addresses
are bouncing. Those that are bouncing get a probe a week after the
first bounce; again, this is sent using VERP. If the probe bounces, the
address is removed (a warning is sent to the address, just in case). If
the probe hasn't bounced within a week, the address is no longer
considered bouncing.
5. During the periodic check, Enemies of Carlotta also joins groups that
are older than one week and aren't bouncing. This way, addresses without
problems tend to gravitate into one large group.
In other words, this scheme is approximately the Ezmlm one with the
addition of address grouping to reduce the number of outgoing messages
due to VERP. The grouping idea was suggested in an IRC discussion on a
channel for Debian developers, but I forget by whom.
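The splitting rule in step 2 can be sketched roughly as follows. This is
only an illustration of the idea, not the actual Enemies of Carlotta
code; the function name split_group and the representation of a group as
a plain list of addresses are assumptions made for the example.

```python
from collections import defaultdict


def split_group(addresses):
    """Split a bounced group into smaller groups (step 2 above).

    Hypothetical helper: a group is just a list of address strings.
    """
    by_domain = defaultdict(list)
    for addr in addresses:
        # Group by the part after the last "@", case-insensitively.
        domain = addr.rsplit("@", 1)[1].lower()
        by_domain[domain].append(addr)

    if len(by_domain) > 1:
        # Several domains: one new group per domain.
        return [sorted(group) for _, group in sorted(by_domain.items())]

    # Only one domain: one new group per address.
    return [[addr] for addr in sorted(addresses)]


mixed = ["a@x.org", "b@y.org", "c@x.org"]
print(split_group(mixed))
print(split_group(["a@x.org", "c@x.org"]))
```

A group that already holds a single address comes back unchanged; that
case is the one step 3 handles by marking the address as bouncing.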
I'm sure the grouping can be made more effective, but that requires more
testing with large lists in the real world. For small sites, such as the
ones I run myself, even using a single address in each group is
typically not a problem, bandwidth-wise. For large sites, such as
lists.debian.org, a group splitting scheme with more levels and steps
and better heuristics may save a lot of bandwidth, or it might not;
this is an area of research for someone.
Enemies of Carlotta does not require MTA support for VERP. In fact, it
likes to construct its sender address itself so as to be able to know
what it is and to embed a cryptographic hash for detecting forgeries. 
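To make the idea concrete, here is a minimal sketch of a self-constructed
sender address carrying a keyed hash. The address layout
(list-bounce-group-hash@domain), the secret, and the function names are
all made up for this example; the real Enemies of Carlotta format may
well differ.

```python
import hashlib
import hmac

SECRET = b"change-me"  # hypothetical per-list secret, not a real default


def _digest(payload, secret):
    # Keyed hash over the local part, truncated to keep the address short.
    mac = hmac.new(secret, payload.encode("ascii"), hashlib.sha256)
    return mac.hexdigest()[:16]


def make_sender(list_name, domain, group_id, secret=SECRET):
    """Build an envelope sender that encodes the group and a hash."""
    payload = f"{list_name}-bounce-{group_id}"
    return f"{payload}-{_digest(payload, secret)}@{domain}"


def parse_sender(address, secret=SECRET):
    """Return the group id if the hash checks out, else None."""
    local, _, _domain = address.rpartition("@")
    payload, _, digest = local.rpartition("-")
    expected = _digest(payload, secret) if payload.isascii() else ""
    if not hmac.compare_digest(digest.encode(), expected.encode()):
        return None  # forged or mangled bounce address
    return payload.rsplit("-", 1)[1]


addr = make_sender("mylist", "example.org", "42")
print(addr, parse_sender(addr))
```

A bounce arriving at an address whose hash does not verify can simply be
ignored, so nobody can get other people unsubscribed by forging bounces.
This sketch assumes group ids contain no hyphen.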
I haven't studied how Mailman deals with bounces, but if it decides based
on a single monthly mail, it may be unsubscribing people too hastily. It
is fairly common, for example, for a DNS glitch or a temporary hardware
failure to make mail bounce for a day or two, so it is not good to
unsubscribe people based on a single bounce, or on any other count of
bounces. A time-based system is, in my opinion, much preferable. But I
repeat, I don't know what Mailman does, and they may well be using a
good system.
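As an illustration of what I mean by time-based, the periodic check can
be written as a decision that depends only on timestamps, never on how
many bounces have arrived. The function and action names below are
invented for this sketch, and the "clear" outcome (a probe that survives
a week without bouncing takes the address out of the bouncing state) is
an assumption of the example.

```python
from datetime import datetime, timedelta

WEEK = timedelta(days=7)


def next_action(now, first_bounce, probe_sent=None, probe_bounced=False):
    """Decide what the cron job should do with one bouncing address."""
    if probe_sent is None:
        # No probe yet: wait a full week after the first bounce.
        return "probe" if now - first_bounce >= WEEK else "wait"
    if probe_bounced:
        # The probe itself bounced: the address is really broken.
        return "remove"
    if now - probe_sent >= WEEK:
        # Assumption: a probe that did not bounce clears the mark.
        return "clear"
    return "wait"


start = datetime(2003, 10, 1)
print(next_action(start + timedelta(days=2), start))
print(next_action(start + timedelta(days=7), start))
```

With this shape, an outage of a day or two produces any number of
bounces but at most one probe, a week later, by which time the problem
has usually gone away.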
Further discussion about Enemies of Carlotta is probably best done on
its own mailing list; see http://liw.iki.fi/liw/eoc/ for subscription
instructions. The package name is enemies-of-carlotta, if you want to
test it. It works on stable, too, even if it is only included in
unstable; I use the .deb on my mail server, which runs stable, together
with Postfix.
-- 
http://liw.iki.fi/liw/photos/swordmaiden/