
Re: Two copies of E-Mail (Re: I wish to advocate linux)



Bob Proulx grabbed a keyboard and wrote:
> David Guntner wrote:
>> Anyway, the recipe is dirt simple.
>> ...
>> # Duplicate Suppression.
>> :0Whc: $MAILDIR/.msgid.cache.lock
>> | $FORMAIL -D 8192 $MAILDIR/.msgid.cache
>>
>>         # Take out the Trash.
>>         :0 a:
>>         /dev/null
>>
>> That's all there is to it.  The formail program is used to grab the
>> Message-ID of the incoming message.  Even if it is sent To: one address
>> and CC: another, both "copies" will have the same Message-ID.  When the
>> first one comes in, it stores that ID in the $MAILDIR/.msgid.cache file
>> after first comparing the message to see if that ID has already been
>> stored there.  If not, then it stores the ID and returns a FALSE so that
>> the second part ("take out the trash") won't process.  If the Message-ID
>> already *has* been stored in the cache file, then it returns a TRUE and
>> the second part then dumps the message into /dev/null.
> 
> If it works for you then great.  But this is not without problems for
> others.

Yeah, it works swell.  I've never had any problem caused by it, either. :-)
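
For anyone who wants to try the recipe, here it is again as a self-contained sketch with the variables it relies on spelled out.  The MAILDIR and FORMAIL values are assumptions (they aren't given anywhere in this thread), so adjust them to wherever your mail folders and the formail binary actually live.

```procmail
# Assumed locations; adjust for your system.  MAILDIR must already exist.
MAILDIR=$HOME/Mail
FORMAIL=/usr/bin/formail

# Duplicate Suppression.
#   W  wait for formail and use its exit code, without logging a failure
#   h  feed formail the header only
#   c  work on a copy, so the message carries on down the rcfile
:0 Whc: $MAILDIR/.msgid.cache.lock
| $FORMAIL -D 8192 $MAILDIR/.msgid.cache

# Take out the Trash: the "a" flag means this runs only if the
# recipe above succeeded, i.e. formail reported a duplicate.
:0 a:
/dev/null
```

The 8192 is the approximate maximum size of the ID cache in bytes; a larger value remembers more Message-IDs.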

> For one I use the mailing list headers List-Id and List-Post.  Those
> are the standard headers and those are the best ones to use for filing
> mailing list messages.  Smart MUAs use those to know how to do a
> list-reply.  Therefore the copy I want is the copy that comes from the
> mailing list.

Not every MUA does, however.  The one I'm using, for example, does not
(or if it does, I've never figured out how to turn that feature on...).
Therefore, I've also got a Procmail recipe that adds a Reply-To:
pointing back to the list on my local copy.  (That's for debian-user,
since it doesn't add one itself; on lists that do, I don't use that
rule.)  That way, when I hit reply, it goes back to the list, which is
where a reply to a list posting should go most of the time.  And I
don't want to have to remember to do it manually each time I reply. :-)

> When people CC me directly then the direct copy is almost always the
> one that comes first.  The recipe above deletes the second one.  The
> second one is usually the one that comes through the mailing list
> because this mailing list is sending to 2,000+ recipients and
> therefore it takes longer.  The above almost invariably discards the
> mailing list copy that I want to keep and keeps the direct copy that I
> want to discard.

Almost always, yes.  But the various oddities of E-Mail processing
don't make that a 100% occurrence.  I've seen the exceptions happen,
myself.  (And you seem to be acknowledging that as well.)

> This means that people who use the above can't use the standard
> List-Id headers and instead try to use the To: or Cc: addresses to
> file the messages.  Or worse they try to use a Subject: tag.  That is
> bad because the List-Id header is there for just that purpose.

They're there to help people delete a second copy of the same message
when someone sends to both a mailing list (or other method) and to that
person?

> Thankfully debian-user doesn't use a subject tag.  But it is a chain
> of circumstances such as this that cause people to make bad choices
> and then often try to force those bad choices upon others.  How many
> times have people asked for subject tags on random mailing lists
> instead of using the List-Id header as it was intended?
> 
> At one time I used the above recipe myself but stopped using it due to
> these problems.  YMMV.

My mileage is fine, given that the goal of the above recipe is to
eliminate a duplicate message, not to figure out which folder to file
the list message into.

It all depends on your experiences and own requirements.  I for one am
on a decade+ old list that was "home grown" - the guy running it "rolled
his own," so to speak.  It doesn't use a subject tag, and it has never
had those now-standard List-ID headers, nor is it likely to anytime in
the future.  So even if I *were* using a MUA that understands those
headers, it would do me no good.

It has never occurred to me to ever filter based on a List-ID field,
since back in the "old days" when I started doing this, they hadn't yet
come into existence. :-)  And even *after* coming into existence, you
still have to *send* your message to the list in question, thus the To:
or Cc: will *always* be there, regardless of the presence (or lack
thereof) of a List-ID header.  Also, by filtering on those (To, Cc), it
works 100% of the time - even if the above recipe deletes the list copy
if it came in second. :-)

For myself, this is what I use specifically for the Debian users list:

> #### Debian Mailing List Handling ####
> # They don't set a Reply-To: header pointing back to the list....
> # So let's add one for the discussion list (only)
> :0fhw
> * ^TO_ .*debian-user@lists.debian.org
> | $FORMAIL -i "Reply-To: Linux Debian Mailing List <debian-user@lists.debian.org>"
> 
> # All mail (user, security, etc.) into one folder, please!
> # Look for the list address here and put them in their own file
> :0:
> * ^TO_ .*@lists.debian.org
> $MAILDIR/debian/
> 
> #### End of Debian Mailing List Handling ####

Since TO_ represents:

> (^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):(.*[^-a-zA-Z0-9_.])?)

It will pretty much catch the string being looked for if it shows up in
*any* of the destination headers. :-)  Since I've never filtered based
on a header which may-or-may-not be there, deleting the second,
duplicate copy of a message has never caused a problem even if that one
was the list-processed copy.

In fact, I would argue that using the above filter (TO_) is *less*
problematic than the method you use.  Deleting a duplicate Message-ID
does have the potential to remove the copy that actually went through
the list, but with TO_ it doesn't matter which one got to you first:
whichever copy survives *still* gets filtered into the correct folder.
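
For anyone adapting the Debian rules above, a slightly tighter form of the same two conditions might look like the sketch below.  It puts the address directly after ^TO_ (the usage described in the procmailrc man page) and escapes the dots, so the pattern has to match a whole address rather than any string containing it.  The [-a-z0-9]+ part is an assumption that list names consist only of letters, digits and hyphens, and $FORMAIL and $MAILDIR are assumed to be set earlier in the rcfile.

```procmail
# Sketch only: same idea as the rules above, with a stricter match.
# Add a Reply-To: pointing back to the discussion list (only).
:0 fhw
* ^TO_debian-user@lists\.debian\.org
| $FORMAIL -i "Reply-To: Linux Debian Mailing List <debian-user@lists.debian.org>"

# Anything addressed to a list at lists.debian.org goes into one folder.
# Maildir delivery (trailing slash) needs no lockfile.
:0
* ^TO_[-a-z0-9]+@lists\.debian\.org
$MAILDIR/debian/
```

Either way the rules key on the destination headers, so they file whichever copy of a message survives duplicate suppression.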

But again, it's all a matter of personal taste, personal experiences and
personal requirements (like I said, I'm on a really old mailing list
which has never had List-ID headers and most likely hell will freeze
over before it gets them; the list has been around longer than the RFC
which defines List-ID).

>> Note that the locks *are* critical to prevent corruption, so keep the
>> trailing ":" characters where they are. :-)
> 
> Note that they aren't needed for the /dev/null rule.  :-) However I
> vaguely recall that /dev/null is treated specially and therefore it
> won't matter for it one way or the other.  I would still remove the
> lock colon for /dev/null anyway since it isn't needed with /dev/null.

Agreed, the lock on the part filtering into /dev/null isn't needed
because of what it is.  It doesn't hurt anything by being there, either.
My own preference is to keep it there for consistency, since for other
rules filtering into a real folder the lock is a good idea - it keeps me
in the habit. :-)  Also, any time I put in a new rule that's eventually
going to /dev/null mail, I like to test a while first by sending to a
real folder so I know what it's doing before "throwing the switch," so
at that stage the lock is important.  Once it goes to /dev/null it
doesn't need it (as you've noted and I agreed), but it also hurts
nothing to leave it there.
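
The test-first approach described above could look something like this sketch.  It replaces the "take out the trash" half of the duplicate recipe while testing; "duplicates" is a made-up folder name, and $MAILDIR is assumed to be set earlier in the rcfile.

```procmail
# Test stage: park suspected duplicates in a real mbox folder for review.
# The trailing ":" lock matters here, since this is a real file.
:0 a:
$MAILDIR/duplicates

# Later, once the rule is trusted, send them to /dev/null instead
# (the lock is optional at that point):
# :0 a
# /dev/null
```

Looking through that folder for a few days shows exactly which copies would have been thrown away before anything is actually lost.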

> Bob
> 
> P.S.  Here are the procmail rules I use to file all Debian mailing list
> messages.
> 
> :0
> * ^List-Id: .*<debian-[-a-zA-Z0-9]+\.lists\.debian\.org>
> * ^List-Id: .*<debian-\/[-a-zA-Z0-9]+
> Lists/debian/$MATCH/
> 
> :0
> * ^List-Id: .*<[-a-zA-Z0-9]+\.lists\.alioth\.debian\.org>
> * ^List-Id: .*<\/[-a-zA-Z0-9]+
> Lists/debian/$MATCH/

That's great for filing (and cool to know about, for mailing lists which
include those standard headers).  How does it get rid of the dup when
someone does a To: the list and a Cc: to the person on the list he's
replying to?  (Remember, I sent the above recipe because someone was
complaining about duplicate messages, not that they didn't know how to
filter them into a folder - in essence, you've provided an answer to a
question that he didn't ask. :-) )

BTW, in your above example:  What is $MATCH set to, and where is it set?

               --Dave

