email cleanup (was Re: Deb-List Subject Line Tag?)



On Wed, Feb 12, 2003 at 04:30:37AM -0600, Gary Turner wrote:
[...]
| And to think, I've been wanting the other lists I subscribe to to lose
| the habit.  It's a serious waste of high dollar screen real estate.

I solve that using a filter.  The gist of it is this :

---
#!/usr/bin/python

import email
import sys
import re

# first parse out the message itself
# Be aware that this reads the entire message into memory at once.
# The read happens outside the try block so that 'data' is always
# defined by the time the except clause writes it back out.
data = sys.stdin.read()
try :
    message = email.message_from_string( data )
except Exception :
    # if we can't parse the message (eg malformed MIME), leave it alone
    sys.stdout.write( data )
    sys.exit( 0 )
del data

subject_patterns = (
    ( r"\[6bone\] " , "" ) ,
    ( r"\[Aegis\]" , "" ) ,
    ( r"\[Exim\] " , "" ) ,
)

## ----------
# clean up the Subject: header
Subject = "Subject"
if Subject in message :
    subj = message[ Subject ]
    subj_orig = subj
    for pat , repl in subject_patterns :
        # collapse "Re: [Foo] Re: [Foo] bar" to "Re: bar"
        subj = re.sub( r'^[ ]*(?:R[Ee]:)?[ ]*'+pat+r'[ ]*R[Ee]:[ ]*' , r'Re: '+repl , subj )
        # remove other instances of "[Foo]"
        subj = re.sub( pat , repl , subj )
    # Only replace the subject if it changed.  That way the order of the headers
    # is maintained if it isn't changed.
    if subj != subj_orig :
        del message[ Subject ]
        message[ Subject ] = subj
## ----------

# return the shiny clean message on stdout
sys.stdout.write( message.as_string( unixfrom=False ) )
---
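
To see what the two substitutions do, here is a small sketch that wraps the same loop in a function and runs it on a few sample subjects. The name clean_subject and the sample strings are made up for this illustration, and the sketch assumes a current Python 3 rather than the 2.2 the script was written for:

```python
import re

subject_patterns = (
    ( r"\[6bone\] " , "" ) ,
    ( r"\[Exim\] " , "" ) ,
)

def clean_subject( subj ) :
    # hypothetical helper: the same two re.sub() calls as in the script
    for pat , repl in subject_patterns :
        # collapse "Re: [Foo] Re: [Foo] bar" to "Re: [Foo] bar" ...
        subj = re.sub( r'^[ ]*(?:R[Ee]:)?[ ]*'+pat+r'[ ]*R[Ee]:[ ]*' , r'Re: '+repl , subj )
        # ... then drop whatever "[Foo] " tags are left
        subj = re.sub( pat , repl , subj )
    return subj

print( clean_subject( "Re: [Exim] Re: [Exim] bar" ) )   # Re: bar
print( clean_subject( "[6bone] routing question" ) )    # routing question
print( clean_subject( "no tag here" ) )                 # no tag here
```

The first substitution only fires when a tag is followed by "Re:", so untagged subjects pass through unchanged.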

I actually do a bunch more in the script such as removing munged
Reply-To: headers, removing pointless MS or AV headers, stripping list
trailers/ads, and removing some "privacy disclaimers".
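
As a rough idea of what the header part of that could look like (a sketch only, not the actual script): the function name strip_headers, the list of junk headers, and the rule for deciding that a Reply-To: is "munged" are all assumptions made for this example, and it targets a current Python 3.

```python
import email
import email.utils

# Header names picked purely for illustration; adjust to taste.
JUNK_HEADERS = ( "X-MimeOLE" , "X-MSMail-Priority" , "X-Virus-Scanned" )

def strip_headers( message ) :
    # Assumption: a Reply-To: counts as "munged" when its address is the
    # list's own posting address, as advertised in List-Post:.
    reply_addr = email.utils.parseaddr( message.get( "Reply-To" , "" ) )[1]
    list_post = message.get( "List-Post" , "" )
    if reply_addr and ( "mailto:" + reply_addr.lower() ) in list_post.lower() :
        del message[ "Reply-To" ]
    # Deleting a header that is absent is a no-op, so no membership test.
    for name in JUNK_HEADERS :
        del message[ name ]
    return message
```

A Reply-To: that points anywhere other than the list address is left alone, so a poster's own Reply-To: survives.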

The script requires python >= 2.2 or a local installation of the
'email' package.

Since I am currently using maildrop for local delivery, I use its
"xfilter" command to run list messages through the cleanup script.
There are some messages, such as cvs notifications and "personal"
messages, that I wouldn't trust a blanket cleanup not to accidentally
destroy.

Maybe some day I'll release it publicly, but it needs a lot more
refinement before that would happen.

-D

-- 
Come to me, all you who are weary and burdened, and I will give you
rest.  Take my yoke upon you and learn from me, for I am gentle and
humble in heart, and you will find rest for your souls.  For my yoke
is easy and my burden is light.
        Matthew 11:28-30
 
http://dman.ddts.net/~dman/
