
getmail Bug#633799: how this should be handled



Hi,

If it is OK to upload a patched revision for Bug#633799, I will be happy
to upload it targeting wheezy.

I need guidance on how to handle Bug#633799 for getmail, which has just
been bumped back to critical.
 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=633799

Here is the situation.

The mbox format historically has several variants.  
 http://docs.python.org/library/mailbox.html#mbox

getmail used the Python standard library as is, so its output was mboxo.
That format has annoying shortcomings, but people have been using it.

The getmail documentation said getmail uses mboxrd for mbox (mboxrd is a
newer, improved variant of mbox).
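
To make the difference between the two variants concrete, here is a
minimal sketch of the two quoting rules. The function names quote_mboxo
and quote_mboxrd are made up for illustration; the RE_FROMLINE pattern is
the one used in the attached upstream diff, and the mboxo rule is written
from the usual description of that format (only bare "From " lines get
quoted).

```python
import re

# Pattern from the upstream diff: any number of '>' followed by "From ".
RE_FROMLINE = re.compile(r'^(>*From )', re.MULTILINE)
# mboxo only considers lines that start with a bare "From ".
RE_BARE_FROM = re.compile(r'^From ', re.MULTILINE)


def quote_mboxo(body):
    """Hypothetical helper: quote body lines the mboxo way."""
    return RE_BARE_FROM.sub('>From ', body)


def quote_mboxrd(body):
    """Hypothetical helper: quote body lines the mboxrd way."""
    return RE_FROMLINE.sub(r'>\1', body)


body = "From here on\n>From a quoted reply\nplain text\n"
print(quote_mboxo(body))   # second line keeps its single '>'
print(quote_mboxrd(body))  # second line gains one more '>'
```

With mboxo the first two lines both end up starting with ">From ", so
they can no longer be told apart; with mboxrd the second line becomes
">>From " and the distinction survives.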

Although mboxo is not the best technical solution, it is Python's choice,
so I felt this was a mere documentation bug of wishlist severity.

Since the freeze, getmail upstream decided to use real mboxrd and applied
a patch prompted by a bug reporter.  This bug has also been bumped back
to critical by the same bug reporter, claiming "Serious data loss".  (It
is true that it causes some minor, ugly data loss. But I understand
people have historically had a quite relaxed attitude toward this type
of data loss.)
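
One way this kind of loss can show up is on the round trip through the
mailbox. The sketch below assumes a reader that undoes quoting
mboxrd-style (strips one '>' from every ">...From " line); that reader,
and all three helper names, are assumptions for illustration and not
necessarily the exact scenario in the bug report.

```python
import re


def quote_mboxo(body):
    """Hypothetical helper: quote only bare "From " lines (mboxo)."""
    return re.sub(r'^From ', '>From ', body, flags=re.MULTILINE)


def quote_mboxrd(body):
    """Hypothetical helper: add one '>' to every >*From line (mboxrd)."""
    return re.sub(r'^(>*From )', r'>\1', body, flags=re.MULTILINE)


def unquote_mboxrd(body):
    """Hypothetical reader: strip one '>' from every >+From line."""
    return re.sub(r'^>(>*From )', r'\1', body, flags=re.MULTILINE)


original = "From the start\n>From a quoted line\n"

# Written as mboxo, read back mboxrd-style: the second line loses its '>'.
round_trip_o = unquote_mboxrd(quote_mboxo(original))
# Written as mboxrd, read back mboxrd-style: the body is restored exactly.
round_trip_rd = unquote_mboxrd(quote_mboxrd(original))

print(round_trip_o == original)
print(round_trip_rd == original)
```

The mboxo round trip changes the message body, because the stored file
no longer records which ">From " lines were quoted by the writer and
which were already in the message.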

Considering the relatively small size of the patch, which is already
applied upstream, may I upload a new Debian revision to unstable with
this mboxrd format change for getmail, targeting wheezy?  (The upstream
diff for this is attached.  The patch to be applied will be very similar
to it, except for the program version number.)

Or do you think changing the basic program behavior for mbox storage at
this stage is too late, even if it is to match the documentation?

Or should I not touch the package and leave this bug report as
"important" or "normal", so people will notice it better but it is not
RC?

Regards,

Osamu


diff --git a/getmailcore/__init__.py b/getmailcore/__init__.py
index d1eb458..d37dd3d 100755
--- a/getmailcore/__init__.py
+++ b/getmailcore/__init__.py
@@ -16,7 +16,7 @@ if sys.hexversion < 0x2030300:
     raise ImportError('getmail version 4 requires Python version 2.3.3'
                       ' or later')
 
-__version__ = '4.34.0'
+__version__ = '4.35.0'
 
 __all__ = [
     'baseclasses',
diff --git a/getmailcore/message.py b/getmailcore/message.py
index 0137ba3..0e54ef8 100755
--- a/getmailcore/message.py
+++ b/getmailcore/message.py
@@ -10,6 +10,7 @@ __all__ = [
 import os
 import time
 import cStringIO
+import re
 import email
 import email.Errors
 import email.Utils
@@ -29,6 +30,9 @@ message_attributes = (
     'recipient'
 )
 
+RE_FROMLINE = re.compile(r'^(>*From )', re.MULTILINE)
+
+
 #######################################
 def corrupt_message(why, fromlines=None, fromstring=None):
     log = getmailcore.logging.Logger()
@@ -130,19 +134,25 @@ class Message(object):
         it by writing out what we need, letting the generator write out the
         message, splitting it into lines, and joining them with the platform
         EOL.
+        
+        Note on mangle_from: the Python email.Generator class apparently only
+        quotes "From ", not ">From " (i.e. it uses mboxo format instead of
+        mboxrd).  So we don't use its mangling, and do it by hand instead.
         '''
-        f = cStringIO.StringIO()
         if include_from:
-            # This needs to be written out first, so we can't rely on the
-            # generator
-            f.write('From %s %s' % (mbox_from_escape(self.sender),
-                                    time.asctime()) + os.linesep)
+            # Mbox-style From line, not rfc822 From: header field.
+            fromline = 'From %s %s' % (mbox_from_escape(self.sender),
+                                       time.asctime()) + os.linesep
+        else:
+            fromline = ''
         # Write the Return-Path: header
-        f.write(format_header('Return-Path', '<%s>' % self.sender))
+        rpline = format_header('Return-Path', '<%s>' % self.sender)
         # Remove previous Return-Path: header fields.
         del self.__msg['Return-Path']
         if delivered_to:
-            f.write(format_header('Delivered-To', self.recipient or 'unknown'))
+            dtline = format_header('Delivered-To', self.recipient or 'unknown')
+        else:
+            dtline = ''
         if received:
             content = 'from %s by %s with %s' % (
                 self.received_from, self.received_by, self.received_with
@@ -151,13 +161,20 @@ class Message(object):
                 content += ' for <%s>' % self.recipient
             content += '; ' + time.strftime('%d %b %Y %H:%M:%S -0000',
                                             time.gmtime())
-            f.write(format_header('Received', content))
-        gen = Generator(f, mangle_from, 0)
+            receivedline = format_header('Received', content)
+        else:
+            receivedline = ''
         # From_ handled above, always tell the generator not to include it
         try:
+            tmpf = cStringIO.StringIO()
+            gen = Generator(tmpf, False, 0)
             gen.flatten(self.__msg, False)
-            f.seek(0)
-            return os.linesep.join(f.read().splitlines() + [''])
+            strmsg = tmpf.getvalue()
+            if mangle_from:
+                # do mboxrd-style "From " line quoting
+                strmsg = RE_FROMLINE.sub(r'>\1', strmsg)
+            return (fromline + rpline + dtline + receivedline 
+                    + os.linesep.join(strmsg.splitlines() + ['']))
         except TypeError, o:
             # email module chokes on some badly-misformatted messages, even
             # late during flatten().  Hope this is fixed in Python 2.4.
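
The docstring note in the diff above says the generator's own mangling
only quotes "From " and not ">From ". A small check along those lines is
sketched below. It is written against the Python 3 module
email.generator rather than the Python 2 email.Generator that getmail
imports, so it illustrates the behavior without reproducing getmail's
environment; the flatten helper is a made-up name.

```python
import io
from email import message_from_string
from email.generator import Generator

raw = "Subject: test\n\nFrom the top\n>From a quoted line\n"
msg = message_from_string(raw)


def flatten(message, mangle):
    """Hypothetical helper: flatten a message with or without mangling."""
    buf = io.StringIO()
    gen = Generator(buf, mangle_from_=mangle, maxheaderlen=0)
    # unixfrom=False: do not emit an mbox-style "From " envelope line.
    gen.flatten(message, unixfrom=False)
    return buf.getvalue()


mangled = flatten(msg, True)
unmangled = flatten(msg, False)

# With mangling on, only the bare "From " line is expected to change.
print(mangled)
print(unmangled)
```

This is why the patch passes False to the generator and applies
RE_FROMLINE to the flattened string itself.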
