Bug#656288: Bug#625509: python-debian: please port to Py3k

To: Tshepang Lekhonkhobe <tshepang@gmail.com>, 625509@bugs.debian.org
Cc: Julian Andres Klode <jak@debian.org>, 656288@bugs.debian.org
Subject: Bug#656288: Bug#625509: python-debian: please port to Py3k
From: Colin Watson <cjwatson@ubuntu.com>
Date: Sun, 22 Jan 2012 14:37:55 +0000
Message-id: <20120122143755.GB14682@riva.dynamic.greenend.org.uk>
Reply-to: Colin Watson <cjwatson@ubuntu.com>, 656288@bugs.debian.org
In-reply-to: <20120118105428.GA11247@riva.dynamic.greenend.org.uk>
References: <20110504011029.13209.31764.reportbug@debian.tauspace.local> <20120118105428.GA11247@riva.dynamic.greenend.org.uk>

On Wed, Jan 18, 2012 at 10:54:28AM +0000, Colin Watson wrote:
> On Wed, May 04, 2011 at 03:10:29AM +0200, Tshepang Lekhonkhobe wrote:
> > Can you either make this package capable of running for Python 2 and 3,
> > or make separate packages for it, as python-apt does.
> 
> I'm working on this here:
> 
>   http://anonscm.debian.org/gitweb/?p=users/cjwatson/python-debian.git;a=shortlog;h=refs/heads/python3
> 
> I will probably end up depending on the six module, which I uploaded to
> unstable yesterday.  It's tiny, so I shouldn't expect this to cause much
> of a problem.
> 
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=656288 in python3-apt
> is getting in the way a bit, but I suppose worst case I can just skip
> those tests when running under Python 3 for now.

I believe this port is now complete, in the git branch above.  It passes
all tests provided that a version of python3-apt with the most recent
patch in #656288 is available.

I would very much appreciate review of this branch.  In case it eases
review, I've attached the 31-patch series (!) to this mail.  I've tried
to arrange it roughly in ascending order of complexity.

Cheers,

-- 
Colin Watson                                       [cjwatson@ubuntu.com]

From 1b23822199b02e5076e04dfa00884982ceec0c25 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 13 Jan 2012 00:04:10 +0000
Subject: [PATCH 01/31] Fix test warnings with python2.7 -3.

---
 lib/debian/deb822.py         |   14 ++++++++------
 lib/debian/debian_support.py |    5 +----
 lib/debian/debtags.py        |   20 ++++++++++----------
 tests/test_debian_support.py |    2 +-
 4 files changed, 20 insertions(+), 21 deletions(-)
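
For review, the pattern applied throughout this patch can be sketched
roughly as below.  This is an illustration only, not code from the
patch: CaseInsensitiveKeys is a hypothetical stand-in for Deb822Dict,
and the lower() folding is an assumption standing in for _strI.

```python
class CaseInsensitiveKeys(object):
    """Hypothetical stand-in for a dict-like class with case-insensitive keys."""

    def __init__(self, keys):
        self._keys = set(k.lower() for k in keys)

    # Replaces has_key(): this is the method the "in" operator calls.
    def __contains__(self, key):
        return key.lower() in self._keys

    def __eq__(self, other):
        return self._keys == other._keys

    # A class that defines __eq__ without __hash__ is unhashable on
    # Python 3; stating it explicitly gives the same behaviour on
    # Python 2 and avoids the warning from "python2.7 -3".
    __hash__ = None


fields = CaseInsensitiveKeys(["Package", "Version"])
has_package = "package" in fields          # True
has_arch = "Architecture" in fields        # False

# dict.get() folds "d.has_key(k) and d[k] is not None" into one lookup,
# because get() returns None for a missing key.
dep = {"name": "mutt", "version": None}
version_is_set = dep.get("version") is not None
arch_is_set = dep.get("arch") is not None
```

Both version_is_set and arch_is_set come out False here, which is the
case the old two-step has_key test was guarding against.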

diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index 4c5b74e..7e8d0a6 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -169,7 +169,7 @@ class Deb822Dict(object, UserDict.DictMixin):
             if _fields is None:
                 self.__keys.extend([ _strI(k) for k in self.__parsed.keys() ])
             else:
-                self.__keys.extend([ _strI(f) for f in _fields if self.__parsed.has_key(f) ])
+                self.__keys.extend([ _strI(f) for f in _fields if f in self.__parsed ])
         
     ### BEGIN DictMixin methods
 
@@ -221,7 +221,7 @@ class Deb822Dict(object, UserDict.DictMixin):
             # only been in the self.__parsed dict.
             pass
 
-    def has_key(self, key):
+    def __contains__(self, key):
         key = _strI(key)
         return key in self.__keys
     
@@ -246,6 +246,8 @@ class Deb822Dict(object, UserDict.DictMixin):
         # If we got here, everything matched
         return True
 
+    __hash__ = None
+
     def copy(self):
         # Use self.__class__ so this works as expected for subclasses
         copy = self.__class__(self)
@@ -665,7 +667,7 @@ class GpgInfo(dict):
 
     def valid(self):
         """Is the signature valid?"""
-        return self.has_key('GOODSIG') or self.has_key('VALIDSIG')
+        return 'GOODSIG' in self or 'VALIDSIG' in self
     
 # XXX implement as a property?
 # XXX handle utf-8 %-encoding
@@ -846,9 +848,9 @@ class PkgRelation(object):
 
         def pp_atomic_dep(dep):
             s = dep['name']
-            if dep.has_key('version') and dep['version'] is not None:
+            if dep.get('version') is not None:
                 s += ' (%s %s)' % dep['version']
-            if dep.has_key('arch') and dep['arch'] is not None:
+            if dep.get('arch') is not None:
                 s += ' [%s]' % string.join(map(pp_arch, dep['arch']))
             return s
 
@@ -894,7 +896,7 @@ class _PkgRelationMixin(object):
             # name) of Deb822 objects on the dictionary returned by the
             # relations property.
             keyname = name.lower()
-            if self.has_key(name):
+            if name in self:
                 self.__relations[keyname] = None   # lazy value
                     # all lazy values will be expanded before setting
                     # __parsed_relations to True
diff --git a/lib/debian/debian_support.py b/lib/debian/debian_support.py
index d9ce24a..f0577ac 100644
--- a/lib/debian/debian_support.py
+++ b/lib/debian/debian_support.py
@@ -358,10 +358,7 @@ def list_releases():
 listReleases = function_deprecated_by(list_releases)
 
 def intern_release(name, releases=list_releases()):
-    if releases.has_key(name):
-        return releases[name]
-    else:
-        return None
+    return releases.get(name)
 
 internRelease = function_deprecated_by(intern_release)
 
diff --git a/lib/debian/debtags.py b/lib/debian/debtags.py
index cc44f14..526394d 100644
--- a/lib/debian/debtags.py
+++ b/lib/debian/debtags.py
@@ -51,7 +51,7 @@ def read_tag_database_reversed(input):
 	for pkgs, tags in parse_tags(input):
 		# Create the tag set using the native set
 		for tag in tags:
-			if db.has_key(tag):
+			if tag in db:
 				db[tag] |= pkgs
 			else:
 				db[tag] = pkgs.copy()
@@ -72,7 +72,7 @@ def read_tag_database_both_ways(input, tag_filter = None):
 		for pkg in pkgs:
 			db[pkg] = tags.copy()
 		for tag in tags:
-			if dbr.has_key(tag):
+			if tag in dbr:
 				dbr[tag] |= pkgs
 			else:
 				dbr[tag] = pkgs.copy()
@@ -85,7 +85,7 @@ def reverse(db):
 	res = {}
 	for pkg, tags in db.items():
 		for tag in tags:
-			if not res.has_key(tag):
+			if tag not in res:
 				res[tag] = set()
 			res[tag].add(pkg)
 	return res
@@ -165,7 +165,7 @@ class DB:
 	def insert(self, pkg, tags):
 		self.db[pkg] = tags.copy()
 		for tag in tags:
-			if self.rdb.has_key(tag):
+			if tag in self.rdb:
 				self.rdb[tag].add(pkg)
 			else:
 				self.rdb[tag] = set((pkg))
@@ -229,7 +229,7 @@ class DB:
 		res = DB()
 		db = {}
 		for pkg in package_iter:
-			if self.db.has_key(pkg): db[pkg] = self.db[pkg]
+			if pkg in self.db: db[pkg] = self.db[pkg]
 		res.db = db
 		res.rdb = reverse(db)
 		return res
@@ -349,25 +349,25 @@ class DB:
 
 	def has_package(self, pkg):
 		"""Check if the collection contains the given package"""
-		return self.db.has_key(pkg)
+		return pkg in self.db
 
 	hasPackage = function_deprecated_by(has_package)
 
 	def has_tag(self, tag):
 		"""Check if the collection contains packages tagged with tag"""
-		return self.rdb.has_key(tag)
+		return tag in self.rdb
 
 	hasTag = function_deprecated_by(has_tag)
 
 	def tags_of_package(self, pkg):
 		"""Return the tag set of a package"""
-		return self.db.has_key(pkg) and self.db[pkg] or set()
+		return pkg in self.db and self.db[pkg] or set()
 
 	tagsOfPackage = function_deprecated_by(tags_of_package)
 
 	def packages_of_tag(self, tag):
 		"""Return the package set of a tag"""
-		return self.rdb.has_key(tag) and self.rdb[tag] or set()
+		return tag in self.rdb and self.rdb[tag] or set()
 
 	packagesOfTag = function_deprecated_by(packages_of_tag)
 
@@ -399,7 +399,7 @@ class DB:
 		"""
 		Return the cardinality of a tag
 		"""
-		return self.rdb.has_key(tag) and len(self.rdb[tag]) or 0
+		return tag in self.rdb and len(self.rdb[tag]) or 0
 
 	def discriminance(self, tag):
 		"""
diff --git a/tests/test_debian_support.py b/tests/test_debian_support.py
index 205c037..d29fb1a 100755
--- a/tests/test_debian_support.py
+++ b/tests/test_debian_support.py
@@ -141,7 +141,7 @@ class VersionTests(unittest.TestCase):
             v1 = cls1(v1_str)
             v2 = cls2(v2_str)
             truth_fn = self._get_truth_fn(cmp_oper)
-            self.failUnless(truth_fn(v1, v2) == True,
+            self.assertTrue(truth_fn(v1, v2) == True,
                             "%r %s %r != True" % (v1, cmp_oper, v2))
 
     def test_comparisons(self):
-- 
1.7.8.3

From d649b4b23ff82fb3743008548fad59cc071d0f41 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 13 Jan 2012 00:19:57 +0000
Subject: [PATCH 02/31] Avoid various old syntactic forms which are no longer
 present in Python 3.  All of these work from at least
 Python 2.6.

---
 examples/deb822/grep-maintainer |    2 +-
 examples/debtags/wxssearch      |    1 -
 lib/debian/arfile.py            |   14 +++++++-------
 lib/debian/deb822.py            |   13 ++++++-------
 lib/debian/debian_support.py    |   32 +++++++++++++++-----------------
 tests/test_deb822.py            |    9 ---------
 6 files changed, 29 insertions(+), 42 deletions(-)
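
As a quick reference for review, the replacement forms used in this
patch look roughly like this when put side by side.  The sketch is
illustrative only: ArError here is a hypothetical local class and
check_header is an invented helper, neither taken from the patch.

```python
class ArError(Exception):
    """Hypothetical stand-in for the exception class in arfile.py."""


def check_header(buf, expected):
    # "!=" replaces the "<>" operator, which Python 3 removed.
    if buf != expected:
        # Call syntax replaces "raise ArError, 'message'";
        # repr() replaces the backtick shorthand.
        raise ArError("unexpected header: " + repr(buf))
    return True


try:
    check_header("bad", "!<arch>\n")
except ArError as e:        # "as" replaces "except ArError, e"
    message = str(e)

# sorted() returns a new list, so it also works on the views and
# iterators that Python 3 returns where Python 2 returned lists.
names = sorted({"mutt": 1, "apt": 2}.keys())
```

All of these forms are accepted by Python 2.6 and later as well as by
Python 3.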

diff --git a/examples/deb822/grep-maintainer b/examples/deb822/grep-maintainer
index 24f2f84..53a956c 100755
--- a/examples/deb822/grep-maintainer
+++ b/examples/deb822/grep-maintainer
@@ -19,7 +19,7 @@ try:
 except IndexError:
     print >>sys.stderr, "Usage: grep-maintainer REGEXP"
     sys.exit(1)
-except re.error, e:
+except re.error as e:
     print >>sys.stderr, "Error in the regexp: %s" % (e,)
     sys.exit(1)
 
diff --git a/examples/debtags/wxssearch b/examples/debtags/wxssearch
index 4fb3c4c..a0aeb37 100755
--- a/examples/debtags/wxssearch
+++ b/examples/debtags/wxssearch
@@ -248,7 +248,6 @@ class Results(wx.ListCtrl):
 
     def model_changed(self, event):
         self.packages = sorted(self.model.subcoll.iter_packages())
-        self.packages.sort()
         self.SetItemCount(len(self.packages))
         self.resize_columns()
         event.Skip()
diff --git a/lib/debian/arfile.py b/lib/debian/arfile.py
index 9ad757e..c9cb87a 100644
--- a/lib/debian/arfile.py
+++ b/lib/debian/arfile.py
@@ -53,10 +53,10 @@ class ArFile(object):
         elif self.__fileobj:
             fp = self.__fileobj
         else:
-            raise ArError, "Unable to open valid file"
+            raise ArError("Unable to open valid file")
 
         if fp.read(GLOBAL_HEADER_LENGTH) != GLOBAL_HEADER:
-            raise ArError, "Unable to find global header"
+            raise ArError("Unable to find global header")
 
         while True:
             newmember = ArMember.from_file(fp, self.__fname)
@@ -100,12 +100,12 @@ class ArFile(object):
     def extractall():
         """ Not (yet) implemented. """
 
-        raise NotImpelementedError  # TODO
+        raise NotImplementedError  # TODO
 
     def extract(self, member, path):
         """ Not (yet) implemented. """
 
-        raise NotImpelementedError  # TODO
+        raise NotImplementedError  # TODO
 
     def extractfile(self, member):
         """ Return a file object corresponding to the requested member. A member
@@ -171,10 +171,10 @@ class ArMember(object):
 
         # sanity checks
         if len(buf) < FILE_HEADER_LENGTH:
-            raise IOError, "Incorrect header length"
+            raise IOError("Incorrect header length")
 
         if buf[58:60] != FILE_MAGIC:
-            raise IOError, "Incorrect file magic"
+            raise IOError("Incorrect file magic")
 
         # http://en.wikipedia.org/wiki/Ar_(Unix)    
         #from   to     Name                      Format
@@ -263,7 +263,7 @@ class ArMember(object):
             self.__fp.seek(self.__offset)
 
         if whence < 2 and offset + self.__fp.tell() < self.__offset:
-            raise IOError, "Can't seek at %d" % offset
+            raise IOError("Can't seek at %d" % offset)
         
         if whence == 1:
             self.__fp.seek(offset, 1)
diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index 7e8d0a6..f838da8 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -192,7 +192,7 @@ class Deb822Dict(object, UserDict.DictMixin):
             # Always return unicode objects instead of strings
             try:
                 value = value.decode(self.encoding)
-            except UnicodeDecodeError, e:
+            except UnicodeDecodeError as e:
                 # Evidently, the value wasn't encoded with the encoding the
                 # user specified.  Try detecting it.
                 warnings.warn('decoding from %s failed; attempting to detect '
@@ -234,8 +234,8 @@ class Deb822Dict(object, UserDict.DictMixin):
         return '{%s}' % ', '.join(['%r: %r' % (k, v) for k, v in self.items()])
 
     def __eq__(self, other):
-        mykeys = self.keys(); mykeys.sort()
-        otherkeys = other.keys(); otherkeys.sort()
+        mykeys = sorted(self.keys())
+        otherkeys = sorted(other.keys())
         if not mykeys == otherkeys:
             return False
 
@@ -483,8 +483,7 @@ class Deb822(Deb822Dict):
             if (s1 + s2).count(', '):
                 delim = ', '
 
-            L = (s1 + delim + s2).split(delim)
-            L.sort()
+            L = sorted((s1 + delim + s2).split(delim))
 
             prev = merged = L[0]
 
@@ -619,7 +618,7 @@ class Deb822(Deb822Dict):
         # _gpg_multivalued.__init__) which is small compared to Packages or
         # Sources which contain no signature
         if not hasattr(self, 'raw_text'):
-            raise ValueError, "original text cannot be found"
+            raise ValueError("original text cannot be found")
 
         if self.gpg_info is None:
             self.gpg_info = GpgInfo.from_sequence(self.raw_text,
@@ -736,7 +735,7 @@ class GpgInfo(dict):
             args.extend(["--keyring", k])
         
         if "--keyring" not in args:
-            raise IOError, "cannot access any of the given keyrings"
+            raise IOError("cannot access any of the given keyrings")
 
         p = subprocess.Popen(args, stdin=subprocess.PIPE,
                              stdout=subprocess.PIPE, stderr=subprocess.PIPE)
diff --git a/lib/debian/debian_support.py b/lib/debian/debian_support.py
index f0577ac..a5d5a3d 100644
--- a/lib/debian/debian_support.py
+++ b/lib/debian/debian_support.py
@@ -53,9 +53,9 @@ class ParseError(Exception):
         return self.msg
 
     def __repr__(self):
-        return "ParseError(%s, %d, %s)" % (`self.filename`,
+        return "ParseError(%s, %d, %s)" % (repr(self.filename),
                                            self.lineno,
-                                           `self.msg`)
+                                           repr(self.msg))
 
     def print_out(self, file):
         """Writes a machine-parsable error message to file."""
@@ -192,7 +192,7 @@ class NativeVersion(BaseVersion):
         if not isinstance(other, BaseVersion):
             try:
                 other = BaseVersion(str(other))
-            except ValueError, e:
+            except ValueError as e:
                 raise ValueError("Couldn't convert %r to BaseVersion: %s"
                                  % (other, e))
 
@@ -337,7 +337,7 @@ class PseudoEnum:
         self._name = name
         self._order = order
     def __repr__(self):
-        return '%s(%s)'% (self.__class__._name__, `name`)
+        return '%s(%s)'% (self.__class__._name__, repr(name))
     def __str__(self):
         return self._name
     def __cmp__(self, other):
@@ -392,7 +392,7 @@ def patches_from_ed_script(source,
     for line in i:
         match = re_cmd.match(line)
         if match is None:
-            raise ValueError, "invalid patch command: " + `line`
+            raise ValueError("invalid patch command: " + repr(line))
 
         (first, last, cmd) = match.groups()
         first = int(first)
@@ -408,7 +408,7 @@ def patches_from_ed_script(source,
 
         if cmd == 'a':
             if last is not None:
-                raise ValueError, "invalid patch argument: " + `line`
+                raise ValueError("invalid patch argument: " + repr(line))
             last = first
         else:                           # cmd == c
             first = first - 1
@@ -418,7 +418,7 @@ def patches_from_ed_script(source,
         lines = []
         for l in i:
             if l == '':
-                raise ValueError, "end of stream in command: " + `line`
+                raise ValueError("end of stream in command: " + repr(line))
             if l == '.\n' or l == '.':
                 break
             lines.append(l)
@@ -561,7 +561,7 @@ def update_file(remote, local, verbose=None):
                 continue
             
             if verbose:
-                print "update_file: field %s ignored" % `field`
+                print "update_file: field %s ignored" % repr(field)
         
     if not patches_to_apply:
         if verbose:
@@ -569,17 +569,17 @@ def update_file(remote, local, verbose=None):
         return download_file(remote, local)
 
     for patch_name in patches_to_apply:
-        print "update_file: downloading patch " + `patch_name`
+        print "update_file: downloading patch " + repr(patch_name)
         patch_contents = download_gunzip_lines(remote + '.diff/' + patch_name
                                           + '.gz')
-        if read_lines_sha1(patch_contents ) <> patch_hashes[patch_name]:
-            raise ValueError, "patch %s was garbled" % `patch_name`
+        if read_lines_sha1(patch_contents ) != patch_hashes[patch_name]:
+            raise ValueError("patch %s was garbled" % repr(patch_name))
         patch_lines(lines, patches_from_ed_script(patch_contents))
         
     new_hash = read_lines_sha1(lines)
-    if new_hash <> remote_hash:
-        raise ValueError, ("patch failed, got %s instead of %s"
-                           % (new_hash, remote_hash))
+    if new_hash != remote_hash:
+        raise ValueError("patch failed, got %s instead of %s"
+                         % (new_hash, remote_hash))
 
     replace_file(lines, local)
     return lines
@@ -593,8 +593,6 @@ def merge_as_sets(*args):
     for x in args:
         for y in x:
             s[y] = True
-    l = s.keys()
-    l.sort()
-    return l
+    return sorted(s.keys())
 
 mergeAsSets = function_deprecated_by(merge_as_sets)
diff --git a/tests/test_deb822.py b/tests/test_deb822.py
index c3806bd..82c3540 100755
--- a/tests/test_deb822.py
+++ b/tests/test_deb822.py
@@ -29,15 +29,6 @@ sys.path.insert(0, '../lib/debian/')
 
 import deb822
 
-# Keep the test suite compatible with python2.3 for now
-try:
-    sorted
-except NameError:
-    def sorted(iterable, cmp=None):
-        tmp = iterable[:]
-        tmp.sort(cmp)
-        return tmp
-
 
 UNPARSED_PACKAGE = '''\
 Package: mutt
-- 
1.7.8.3

From d10d6cb855f11814235e87e3188619a652cbf96a Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 13 Jan 2012 00:36:45 +0000
Subject: [PATCH 03/31] Use Python 3-style print function.

---
 lib/debian/arfile.py         |    2 +-
 lib/debian/deb822.py         |    8 +++++---
 lib/debian/debfile.py        |    2 +-
 lib/debian/debian_support.py |   14 +++++++-------
 lib/debian/debtags.py        |    2 +-
 lib/debian/doc-debtags       |   30 +++++++++++++++---------------
 6 files changed, 30 insertions(+), 28 deletions(-)
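
A minimal sketch of the two print idioms this patch converts, for
illustration only.  The message text and the io.StringIO buffer are
assumptions made for the example.  On Python 2.6 and 2.7 the module
would also need "from __future__ import print_function" as its first
statement, as deb822.py gains in this patch.

```python
import io
import sys

raw = "foo (>> 1"

# Statement form on Python 2:  print >> sys.stderr, "message"
# Function form: the destination becomes the file= keyword argument.
print('WARNING: cannot parse package relationship "%s"' % raw,
      file=sys.stderr)

# Several positional arguments are joined with a single space, which
# matches what the comma-separated statement form used to do.
buf = io.StringIO()
print(u"update_file: could not find historic entry", u"abc123", file=buf)
captured = buf.getvalue()
```

Without the __future__ import, Python 2 would parse the multi-argument
call as printing a tuple, so the import matters wherever print is given
more than one argument.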

diff --git a/lib/debian/arfile.py b/lib/debian/arfile.py
index c9cb87a..6c5d252 100644
--- a/lib/debian/arfile.py
+++ b/lib/debian/arfile.py
@@ -311,4 +311,4 @@ if __name__ == '__main__':
     # test
     # ar r test.ar <file1> <file2> .. <fileN>
     a = ArFile("test.ar")
-    print "\n".join(a.getnames())
+    print("\n".join(a.getnames()))
diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index f838da8..d50ad2b 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -22,6 +22,8 @@
 # along with this program; if not, write to the Free Software
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
 
+from __future__ import print_function
+
 from deprecation import function_deprecated_by
 
 try:
@@ -822,9 +824,9 @@ class PkgRelation(object):
                     d['arch'] = parse_archs(parts['archs'])
                 return d
             else:
-                print >> sys.stderr, \
-                        'deb822.py: WARNING: cannot parse package' \
-                        ' relationship "%s", returning it raw' % raw
+                print('deb822.py: WARNING: cannot parse package' \
+                      ' relationship "%s", returning it raw' % raw,
+                      file=sys.stderr)
                 return { 'name': raw, 'version': None, 'arch': None }
 
         tl_deps = cls.__comma_sep_RE.split(raw.strip()) # top-level deps
diff --git a/lib/debian/debfile.py b/lib/debian/debfile.py
index a2a62f6..8ccfd12 100644
--- a/lib/debian/debfile.py
+++ b/lib/debian/debfile.py
@@ -278,5 +278,5 @@ if __name__ == '__main__':
     import sys
     deb = DebFile(filename=sys.argv[1])
     tgz = deb.control.tgz()
-    print tgz.getmember('control')
+    print(tgz.getmember('control'))
 
diff --git a/lib/debian/debian_support.py b/lib/debian/debian_support.py
index a5d5a3d..368deb0 100644
--- a/lib/debian/debian_support.py
+++ b/lib/debian/debian_support.py
@@ -499,7 +499,7 @@ def update_file(remote, local, verbose=None):
         local_file = file(local)
     except IOError:
         if verbose:
-            print "update_file: no local copy, downloading full file"
+            print("update_file: no local copy, downloading full file")
         return download_file(remote, local)
 
     lines = local_file.readlines()
@@ -520,11 +520,11 @@ def update_file(remote, local, verbose=None):
         # FIXME: urllib does not raise a proper exception, so we parse
         # the error message.
         if verbose:
-            print "update_file: could not interpret patch index file"
+            print("update_file: could not interpret patch index file")
         return download_file(remote, local)
     except IOError:
         if verbose:
-            print "update_file: could not download patch index file"
+            print("update_file: could not download patch index file")
         return download_file(remote, local)
 
     for fields in index_fields:
@@ -533,7 +533,7 @@ def update_file(remote, local, verbose=None):
                 (remote_hash, remote_size) = re_whitespace.split(value)
                 if local_hash == remote_hash:
                     if verbose:
-                        print "update_file: local file is up-to-date"
+                        print("update_file: local file is up-to-date")
                     return lines
                 continue
 
@@ -561,15 +561,15 @@ def update_file(remote, local, verbose=None):
                 continue
             
             if verbose:
-                print "update_file: field %s ignored" % repr(field)
+                print("update_file: field %s ignored" % repr(field))
         
     if not patches_to_apply:
         if verbose:
-            print "update_file: could not find historic entry", local_hash
+            print("update_file: could not find historic entry", local_hash)
         return download_file(remote, local)
 
     for patch_name in patches_to_apply:
-        print "update_file: downloading patch " + repr(patch_name)
+        print("update_file: downloading patch " + repr(patch_name))
         patch_contents = download_gunzip_lines(remote + '.diff/' + patch_name
                                           + '.gz')
         if read_lines_sha1(patch_contents ) != patch_hashes[patch_name]:
diff --git a/lib/debian/debtags.py b/lib/debian/debtags.py
index 526394d..0b62880 100644
--- a/lib/debian/debtags.py
+++ b/lib/debian/debtags.py
@@ -96,7 +96,7 @@ def output(db):
 	for pkg, tags in db.items():
 		# Using % here seems awkward to me, but if I use calls to
 		# sys.stdout.write it becomes a bit slower
-		print "%s:" % (pkg), ", ".join(tags)
+		print("%s:" % (pkg), ", ".join(tags))
 
 
 def relevance_index_function(full, sub):
diff --git a/lib/debian/doc-debtags b/lib/debian/doc-debtags
index fecc77f..16503f3 100755
--- a/lib/debian/doc-debtags
+++ b/lib/debian/doc-debtags
@@ -15,10 +15,10 @@ def document (callable):
 	if callable.__doc__ != None:
 		print_indented(2, callable.__name__)
 		print_indented(4, inspect.getdoc(callable))
-		print
+		print()
 
 
-print """debtags.py README
+print("""debtags.py README
 =================
 
 The Debtags python module provides support for accessing and manipulating
@@ -41,24 +41,24 @@ Classes
 =======
 
 There is only one class: debtags.DB:
-"""
+""")
 
 document (debtags.DB)
 
-print """
+print("""
 The methods of debtags.DB are:
-"""
+""")
 
 for m in dir(debtags.DB):
 	if m[0:2] != '__' and callable(getattr(debtags.DB, m)):
 		document(getattr(debtags.DB, m))
 
-print """Iteration
+print("""Iteration
 =========
 
 debtags.DB provides various iteration methods to iterate the collection either
 in a package-centered or in a tag-centered way:
-"""
+""")
 
 document(debtags.DB.iter_packages)
 document(debtags.DB.iter_packages_tags)
@@ -66,7 +66,7 @@ document(debtags.DB.iter_tags)
 document(debtags.DB.iter_tags_packages)
 
 
-print """Sample usage
+print("""Sample usage
 ============
 
 This example reads the system debtags database and performs a simple tag
@@ -76,10 +76,10 @@ search::
     
     db = debtags.DB()
     db.read(open("/var/lib/debtags/package-tags", "r"))
-    print db.package_count(), "packages in the database"
-    print "Image editors:"
+    print(db.package_count(), "packages in the database")
+    print("Image editors:")
     for pkg in db.packages_of_tags(set(("use::editing", "works-with::image:raster"))):
-    	print " *", pkg
+    	print(" *", pkg)
 
 This example computes the set of tags that belong to all the packages in a
 list, then shows all the other packages that have those tags:
@@ -89,10 +89,10 @@ list, then shows all the other packages that have those tags:
     db = debtags.DB()
     db.read(open("/var/lib/debtags/package-tags", "r"))
     tags = db.tags_of_packages(("gimp", "krita"))
-    print "Common tags:"
+    print("Common tags:")
     for tag in tags:
-	print " *", tag
-    print "Packages similar to gimp and krita:"
+	print(" *", tag)
+    print("Packages similar to gimp and krita:")
     for pkg in db.packages_of_tags(tags):
-	print " *", pkg
+	print(" *", pkg)
 """
-- 
1.7.8.3

From ca13c2f50ed0e1cd8e372e04fa0b2353087fc5cd Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 13 Jan 2012 16:08:44 +0000
Subject: [PATCH 04/31] Use a list comprehension instead of map, which returns
 an iterator in Python 3.

---
 lib/debian/arfile.py |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
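
To illustrate why this matters for a method documented as returning a
list, here is a small sketch.  Member is a hypothetical stand-in for
ArMember and the member names are made up for the example.

```python
class Member(object):
    """Hypothetical stand-in for ArMember; only .name matters here."""

    def __init__(self, name):
        self.name = name


members = [Member("debian-binary"), Member("control.tar.gz")]

# On Python 3, map() gives a one-shot iterator: it has no len(), cannot
# be indexed, and is exhausted after a single pass.
lazy = map(lambda f: f.name, members)
first_pass = list(lazy)
second_pass = list(lazy)     # empty on Python 3

# The list comprehension is a real list on both Python 2 and Python 3.
names = [f.name for f in members]
```

Callers that index the result or iterate over it twice keep working
with the comprehension, but not with the Python 3 map object.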

diff --git a/lib/debian/arfile.py b/lib/debian/arfile.py
index 6c5d252..a9b132a 100644
--- a/lib/debian/arfile.py
+++ b/lib/debian/arfile.py
@@ -95,7 +95,7 @@ class ArFile(object):
     def getnames(self):
         """ Return a list of all member names in the archive. """
 
-        return map(lambda f: f.name, self.__members)
+        return [f.name for f in self.__members]
 
     def extractall():
         """ Not (yet) implemented. """
-- 
1.7.8.3

From 1fd206402ece280395389006fc4977d39a17fa31 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 13 Jan 2012 16:10:19 +0000
Subject: [PATCH 05/31] Use iterkeys/iteritems when an iterator is all we
 need.

---
 lib/debian/changelog.py |    4 ++--
 lib/debian/deb822.py    |    6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)
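
Note for review: dict.iterkeys() and dict.iteritems() exist only on
Python 2, so this step on its own does not run on Python 3; presumably
a later patch bridges the gap, for instance through the six module
mentioned earlier in this thread.  A minimal sketch of such a bridge,
using a hypothetical helper rather than six itself:

```python
def iteritems(d):
    """Hypothetical helper: iterate over (key, value) pairs without
    building an intermediate list on either Python 2 or Python 3."""
    method = getattr(d, "iteritems", None)
    if method is not None:
        return method()          # Python 2 dict
    return iter(d.items())       # Python 3: items() is already a view


other_pairs = {"urgency": "low", "binary-only": "yes"}
block = ""
# sorted() is used only to make the example output deterministic.
for key, value in sorted(iteritems(other_pairs)):
    block += ", %s=%s" % (key, value)
```

The helper name and the sample dictionary are assumptions made for the
illustration and do not appear in the patch series.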

diff --git a/lib/debian/changelog.py b/lib/debian/changelog.py
index 93c348f..ae2d1fd 100644
--- a/lib/debian/changelog.py
+++ b/lib/debian/changelog.py
@@ -94,7 +94,7 @@ class ChangeBlock(object):
 
     def other_keys_normalised(self):
         norm_dict = {}
-        for (key, value) in other_pairs.items():
+        for (key, value) in other_pairs.iteritems():
             key = key[0].upper() + key[1:].lower()
             m = xbcs_re.match(key)
             if m is None:
@@ -143,7 +143,7 @@ class ChangeBlock(object):
         if self.urgency is None:
             raise ChangelogCreateError("Urgency not specified")
         block += "urgency=" + self.urgency + self.urgency_comment
-        for (key, value) in self.other_pairs.items():
+        for (key, value) in self.other_pairs.iteritems():
             block += ", %s=%s" % (key, value)
         block += '\n'
         if self.changes() is None:
diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index d50ad2b..62a464f 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -169,7 +169,7 @@ class Deb822Dict(object, UserDict.DictMixin):
         if _parsed is not None:
             self.__parsed = _parsed
             if _fields is None:
-                self.__keys.extend([ _strI(k) for k in self.__parsed.keys() ])
+                self.__keys.extend([ _strI(k) for k in self.__parsed.iterkeys() ])
             else:
                 self.__keys.extend([ _strI(f) for f in _fields if f in self.__parsed ])
         
@@ -236,8 +236,8 @@ class Deb822Dict(object, UserDict.DictMixin):
         return '{%s}' % ', '.join(['%r: %r' % (k, v) for k, v in self.items()])
 
     def __eq__(self, other):
-        mykeys = sorted(self.keys())
-        otherkeys = sorted(other.keys())
+        mykeys = sorted(self.iterkeys())
+        otherkeys = sorted(other.iterkeys())
         if not mykeys == otherkeys:
             return False
 
-- 
1.7.8.3

From bf53d2f9a95ff31894bcba7f2dc3464b213b9eb9 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 13 Jan 2012 16:12:59 +0000
Subject: [PATCH 06/31] Use absolute imports.

---
 lib/debian/changelog.py      |    2 +-
 lib/debian/deb822.py         |    2 +-
 lib/debian/debfile.py        |    6 +++---
 lib/debian/debian_support.py |    2 +-
 lib/debian/debtags.py        |    2 +-
 tests/test_changelog.py      |    4 ++--
 tests/test_deb822.py         |    4 ++--
 tests/test_debfile.py        |    6 +++---
 tests/test_debian_support.py |    6 +++---
 tests/test_debtags.py        |    4 ++--
 10 files changed, 19 insertions(+), 19 deletions(-)
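
The reason for changing both the imports and the sys.path entries in
the tests can be sketched with a throwaway package.  Everything below
is an assumption made for illustration: demo_pkg, helper and user are
invented names standing in for debian, deprecation and deb822.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package: demo_pkg/{__init__,helper,user}.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
sources = {
    "__init__.py": "",
    "helper.py": "VALUE = 42\n",
    # Absolute form.  A bare "import helper" here would only work on
    # Python 2, which searched the package directory implicitly.
    "user.py": "from demo_pkg import helper\nRESULT = helper.VALUE\n",
}
for filename, text in sources.items():
    with open(os.path.join(pkg_dir, filename), "w") as f:
        f.write(text)

# Put the directory that contains the package on sys.path, not the
# package directory itself (compare '../lib/' with '../lib/debian/').
sys.path.insert(0, root)
importlib.invalidate_caches()
user = importlib.import_module("demo_pkg.user")
result = user.RESULT
```

With the package's parent directory on the path, the same import
statements work unchanged on Python 2 and Python 3.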

diff --git a/lib/debian/changelog.py b/lib/debian/changelog.py
index ae2d1fd..d6a8f7a 100644
--- a/lib/debian/changelog.py
+++ b/lib/debian/changelog.py
@@ -29,7 +29,7 @@ import re
 import socket
 import warnings
 
-import debian_support
+from debian import debian_support
 
 class ChangelogParseError(StandardError):
     """Indicates that the changelog could not be parsed"""
diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index 62a464f..6db213a 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -24,7 +24,7 @@
 
 from __future__ import print_function
 
-from deprecation import function_deprecated_by
+from debian.deprecation import function_deprecated_by
 
 try:
     import apt_pkg
diff --git a/lib/debian/debfile.py b/lib/debian/debfile.py
index 8ccfd12..02ab368 100644
--- a/lib/debian/debfile.py
+++ b/lib/debian/debfile.py
@@ -18,9 +18,9 @@
 import gzip
 import tarfile
 
-from arfile import ArFile, ArError
-from changelog import Changelog
-from deb822 import Deb822
+from debian.arfile import ArFile, ArError
+from debian.changelog import Changelog
+from debian.deb822 import Deb822
 
 DATA_PART = 'data.tar'      # w/o extension
 CTRL_PART = 'control.tar'
diff --git a/lib/debian/debian_support.py b/lib/debian/debian_support.py
index 368deb0..f8dc1c1 100644
--- a/lib/debian/debian_support.py
+++ b/lib/debian/debian_support.py
@@ -23,7 +23,7 @@ import re
 import hashlib
 import types
 
-from deprecation import function_deprecated_by
+from debian.deprecation import function_deprecated_by
 
 try:
     import apt_pkg
diff --git a/lib/debian/debtags.py b/lib/debian/debtags.py
index 0b62880..d3df6f7 100644
--- a/lib/debian/debtags.py
+++ b/lib/debian/debtags.py
@@ -17,7 +17,7 @@
 
 import re, cPickle
 
-from deprecation import function_deprecated_by
+from debian.deprecation import function_deprecated_by
 
 def parse_tags(input):
 	lre = re.compile(r"^(.+?)(?::?\s*|:\s+(.+?)\s*)$")
diff --git a/tests/test_changelog.py b/tests/test_changelog.py
index 65e7d66..b1b0067 100755
--- a/tests/test_changelog.py
+++ b/tests/test_changelog.py
@@ -27,9 +27,9 @@
 import sys
 import unittest
 
-sys.path.insert(0, '../lib/debian/')
+sys.path.insert(0, '../lib/')
 
-import changelog
+from debian import changelog
 
 class ChangelogTests(unittest.TestCase):
 
diff --git a/tests/test_deb822.py b/tests/test_deb822.py
index 82c3540..39249b0 100755
--- a/tests/test_deb822.py
+++ b/tests/test_deb822.py
@@ -25,9 +25,9 @@ import unittest
 import warnings
 from StringIO import StringIO
 
-sys.path.insert(0, '../lib/debian/')
+sys.path.insert(0, '../lib/')
 
-import deb822
+from debian import deb822
 
 
 UNPARSED_PACKAGE = '''\
diff --git a/tests/test_debfile.py b/tests/test_debfile.py
index b37dfd7..c594cf0 100755
--- a/tests/test_debfile.py
+++ b/tests/test_debfile.py
@@ -25,10 +25,10 @@ import sys
 import tempfile
 import uu
 
-sys.path.insert(0, '../lib/debian/')
+sys.path.insert(0, '../lib/')
 
-import arfile
-import debfile
+from debian import arfile
+from debian import debfile
 
 class TestArFile(unittest.TestCase):
 
diff --git a/tests/test_debian_support.py b/tests/test_debian_support.py
index d29fb1a..ab60ba9 100755
--- a/tests/test_debian_support.py
+++ b/tests/test_debian_support.py
@@ -21,10 +21,10 @@
 import sys
 import unittest
 
-sys.path.insert(0, '../lib/debian/')
+sys.path.insert(0, '../lib/')
 
-import debian_support
-from debian_support import *
+from debian import debian_support
+from debian.debian_support import *
 
 
 class VersionTests(unittest.TestCase):
diff --git a/tests/test_debtags.py b/tests/test_debtags.py
index cbe6674..27de759 100755
--- a/tests/test_debtags.py
+++ b/tests/test_debtags.py
@@ -20,8 +20,8 @@
 import sys
 import unittest
 
-sys.path.insert(0, '../lib/debian/')
-import debtags
+sys.path.insert(0, '../lib/')
+from debian import debtags
 
 class TestDebtags(unittest.TestCase):
     def mkdb(self):
-- 
1.7.8.3

From 24b912d6fa30a7897ad4df30023aee902ec8536f Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 20 Jan 2012 17:49:32 +0000
Subject: [PATCH 07/31] Use Python 3-style print function in examples.

---
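Note for reviewers: the conversions in this patch follow two patterns.
Single-argument calls such as print("...") behave the same under the
Python 2 statement and the Python 3 function, so those files need no
import.  Files that pass several arguments or redirect to a stream get
"from __future__ import print_function", since otherwise Python 2 would
print a tuple.  A minimal sketch of both patterns follows; the function
names usage() and describe() are made up for illustration and do not
appear in the examples themselves.

```python
from __future__ import print_function

import sys


def usage(stream=sys.stderr):
    # Function-call form of the old statement:
    #     print >>stream, "Usage: grep-maintainer REGEXP"
    print("Usage: grep-maintainer REGEXP", file=stream)


def describe(pkg, desc, stream=sys.stdout):
    # Several arguments are joined with a single space, matching the old
    # statement form:  print pkg, "-", desc
    print(pkg, "-", desc, file=stream)
```

Passing the stream as a parameter is only a convenience for testing the
sketch; the examples in the patch write to sys.stdout or sys.stderr
directly.
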
 examples/changelog/changelog_to_file    |    2 +-
 examples/changelog/simple_changelog     |    2 +-
 examples/deb822/depgraph                |   10 +++---
 examples/deb822/grep-maintainer         |    8 +++--
 examples/deb822/grep_native_packages.py |    2 +-
 examples/deb822/render-dctrl            |   34 ++++++++++++------------
 examples/debfile/ar                     |    8 +++---
 examples/debfile/changelog_head         |    6 ++--
 examples/debfile/dpkg-info              |   16 ++++++------
 examples/debfile/extract_cron           |    4 +-
 examples/debtags/pkgwalk                |    4 +-
 examples/debtags/reverse                |    4 ++-
 examples/debtags/smartsearch            |   42 ++++++++++++++++--------------
 examples/debtags/tagminer               |   14 ++++++----
 examples/debtags/tagsbyrelevance        |    8 +++--
 examples/debtags/wxssearch              |   12 +++++---
 16 files changed, 94 insertions(+), 82 deletions(-)

diff --git a/examples/changelog/changelog_to_file b/examples/changelog/changelog_to_file
index 03ab4a9..cbe1612 100755
--- a/examples/changelog/changelog_to_file
+++ b/examples/changelog/changelog_to_file
@@ -36,7 +36,7 @@ changelog.add_change('');
 try:
   filename = sys.argv[1]
 except IndexError:
-  print "Usage: "+sys.argv[0]+" filename"
+  print("Usage: "+sys.argv[0]+" filename")
   sys.exit(1)
 
 f = open(filename, 'w')
diff --git a/examples/changelog/simple_changelog b/examples/changelog/simple_changelog
index 86de0f7..844f6ea 100755
--- a/examples/changelog/simple_changelog
+++ b/examples/changelog/simple_changelog
@@ -32,5 +32,5 @@ changelog.add_change('');
 changelog.add_change('  * Welcome to changelog.py');
 changelog.add_change('');
 
-print changelog
+print(changelog)
 
diff --git a/examples/deb822/depgraph b/examples/deb822/depgraph
index d2b6782..388d564 100755
--- a/examples/deb822/depgraph
+++ b/examples/deb822/depgraph
@@ -23,7 +23,7 @@ __fresh_id = 0
 
 def main():
     if len(sys.argv) != 2:
-        print "Usage: depgraph PACKAGES_FILE"
+        print("Usage: depgraph PACKAGES_FILE")
         sys.exit(2)
 
     def get_id():
@@ -32,11 +32,11 @@ def main():
         return ("NODE_%d" % __fresh_id)
 
     def emit_arc(node1, node2):
-        print '  "%s" -> "%s" ;' % (node1, node2)
+        print('  "%s" -> "%s" ;' % (node1, node2))
     def emit_node(node, dsc):
-        print '  "%s" [label="%s"] ;' % (node, dsc)
+        print('  "%s" [label="%s"] ;' % (node, dsc))
 
-    print "digraph depgraph {"
+    print("digraph depgraph {")
     for pkg in deb822.Packages.iter_paragraphs(file(sys.argv[1])):
         name = pkg['package']
         rels = pkg.relations
@@ -52,7 +52,7 @@ def main():
                     # even though it is forbidden by policy, there are some
                     # dependencies with upper case letter in the archive,
                     # apparently apt-get turn them to lowercase ...
-    print "}"
+    print("}")
 
 if __name__ == '__main__':
     main()
diff --git a/examples/deb822/grep-maintainer b/examples/deb822/grep-maintainer
index 53a956c..358bf7e 100755
--- a/examples/deb822/grep-maintainer
+++ b/examples/deb822/grep-maintainer
@@ -10,6 +10,8 @@
 
 """Dumb maintainer-based grep for the dpkg status file."""
 
+from __future__ import print_function
+
 import re
 import sys
 from debian import deb822
@@ -17,13 +19,13 @@ from debian import deb822
 try:
     maint_RE = re.compile(sys.argv[1])
 except IndexError:
-    print >>sys.stderr, "Usage: grep-maintainer REGEXP"
+    print("Usage: grep-maintainer REGEXP", file=sys.stderr)
     sys.exit(1)
 except re.error as e:
-    print >>sys.stderr, "Error in the regexp: %s" % (e,)
+    print("Error in the regexp: %s" % (e,), file=sys.stderr)
     sys.exit(1)
 
 for pkg in deb822.Packages.iter_paragraphs(file('/var/lib/dpkg/status')):
     if pkg.has_key('Maintainer') and maint_RE.search(pkg['maintainer']):
-        print pkg['package']
+        print(pkg['package'])
 
diff --git a/examples/deb822/grep_native_packages.py b/examples/deb822/grep_native_packages.py
index ab8c5cc..9a3db74 100755
--- a/examples/deb822/grep_native_packages.py
+++ b/examples/deb822/grep_native_packages.py
@@ -17,6 +17,6 @@ for fname in sys.argv[1:]:
     for stanza in deb822.Sources.iter_paragraphs(f):
         pieces = stanza['version'].split('-')
         if len(pieces) < 2:
-            print stanza['package']
+            print(stanza['package'])
     f.close()
 
diff --git a/examples/deb822/render-dctrl b/examples/deb822/render-dctrl
index 081bd4e..a599339 100755
--- a/examples/deb822/render-dctrl
+++ b/examples/deb822/render-dctrl
@@ -93,7 +93,7 @@ def get_indent(s):
         return 0
 
 def render_longdesc(lines):
-    print '<div class="longdesc">'
+    print('<div class="longdesc">')
     lines = map(lambda s: s[1:], lines)	# strip 822 heading space
     curpara, paragraphs = [], []
     inlist, listindent = False, 0
@@ -127,40 +127,40 @@ def render_longdesc(lines):
         store_para()
 
     for p in paragraphs:	# render paragraphs
-        print markdown(p)
-    print '</div>'
+        print(markdown(p))
+    print('</div>')
 
 def render_field(field, val):
     field = field.lower()
-    print '<dt>%s</dt>' % field
-    print '<dd class="%s">' % field
+    print('<dt>%s</dt>' % field)
+    print('<dd class="%s">' % field)
     if field == 'description':
         lines = val.split('\n')
-        print '<span class="shortdesc">%s</span>' % lines[0]
+        print('<span class="shortdesc">%s</span>' % lines[0])
         render_longdesc(lines[1:])
     elif field == 'package':
-        print '<a href="#%s" class="uid">id</a>' % val
-        print '<span id="%s" class="package">%s</span>' % (val, val)
+        print('<a href="#%s" class="uid">id</a>' % val)
+        print('<span id="%s" class="package">%s</span>' % (val, val))
     elif field in []:	# fields not to be typeset as "raw"
-        print '<span class="%s">%s</span>' % (field, val)
+        print('<span class="%s">%s</span>' % (field, val))
     else:
-        print '<span class="raw">%s</span>' % val
-    print '</dd>'
+        print('<span class="raw">%s</span>' % val)
+    print('</dd>')
 
 def render_file(f):
     global options, html_header, html_trailer
 
     if options.print_header:
-        print html_header
+        print(html_header)
     for pkg in deb822.Packages.iter_paragraphs(f):
-        print '<div class="package">'
-        print '<dl class="fields">'
+        print('<div class="package">')
+        print('<dl class="fields">')
         for (field, val) in pkg.iteritems():
             render_field(field, val)
-        print '</dl>'
-        print '</div>\n'
+        print('</dl>')
+        print('</div>\n')
     if options.print_header:
-        print html_trailer
+        print(html_trailer)
 
 def main():
     global options, usage
diff --git a/examples/debfile/ar b/examples/debfile/ar
index ee99140..9dc5260 100755
--- a/examples/debfile/ar
+++ b/examples/debfile/ar
@@ -23,18 +23,18 @@ from debian import arfile
 
 if __name__ == '__main__':
     if len(sys.argv) < 3:
-        print "usage: arfile.py [tp] <arfile>"
+        print("usage: arfile.py [tp] <arfile>")
         sys.exit(1)
     
     if not os.path.exists(sys.argv[2]):
-        print "please provide a file to operate on"
+        print("please provide a file to operate on")
         sys.exit(1)
         
     a = arfile.ArFile(sys.argv[2])
 
     if sys.argv[1] == 't':
-        print "\n".join(a.getnames())
+        print("\n".join(a.getnames()))
     elif sys.argv[1] == 'p':
         for m in a.getmembers():
-            #print "".join(m.readlines())
+            #print("".join(m.readlines()))
             sys.stdout.write("".join(m.readlines()))
diff --git a/examples/debfile/changelog_head b/examples/debfile/changelog_head
index 4410f1c..c8a5a4d 100755
--- a/examples/debfile/changelog_head
+++ b/examples/debfile/changelog_head
@@ -18,8 +18,8 @@ from debian import debfile
 
 if __name__ == '__main__':
     if len(sys.argv) > 3 or len(sys.argv) < 2:
-        print "Usage: changelog_head DEB [ENTRIES]"
-        print "  ENTRIES defaults to 10"
+        print("Usage: changelog_head DEB [ENTRIES]")
+        print("  ENTRIES defaults to 10")
         sys.exit(1)
 
     entries = 10
@@ -31,5 +31,5 @@ if __name__ == '__main__':
     deb = debfile.DebFile(sys.argv[1])
     chg = deb.changelog()
     entries = chg._blocks[:entries]
-    print string.join(map(str, entries), '')
+    print(string.join(map(str, entries), ''))
 
diff --git a/examples/debfile/dpkg-info b/examples/debfile/dpkg-info
index b060bf4..3921a72 100755
--- a/examples/debfile/dpkg-info
+++ b/examples/debfile/dpkg-info
@@ -21,15 +21,15 @@ from debian import debfile
 
 if __name__ == '__main__':
     if len(sys.argv) != 2:
-        print "Usage: dpkg-info DEB"
+        print("Usage: dpkg-info DEB")
         sys.exit(1)
     fname = sys.argv[1]
 
     deb = debfile.DebFile(fname)
     if deb.version == '2.0':
-        print ' new debian package, version %s.' % deb.version
-    print ' size %d bytes: control archive= %d bytes.' % (
-            os.stat(fname)[stat.ST_SIZE], deb['control.tar.gz'].size)
+        print(' new debian package, version %s.' % deb.version)
+    print(' size %d bytes: control archive= %d bytes.' % (
+            os.stat(fname)[stat.ST_SIZE], deb['control.tar.gz'].size))
     for fname in deb.control:   # print info about control part contents
         content = deb.control[fname]
         if not content:
@@ -41,14 +41,14 @@ if __name__ == '__main__':
                 ftype = lines[0].split()[0]
         except IndexError:
             pass
-        print '  %d bytes, %d lines, %s, %s' % (len(content), len(lines),
-                fname, ftype)
+        print('  %d bytes, %d lines, %s, %s' % (len(content), len(lines),
+                fname, ftype))
     for n, v in deb.debcontrol().iteritems(): # print DEBIAN/control fields
         if n.lower() == 'description':  # increase indentation of long dsc
             lines = v.split('\n')
             short_dsc = lines[0]
             long_dsc = string.join(map(lambda l: ' ' + l, lines[1:]), '\n')
-            print ' %s: %s\n%s' % (n, short_dsc, long_dsc)
+            print(' %s: %s\n%s' % (n, short_dsc, long_dsc))
         else:
-            print ' %s: %s' % (n, v)
+            print(' %s: %s' % (n, v))
 
diff --git a/examples/debfile/extract_cron b/examples/debfile/extract_cron
index 2de005b..0b0ef49 100755
--- a/examples/debfile/extract_cron
+++ b/examples/debfile/extract_cron
@@ -21,14 +21,14 @@ def is_cron(fname):
 
 if __name__ == '__main__':
     if not sys.argv[1:]:
-        print "Usage: extract_cron DEB ..."
+        print("Usage: extract_cron DEB ...")
         sys.exit(1)
 
     for fname in sys.argv[1:]:
         deb = debfile.DebFile(fname)
         cron_files = filter(is_cron, list(deb.data))
         for cron_file in cron_files:
-            print 'Extracting cron-related file %s ...' % cron_file
+            print('Extracting cron-related file %s ...' % cron_file)
             path = os.path.join('.', cron_file)
             dir = os.path.dirname(path)
             if not os.path.exists(dir):
diff --git a/examples/debtags/pkgwalk b/examples/debtags/pkgwalk
index ed02aa4..b6cb8bf 100755
--- a/examples/debtags/pkgwalk
+++ b/examples/debtags/pkgwalk
@@ -99,7 +99,7 @@ if __name__ == '__main__':
 		for num, pkg in enumerate(display):
 			aptpkg = apt_cache[pkg]
 			desc = aptpkg.raw_description.split("\n")[0]
-			print "%2d) %s - %s" % (num + 1, pkg, desc)
+			print("%2d) %s - %s" % (num + 1, pkg, desc))
 
 		# Ask the user to choose a new package
 		while True:
@@ -115,7 +115,7 @@ if __name__ == '__main__':
 					trail = [display[num]] + trail[:maxlen]
 					break
 				else:
-					print "The number is too high"
+					print("The number is too high")
 
 
 # vim:set ts=4 sw=4:
diff --git a/examples/debtags/reverse b/examples/debtags/reverse
index b741f81..60c46fe 100755
--- a/examples/debtags/reverse
+++ b/examples/debtags/reverse
@@ -18,6 +18,8 @@
 # along with this program; if not, write to the Free Software
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 
+from __future__ import print_function
+
 import sys
 from debian import debtags
 
@@ -33,4 +35,4 @@ db = debtags.read_tag_database_reversed(input)
 for pkg, tags in db.items():
 	# Using % here seems awkward to me, but if I use calls to
 	# sys.stdout.write it becomes a bit slower
-	print "%s:" % (pkg), ", ".join(tags)
+	print("%s:" % (pkg), ", ".join(tags))
diff --git a/examples/debtags/smartsearch b/examples/debtags/smartsearch
index ec298cf..5873aa3 100755
--- a/examples/debtags/smartsearch
+++ b/examples/debtags/smartsearch
@@ -18,6 +18,8 @@
 # along with this program; if not, write to the Free Software
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 
+from __future__ import print_function
+
 import sys
 import re
 import subprocess
@@ -49,7 +51,7 @@ class SmartSearcher:
         pkgs = []
         for pkg in input.stdout:
             pkg, none = pkg.rstrip("\n").split(' - ', 1)
-            #print pkg
+            #print(pkg)
             pkgs.append(pkg)
 
         subcoll = self.fullcoll.choose_packages(pkgs)
@@ -74,15 +76,15 @@ class SmartSearcher:
     def show_set(self, tags, type):
         for tag in tags:
             self.tags_in_menu.append(tag)
-            print "%d) %s (%s)" % (len(self.tags_in_menu), tag, type)
+            print("%d) %s (%s)" % (len(self.tags_in_menu), tag, type))
 
     def show_choice_sequence(self, seq, max = 7):
         for tag in seq:
             if tag in self.wanted or tag in self.unwanted or tag in self.ignored:
                 continue
             self.tags_in_menu.append(tag)
-            print "%d) %s (%d/%d)" % \
-                (len(self.tags_in_menu), tag, self.subcoll.card(tag), self.subcoll.package_count())
+            print("%d) %s (%d/%d)" % \
+                (len(self.tags_in_menu), tag, self.subcoll.card(tag), self.subcoll.package_count()))
             max = max - 1
             if max == 0: break
 
@@ -103,14 +105,14 @@ class SmartSearcher:
         for pkg in self.subcoll.iter_packages():
             aptpkg = self.apt_cache[pkg]
             desc = aptpkg.raw_description.split("\n")[0]
-            print pkg, "-", desc
+            print(pkg, "-", desc)
 
     def interact(self):
         done = False
         while not done:
-            print "Tag selection:"
+            print("Tag selection:")
             self.show_tags()
-            print self.subcoll.package_count(), " packages selected so far."
+            print(self.subcoll.package_count(), " packages selected so far.")
 
             changed = False
 
@@ -123,19 +125,19 @@ class SmartSearcher:
             # If we're setting a new keyword search, process now and skip
             # processing as a list
             if ans == "?":
-                print "+ number  select the tag with the given number as a tag you want"
-                print "- number  select the tag with the given number as a tag you do not want"
-                print "= number  select the tag with the given number as a tag you don't care about"
-                print "K word    recompute the set of interesting tags from a full-text search using the given word"
-                print "V         view the packages selected so far"
-                print "D         print the packages selected so far and exit"
-                print "Q         quit debtags smart search"
-                print "?         print this help information"
+                print("+ number  select the tag with the given number as a tag you want")
+                print("- number  select the tag with the given number as a tag you do not want")
+                print("= number  select the tag with the given number as a tag you don't care about")
+                print("K word    recompute the set of interesting tags from a full-text search using the given word")
+                print("V         view the packages selected so far")
+                print("D         print the packages selected so far and exit")
+                print("Q         quit debtags smart search")
+                print("?         print this help information")
             elif ans[0] == 'k' or ans[0] == 'K':
                 # Strip initial command and empty spaces
                 ans = ans[1:].strip();
                 if len(ans) == 0:
-                    print "The 'k' command needs a keyword to use for finding new interesting tags."
+                    print("The 'k' command needs a keyword to use for finding new interesting tags.")
                 else:
                     self.compute_interesting(ans)
                 ans = ''
@@ -146,10 +148,10 @@ class SmartSearcher:
                         try:
                             idx = int(cmd[1:])
                         except ValueError:
-                            print cmd, "should have a number after +, - or ="
+                            print(cmd, "should have a number after +, - or =")
                             continue
                         if idx > len(self.tags_in_menu):
-                            print "Tag", idx, "was not on the menu."
+                            print("Tag", idx, "was not on the menu.")
                         else:
                             tag = self.tags_in_menu[idx - 1]
                             # cout << "Understood " << ans << " as " << ans[0] << tag.fullname() << endl;
@@ -175,13 +177,13 @@ class SmartSearcher:
                     elif cmd == "Q" or cmd == "q":
                         done = True;
                     else:
-                        print "Ignoring command \"%s\"" % (cmd)
+                        print("Ignoring command \"%s\"" % (cmd))
             if changed:
                 self.refilter()
 
 
 if len(sys.argv) < 3:
-    print sys.stderr, "Usage: %s tagdb keywords..." % (sys.argv[0])
+    print("Usage: %s tagdb keywords..." % (sys.argv[0]), file=sys.stderr)
     sys.exit(1)
 
 
diff --git a/examples/debtags/tagminer b/examples/debtags/tagminer
index ced2876..d5c2e97 100755
--- a/examples/debtags/tagminer
+++ b/examples/debtags/tagminer
@@ -9,6 +9,8 @@
 
 # Given a file, search Debian packages that can somehow handle it
 
+from __future__ import print_function
+
 import sys
 
 # Requires python-extractor, python-magic and python-debtags
@@ -125,7 +127,7 @@ if __name__ == '__main__':
     fullcoll.read(open(options.tagdb, "r"), lambda x: not tag_filter.match(x))
 
     type = mimetype(args[0])
-    #print >>sys.stderr, "Mime type:", type
+    #print("Mime type:", type, file=sys.stderr)
     found = set()
     for match, tags in mime_map:
         match = re.compile(match)
@@ -133,26 +135,26 @@ if __name__ == '__main__':
             for t in tags:
                 found.add(t)
     if len(found) == 0:
-        print >>sys.stderr, "Unhandled mime type:", type
+        print("Unhandled mime type:", type, file=sys.stderr)
     else:
         if options.action != None:
             apt_cache = apt.Cache()
             query = found.copy()
             query.add("role::program")
             query.add("use::"+options.action)
-            print "Debtags query:", " && ".join(query)
+            print("Debtags query:", " && ".join(query))
             subcoll = fullcoll.filter_packages_tags(lambda pt: query.issubset(pt[1]))
             for i in subcoll.iter_packages():
                 aptpkg = apt_cache[i]
                 desc = aptpkg.raw_description.split("\n")[0]
-                print i, "-", desc
+                print(i, "-", desc)
         else:
-            print "Debtags query:", " && ".join(found)
+            print("Debtags query:", " && ".join(found))
 
             query = found.copy()
             query.add("role::program")
             subcoll = fullcoll.filter_packages_tags(lambda pt: query.issubset(pt[1]))
             uses = map(lambda x:x[5:], filter(lambda x:x.startswith("use::"), subcoll.iter_tags()))
-            print "Available actions:", ", ".join(uses)
+            print("Available actions:", ", ".join(uses))
 
 # vim:set ts=4 sw=4:
diff --git a/examples/debtags/tagsbyrelevance b/examples/debtags/tagsbyrelevance
index 2633c12..d5ef8c2 100644
--- a/examples/debtags/tagsbyrelevance
+++ b/examples/debtags/tagsbyrelevance
@@ -18,12 +18,14 @@
 # along with this program; if not, write to the Free Software
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 
+from __future__ import print_function
+
 import sys
 import re
 from debian import debtags
 
 if len(sys.argv) < 2:
-	print sys.stderr, "Usage: %s tagdb [packagelist]" % (sys.argv[0])
+	print("Usage: %s tagdb [packagelist]" % (sys.argv[0]), file=sys.stderr)
 	sys.exit(1)
 
 full = debtags.DB()
@@ -48,5 +50,5 @@ tags = sorted(sub.iter_tags(), lambda a, b: cmp(rel_index(a), rel_index(b)))
 
 ## And finally print them
 for tag in tags:
-	print tag
-	#print tag, sub.card(tag), full.card(tag), float(sub.card(tag)) / float(full.card(tag))
+	print(tag)
+	#print(tag, sub.card(tag), full.card(tag), float(sub.card(tag)) / float(full.card(tag)))
diff --git a/examples/debtags/wxssearch b/examples/debtags/wxssearch
index a0aeb37..79ce0a1 100755
--- a/examples/debtags/wxssearch
+++ b/examples/debtags/wxssearch
@@ -18,6 +18,8 @@
 # along with this program; if not, write to the Free Software
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 
+from __future__ import print_function
+
 import wx
 import wx.html
 import wx.lib.dialogs
@@ -73,9 +75,9 @@ class Model(wx.EvtHandler):
         # Compute the most interesting tags by discriminance
         self.discriminant = sorted(self.subcoll.iter_tags(), \
                   lambda b, a: cmp(self.subcoll.discriminance(a), self.subcoll.discriminance(b)))
-        #print "-----------------------------"
+        #print("-----------------------------")
         #for d in self.discriminant:
-        #    print d, self.subcoll.discriminance(d)
+        #    print(d, self.subcoll.discriminance(d))
 
         # Notify the change
         e = Model.ModelEvent(Model.wxEVT_CHANGED)
@@ -168,7 +170,7 @@ class TagList(wx.html.HtmlWindow):
         self.ProcessEvent(e)
 
     def model_changed(self, event):
-        #print "TLMC"
+        #print("TLMC")
         self.SetPage("<html><body>")
         first = True
 
@@ -381,12 +383,12 @@ class SearchWindow(wx.Frame):
         if action == 'add':
             self.model.add_wanted(tag)
         elif action == 'addnot':
-            print "wanted_event -> addnot"
+            print("wanted_event -> addnot")
             self.model.add_unwanted(tag)
         elif action == 'del':
             self.model.remove_tag_from_filter(tag)
         else:
-            print "Unknown action", action
+            print("Unknown action", action)
 
     def go_button_pressed(self, event):
         self.model.set_query(self.query.GetValue())
-- 
1.7.8.3

From 705f04c022357b4fa1d3cd61d274236dfef2f431 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 20 Jan 2012 17:56:55 +0000
Subject: [PATCH 08/31] Use "key in dict" rather than obsolete
 "dict.has_key(key)".

---
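Note for reviewers: "key in mapping" dispatches to __contains__, so a
mapping with special key handling keeps that behaviour only if it
defines __contains__ itself.  The sketch below shows the idea with a
hypothetical CaseInsensitiveDict; it is an illustrative stand-in and
not the actual Deb822Dict implementation.  group_pairs() mirrors the
loop touched in tagminer.

```python
class CaseInsensitiveDict(dict):
    """Hypothetical stand-in for a mapping with case-insensitive keys."""

    def __setitem__(self, key, value):
        dict.__setitem__(self, key.lower(), value)

    def __getitem__(self, key):
        return dict.__getitem__(self, key.lower())

    def __contains__(self, key):
        # "key in d" and "key not in d" both end up here.
        return dict.__contains__(self, key.lower())


def group_pairs(pairs):
    # Collect values per key, using "in" where has_key() was used before.
    grouped = {}
    for key, value in pairs:
        if key in grouped:
            grouped[key].append(value)
        else:
            grouped[key] = [value]
    return grouped
```

The "not d.has_key(k)" form in the test suite becomes "k not in d",
which goes through the same __contains__ method.
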
 examples/deb822/grep-maintainer |    2 +-
 examples/debtags/tagminer       |    4 ++--
 tests/test_deb822.py            |    2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/examples/deb822/grep-maintainer b/examples/deb822/grep-maintainer
index 358bf7e..2775dda 100755
--- a/examples/deb822/grep-maintainer
+++ b/examples/deb822/grep-maintainer
@@ -26,6 +26,6 @@ except re.error as e:
     sys.exit(1)
 
 for pkg in deb822.Packages.iter_paragraphs(file('/var/lib/dpkg/status')):
-    if pkg.has_key('Maintainer') and maint_RE.search(pkg['maintainer']):
+    if 'Maintainer' in pkg and maint_RE.search(pkg['maintainer']):
         print(pkg['package'])
 
diff --git a/examples/debtags/tagminer b/examples/debtags/tagminer
index d5c2e97..ade8be6 100755
--- a/examples/debtags/tagminer
+++ b/examples/debtags/tagminer
@@ -91,13 +91,13 @@ def mimetype(fname):
     keys = extractor.extract(fname)
     xkeys = {}
     for k, v in keys:
-        if xkeys.has_key(k):
+        if k in xkeys:
             xkeys[k].append(v)
         else:
             xkeys[k] = [v]
     namemagic =  magic.file(fname)
     contentmagic = magic.buffer(file(fname, "r").read(4096))
-    return xkeys.has_key("mimetype") and xkeys['mimetype'][0] or contentmagic or namemagic
+    return "mimetype" in xkeys and xkeys['mimetype'][0] or contentmagic or namemagic
 
 
 class Parser(OptionParser):
diff --git a/tests/test_deb822.py b/tests/test_deb822.py
index 39249b0..1e6387f 100755
--- a/tests/test_deb822.py
+++ b/tests/test_deb822.py
@@ -671,7 +671,7 @@ Description: python modules to work with Debian-related data formats
         self.assertEqual(input2, d2.dump())
 
         d3 = deb822.Deb822()
-        if not d3.has_key('some-test-key'):
+        if 'some-test-key' not in d3:
             d3['Some-Test-Key'] = 'some value'
         self.assertEqual(d3.dump(), "Some-Test-Key: some value\n")
 
-- 
1.7.8.3

From 509e1b2ee2fd434beb357b3f307c3699ebe36b67 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 20 Jan 2012 18:03:00 +0000
Subject: [PATCH 09/31] Use open() rather than file(); file() does not exist
 in Python 3.

---
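Note for reviewers: most call sites simply swap file() for open().
Where the handle was never closed (the README samples and
GpgInfo.from_file), the patch wraps the call in a "with" block so the
file is closed once it has been read.  A minimal sketch of that
pattern, using a hypothetical helper first_stanza() that is not part
of python-debian:

```python
def first_stanza(path):
    # Read lines up to the first blank line; the "with" block closes the
    # file even if an exception is raised while reading.
    lines = []
    with open(path) as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                break
            lines.append(line)
    return lines
```

Returning from inside the "with" block is safe here because everything
needed has been read before the file is closed.
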
 README.deb822                           |   12 ++++++------
 examples/deb822/depgraph                |    2 +-
 examples/deb822/grep-maintainer         |    2 +-
 examples/deb822/grep_native_packages.py |    2 +-
 examples/debfile/extract_cron           |    2 +-
 examples/debtags/tagminer               |    2 +-
 lib/debian/deb822.py                    |    7 ++++---
 lib/debian/debian_support.py            |    6 +++---
 tests/test_deb822.py                    |   26 +++++++++++++-------------
 9 files changed, 31 insertions(+), 30 deletions(-)

diff --git a/README.deb822 b/README.deb822
index 104ec0f..89e7d64 100644
--- a/README.deb822
+++ b/README.deb822
@@ -47,7 +47,7 @@ Here is a list of the types deb822 knows about:
 Input
 =====
 
-Deb822 objects are normally initialized from a file() object, from which
+Deb822 objects are normally initialized from a file object, from which
 at most one paragraph is read, or a string.
 
 Alternatively, any sequence that returns one line of input at a time may
@@ -63,10 +63,9 @@ All classes provide an "iter_paragraphs" class method to easily go over
 each stanza in a file with multiple entries, like a Packages.gz file.
 For example:
 
-    f = file('/mirror/debian/dists/sid/main/binary-i386/Sources') 
-
-    for src in Sources.iter_paragraphs(f):
-	print src['Package'], src['Version']
+    with open('/mirror/debian/dists/sid/main/binary-i386/Sources') as f:
+        for src in Sources.iter_paragraphs(f):
+            print src['Package'], src['Version']
 
 This method uses python-apt if available to parse the file, since it
 significantly boosts performance. The downside, though, is that yielded
@@ -80,7 +79,8 @@ Sample usage (TODO: Improve)
 
    import deb822 
 
-   d = deb822.dsc(file('foo_1.1.dsc'))
+   with open('foo_1.1.dsc') as f:
+       d = deb822.Dsc(f)
    source = d['Source']
    version = d['Version']
 
diff --git a/examples/deb822/depgraph b/examples/deb822/depgraph
index 388d564..4379140 100755
--- a/examples/deb822/depgraph
+++ b/examples/deb822/depgraph
@@ -37,7 +37,7 @@ def main():
         print('  "%s" [label="%s"] ;' % (node, dsc))
 
     print("digraph depgraph {")
-    for pkg in deb822.Packages.iter_paragraphs(file(sys.argv[1])):
+    for pkg in deb822.Packages.iter_paragraphs(open(sys.argv[1])):
         name = pkg['package']
         rels = pkg.relations
         for deps in rels['depends']:
diff --git a/examples/deb822/grep-maintainer b/examples/deb822/grep-maintainer
index 2775dda..b752e2c 100755
--- a/examples/deb822/grep-maintainer
+++ b/examples/deb822/grep-maintainer
@@ -25,7 +25,7 @@ except re.error as e:
     print("Error in the regexp: %s" % (e,), file=sys.stderr)
     sys.exit(1)
 
-for pkg in deb822.Packages.iter_paragraphs(file('/var/lib/dpkg/status')):
+for pkg in deb822.Packages.iter_paragraphs(open('/var/lib/dpkg/status')):
     if 'Maintainer' in pkg and maint_RE.search(pkg['maintainer']):
         print(pkg['package'])
 
diff --git a/examples/deb822/grep_native_packages.py b/examples/deb822/grep_native_packages.py
index 9a3db74..7d478fb 100755
--- a/examples/deb822/grep_native_packages.py
+++ b/examples/deb822/grep_native_packages.py
@@ -13,7 +13,7 @@ import sys
 from debian import deb822
 
 for fname in sys.argv[1:]:
-    f = file(fname)
+    f = open(fname)
     for stanza in deb822.Sources.iter_paragraphs(f):
         pieces = stanza['version'].split('-')
         if len(pieces) < 2:
diff --git a/examples/debfile/extract_cron b/examples/debfile/extract_cron
index 0b0ef49..88f0631 100755
--- a/examples/debfile/extract_cron
+++ b/examples/debfile/extract_cron
@@ -33,7 +33,7 @@ if __name__ == '__main__':
             dir = os.path.dirname(path)
             if not os.path.exists(dir):
                 os.mkdir(dir)
-            out = file(path, 'w')
+            out = open(path, 'w')
             out.write(deb.data.get_content(cron_file))
             out.close()
 
diff --git a/examples/debtags/tagminer b/examples/debtags/tagminer
index ade8be6..470fa33 100755
--- a/examples/debtags/tagminer
+++ b/examples/debtags/tagminer
@@ -96,7 +96,7 @@ def mimetype(fname):
         else:
             xkeys[k] = [v]
     namemagic =  magic.file(fname)
-    contentmagic = magic.buffer(file(fname, "r").read(4096))
+    contentmagic = magic.buffer(open(fname, "r").read(4096))
     return "mimetype" in xkeys and xkeys['mimetype'][0] or contentmagic or namemagic
 
 
diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index 6db213a..bbadcb3 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -265,7 +265,7 @@ class Deb822(Deb822Dict):
         """Create a new Deb822 instance.
 
         :param sequence: a string, or any any object that returns a line of
-            input each time, normally a file().  Alternately, sequence can
+            input each time, normally a file.  Alternately, sequence can
             be a dict that contains the initial key-value pairs.
 
         :param fields: if given, it is interpreted as a list of fields that
@@ -302,7 +302,7 @@ class Deb822(Deb822Dict):
 
         :param fields: likewise.
 
-        :param use_apt_pkg: if sequence is a file(), apt_pkg will be used 
+        :param use_apt_pkg: if sequence is a file, apt_pkg will be used
             if available to parse the file, since it's much much faster.  Set
             this parameter to False to disable using apt_pkg.
         :param shared_storage: not used, here for historical reasons.  Deb822
@@ -774,7 +774,8 @@ class GpgInfo(dict):
 
         See GpgInfo.from_sequence.
         """
-        return cls.from_sequence(file(target), *args, **kwargs)
+        with open(target) as target_file:
+            return cls.from_sequence(target_file, *args, **kwargs)
 
 
 class PkgRelation(object):
diff --git a/lib/debian/debian_support.py b/lib/debian/debian_support.py
index f8dc1c1..864f016 100644
--- a/lib/debian/debian_support.py
+++ b/lib/debian/debian_support.py
@@ -284,7 +284,7 @@ class PackageFile:
                   file with the indicated name.
         """
         if file_obj is None:
-            file_obj = file(name)
+            file_obj = open(name)
         self.name = name
         self.file = file_obj
         self.lineno = 0
@@ -438,7 +438,7 @@ def replace_file(lines, local):
     import os.path
 
     local_new = local + '.new'
-    new_file = file(local_new, 'w+')
+    new_file = open(local_new, 'w+')
 
     try:
         for l in lines:
@@ -496,7 +496,7 @@ def update_file(remote, local, verbose=None):
     """
 
     try:
-        local_file = file(local)
+        local_file = open(local)
     except IOError:
         if verbose:
             print("update_file: no local copy, downloading full file")
diff --git a/tests/test_deb822.py b/tests/test_deb822.py
index 1e6387f..0ad5e75 100755
--- a/tests/test_deb822.py
+++ b/tests/test_deb822.py
@@ -686,11 +686,11 @@ Description: python modules to work with Debian-related data formats
         objects = []
         objects.append(deb822.Deb822(UNPARSED_PACKAGE))
         objects.append(deb822.Deb822(CHANGES_FILE))
-        objects.extend(deb822.Deb822.iter_paragraphs(file('test_Packages')))
-        objects.extend(deb822.Packages.iter_paragraphs(file('test_Packages')))
-        objects.extend(deb822.Deb822.iter_paragraphs(file('test_Sources')))
+        objects.extend(deb822.Deb822.iter_paragraphs(open('test_Packages')))
+        objects.extend(deb822.Packages.iter_paragraphs(open('test_Packages')))
+        objects.extend(deb822.Deb822.iter_paragraphs(open('test_Sources')))
         objects.extend(deb822.Deb822.iter_paragraphs(
-                         file('test_Sources.iso8859-1'), encoding="iso8859-1"))
+                         open('test_Sources.iso8859-1'), encoding="iso8859-1"))
         for d in objects:
             for value in d.values():
                 self.assert_(isinstance(value, unicode))
@@ -701,16 +701,16 @@ Description: python modules to work with Debian-related data formats
         multi.append(deb822.Changes(CHANGES_FILE))
         multi.append(deb822.Changes(SIGNED_CHECKSUM_CHANGES_FILE
                                     % CHECKSUM_CHANGES_FILE))
-        multi.extend(deb822.Sources.iter_paragraphs(file('test_Sources')))
+        multi.extend(deb822.Sources.iter_paragraphs(open('test_Sources')))
         for d in multi:
             for key, value in d.items():
                 if key.lower() not in d.__class__._multivalued_fields:
                     self.assert_(isinstance(value, unicode))
 
     def test_encoding_integrity(self):
-        utf8 = list(deb822.Deb822.iter_paragraphs(file('test_Sources')))
+        utf8 = list(deb822.Deb822.iter_paragraphs(open('test_Sources')))
         latin1 = list(deb822.Deb822.iter_paragraphs(
-                                                file('test_Sources.iso8859-1'),
+                                                open('test_Sources.iso8859-1'),
                                                 encoding='iso8859-1'))
 
         # dump() with no fd returns a unicode object - both should be identical
@@ -721,9 +721,9 @@ Description: python modules to work with Debian-related data formats
         # XXX: The way multiline fields parsing works, we can't guarantee
         # that trailing whitespace is reproduced.
         utf8_contents = "\n".join([line.rstrip() for line in
-                                   file('test_Sources')] + [''])
+                                   open('test_Sources')] + [''])
         latin1_contents = "\n".join([line.rstrip() for line in
-                                     file('test_Sources.iso8859-1')] + [''])
+                                     open('test_Sources.iso8859-1')] + [''])
 
         utf8_to_latin1 = StringIO()
         for d in utf8:
@@ -751,8 +751,8 @@ Description: python modules to work with Debian-related data formats
         warnings.filterwarnings(action='ignore', category=UnicodeWarning)
 
         filename = 'test_Sources.mixed_encoding'
-        for paragraphs in [deb822.Sources.iter_paragraphs(file(filename)),
-                           deb822.Sources.iter_paragraphs(file(filename),
+        for paragraphs in [deb822.Sources.iter_paragraphs(open(filename)),
+                           deb822.Sources.iter_paragraphs(open(filename),
                                                           use_apt_pkg=False)]:
             p1 = paragraphs.next()
             self.assertEqual(p1['maintainer'],
@@ -816,7 +816,7 @@ Description: python modules to work with Debian-related data formats
 class TestPkgRelations(unittest.TestCase):
 
     def test_packages(self):
-        pkgs = deb822.Packages.iter_paragraphs(file('test_Packages'))
+        pkgs = deb822.Packages.iter_paragraphs(open('test_Packages'))
         pkg1 = pkgs.next()
         rel1 = {'breaks': [],
                 'conflicts': [],
@@ -891,7 +891,7 @@ class TestPkgRelations(unittest.TestCase):
                             src_rel)))
 
     def test_sources(self):
-        pkgs = deb822.Sources.iter_paragraphs(file('test_Sources'))
+        pkgs = deb822.Sources.iter_paragraphs(open('test_Sources'))
         pkg1 = pkgs.next()
         rel1 = {'build-conflicts': [],
                 'build-conflicts-indep': [],
-- 
1.7.8.3

>From 7ac64f9393c6abce4cb0e9e8496d36ab4def0913 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 20 Jan 2012 18:07:14 +0000
Subject: [PATCH 10/31] Use sep.join(list) rather than string.join(list, sep).

---
 examples/debfile/changelog_head |    3 +--
 examples/debfile/dpkg-info      |    3 +--
 lib/debian/deb822.py            |    7 +++----
 3 files changed, 5 insertions(+), 8 deletions(-)
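
For reviewers less familiar with the idiom, here is a minimal sketch
of the str.join spelling, using made-up data rather than code from
python-debian.  In Python 2, string.join(words) with no separator
argument joined on a single space, so the portable form has to spell
the separator out.

```python
# Illustrative data only; not taken from python-debian.
arches = ['amd64', 'i386']
alternatives = ['foo (>= 1.0)', 'bar']

# Python 2 only: string.join(arches) joined on a single space by default.
# The portable spelling makes that separator explicit.
arch_field = ' [%s]' % ' '.join(arches)

# Python 2 only: string.join(alternatives, ' | ')
or_group = ' | '.join(alternatives)

print(arch_field)
print(or_group)
```

Note that the changelog_head hunk joins on '' where the old call used
the default single-space separator, so its output differs when more
than one entry is printed.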

diff --git a/examples/debfile/changelog_head b/examples/debfile/changelog_head
index c8a5a4d..8489244 100755
--- a/examples/debfile/changelog_head
+++ b/examples/debfile/changelog_head
@@ -11,7 +11,6 @@
 """Like "head" for changelog entries, return last n-th entries of the changelog
 shipped in a .deb file."""
 
-import string
 import sys
 
 from debian import debfile
@@ -31,5 +30,5 @@ if __name__ == '__main__':
     deb = debfile.DebFile(sys.argv[1])
     chg = deb.changelog()
     entries = chg._blocks[:entries]
-    print(string.join(map(str, entries)))
+    print(''.join(map(str, entries)))
 
diff --git a/examples/debfile/dpkg-info b/examples/debfile/dpkg-info
index 3921a72..ee23970 100755
--- a/examples/debfile/dpkg-info
+++ b/examples/debfile/dpkg-info
@@ -14,7 +14,6 @@ class. """
 
 import os
 import stat
-import string
 import sys
 
 from debian import debfile
@@ -47,7 +46,7 @@ if __name__ == '__main__':
         if n.lower() == 'description':  # increase indentation of long dsc
             lines = v.split('\n')
             short_dsc = lines[0]
-            long_dsc = string.join(map(lambda l: ' ' + l, lines[1:]), '\n')
+            long_dsc = '\n'.join(map(lambda l: ' ' + l, lines[1:]))
             print(' %s: %s\n%s' % (n, short_dsc, long_dsc))
         else:
             print(' %s: %s' % (n, v))
diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index bbadcb3..c9f1c36 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -37,7 +37,6 @@ except (ImportError, AttributeError):
 import chardet
 import os
 import re
-import string
 import subprocess
 import sys
 import warnings
@@ -853,11 +852,11 @@ class PkgRelation(object):
             if dep.get('version') is not None:
                 s += ' (%s %s)' % dep['version']
             if dep.get('arch') is not None:
-                s += ' [%s]' % string.join(map(pp_arch, dep['arch']))
+                s += ' [%s]' % ' '.join(map(pp_arch, dep['arch']))
             return s
 
-        pp_or_dep = lambda deps: string.join(map(pp_atomic_dep, deps), ' | ')
-        return string.join(map(pp_or_dep, rels), ', ')
+        pp_or_dep = lambda deps: ' | '.join(map(pp_atomic_dep, deps))
+        return ', '.join(map(pp_or_dep, rels))
 
 
 class _lowercase_dict(dict):
-- 
1.7.8.3

>From 03ae9d71a454e3594173a37a0fa70d38d3fa89cc Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 20 Jan 2012 18:10:48 +0000
Subject: [PATCH 11/31] Implement rich comparison methods (the only kind
 available in Python 3) rather than __cmp__.

---
 lib/debian/debian_support.py |   70 +++++++++++++++++++++++++++++++++---------
 tests/test_deb822.py         |    2 +-
 2 files changed, 56 insertions(+), 16 deletions(-)
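
The pattern used here can be sketched in isolation as follows.
SimpleVersion and compare are hypothetical names invented for this
illustration; only the shape (one three-way _compare helper feeding
six rich comparison methods) mirrors the patch.

```python
class SimpleVersion(object):
    """Hypothetical stand-in for BaseVersion; wraps a plain integer."""

    def __init__(self, number):
        self.number = number

    def _compare(self, other):
        # Return a negative, zero or positive number, as __cmp__ used to,
        # but without calling the cmp() builtin (absent in Python 3).
        if self.number < other.number:
            return -1
        elif self.number > other.number:
            return 1
        return 0

    def __lt__(self, other):
        return self._compare(other) < 0

    def __le__(self, other):
        return self._compare(other) <= 0

    def __eq__(self, other):
        return self._compare(other) == 0

    def __ne__(self, other):
        return self._compare(other) != 0

    def __ge__(self, other):
        return self._compare(other) >= 0

    def __gt__(self, other):
        return self._compare(other) > 0

    def __hash__(self):
        # Python 3 sets __hash__ to None on a class that defines __eq__
        # without __hash__, so keep an explicit definition.
        return hash(self.number)


def compare(a, b):
    """Three-way comparison built only on rich comparisons."""
    return (a > b) - (a < b)
```

The compare helper is a compact alternative to the if/elif chain used
for version_compare in the patch; both give -1, 0 or 1.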

diff --git a/lib/debian/debian_support.py b/lib/debian/debian_support.py
index 864f016..a51dbd8 100644
--- a/lib/debian/debian_support.py
+++ b/lib/debian/debian_support.py
@@ -156,9 +156,27 @@ class BaseVersion(object):
     def __repr__(self):
         return "%s('%s')" % (self.__class__.__name__, self)
 
-    def __cmp__(self, other):
+    def _compare(self, other):
         raise NotImplementedError
 
+    def __lt__(self, other):
+        return self._compare(other) < 0
+
+    def __le__(self, other):
+        return self._compare(other) <= 0
+
+    def __eq__(self, other):
+        return self._compare(other) == 0
+
+    def __ne__(self, other):
+        return self._compare(other) != 0
+
+    def __ge__(self, other):
+        return self._compare(other) >= 0
+
+    def __gt__(self, other):
+        return self._compare(other) > 0
+
     def __hash__(self):
         return hash(str(self))
 
@@ -171,7 +189,7 @@ class AptPkgVersion(BaseVersion):
                                       "python-apt package")
         super(AptPkgVersion, self).__init__(version)
 
-    def __cmp__(self, other):
+    def _compare(self, other):
         return apt_pkg.version_compare(str(self), str(other))
 
 # NativeVersion based on the DpkgVersion class by Raphael Hertzog in
@@ -184,7 +202,7 @@ class NativeVersion(BaseVersion):
     re_digit = re.compile("\d")
     re_alpha = re.compile("[A-Za-z]")
 
-    def __cmp__(self, other):
+    def _compare(self, other):
         # Convert other into an instance of BaseVersion if it's not already.
         # (All we need is epoch, upstream_version, and debian_revision
         # attributes, which BaseVersion gives us.) Requires other's string
@@ -196,9 +214,12 @@ class NativeVersion(BaseVersion):
                 raise ValueError("Couldn't convert %r to BaseVersion: %s"
                                  % (other, e))
 
-        res = cmp(int(self.epoch or "0"), int(other.epoch or "0"))
-        if res != 0:
-            return res
+        lepoch = int(self.epoch or "0")
+        repoch = int(other.epoch or "0")
+        if lepoch < repoch:
+            return -1
+        elif lepoch > repoch:
+            return 1
         res = self._version_cmp_part(self.upstream_version,
                                      other.upstream_version)
         if res != 0:
@@ -229,9 +250,10 @@ class NativeVersion(BaseVersion):
                 a = la.pop(0)
             if lb:
                 b = lb.pop(0)
-            res = cmp(a, b)
-            if res != 0:
-                return res
+            if a < b:
+                return -1
+            elif a > b:
+                return 1
         return 0
 
     @classmethod
@@ -248,9 +270,10 @@ class NativeVersion(BaseVersion):
             if cls.re_digits.match(a) and cls.re_digits.match(b):
                 a = int(a)
                 b = int(b)
-                res = cmp(a, b)
-                if res != 0:
-                    return res
+                if a < b:
+                    return -1
+                elif a > b:
+                    return 1
             else:
                 res = cls._version_cmp_string(a, b)
                 if res != 0:
@@ -265,7 +288,14 @@ else:
         pass
 
 def version_compare(a, b):
-    return cmp(Version(a), Version(b))
+    va = Version(a)
+    vb = Version(b)
+    if va < vb:
+        return -1
+    elif va > vb:
+        return 1
+    else:
+        return 0
 
 class PackageFile:
     """A Debian package file.
@@ -340,8 +370,18 @@ class PseudoEnum:
         return '%s(%s)'% (self.__class__._name__, repr(name))
     def __str__(self):
         return self._name
-    def __cmp__(self, other):
-        return cmp(self._order, other._order)
+    def __lt__(self, other):
+        return self._order < other._order
+    def __le__(self, other):
+        return self._order <= other._order
+    def __eq__(self, other):
+        return self._order == other._order
+    def __ne__(self, other):
+        return self._order != other._order
+    def __ge__(self, other):
+        return self._order >= other._order
+    def __gt__(self, other):
+        return self._order > other._order
     def __hash__(self):
         return hash(self._order)
 
diff --git a/tests/test_deb822.py b/tests/test_deb822.py
index 0ad5e75..3176f3c 100755
--- a/tests/test_deb822.py
+++ b/tests/test_deb822.py
@@ -336,7 +336,7 @@ class TestDeb822(unittest.TestCase):
 
         for k, v in dict_.items():
             self.assertEqual(v, deb822_[k])
-        self.assertEqual(0, deb822_.__cmp__(dict_))
+        self.assertEqual(deb822_, dict_)
 
     def gen_random_string(length=20):
         from random import choice
-- 
1.7.8.3

>From 444349431f0cf600b50f669a98353b2eb4e4d820 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 20 Jan 2012 18:18:39 +0000
Subject: [PATCH 12/31] Use assertTrue and assertEqual rather than deprecated
 assert_ and assertEquals.

---
 tests/test_changelog.py      |    2 +-
 tests/test_deb822.py         |   31 ++++++++++++++++---------------
 tests/test_debian_support.py |    2 +-
 3 files changed, 18 insertions(+), 17 deletions(-)
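
A minimal sketch of the preferred spellings, using a made-up test case
(ExampleTest and run_example are hypothetical names).  assert_ and
assertEquals are the deprecated aliases; assertTrue and assertEqual
are the names to keep.

```python
import unittest


class ExampleTest(unittest.TestCase):
    """Hypothetical test case, only to show the preferred method names."""

    def test_preferred_names(self):
        # assertTrue replaces the deprecated alias assert_.
        self.assertTrue(isinstance('sarge', str))
        # assertEqual replaces the deprecated alias assertEquals.
        self.assertEqual(sorted(['b', 'a']), ['a', 'b'])


def run_example():
    """Run ExampleTest and return the unittest.TestResult."""
    suite = unittest.TestLoader().loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```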

diff --git a/tests/test_changelog.py b/tests/test_changelog.py
index b1b0067..fe92817 100755
--- a/tests/test_changelog.py
+++ b/tests/test_changelog.py
@@ -201,7 +201,7 @@ haskell-src-exts (1.8.2-2) unstable; urgency=low
  -- Marco Túlio Gontijo e Silva <marcot@debian.org>  Tue, 16 Mar 2010 10:59:48 -0300
 """
         self.assertEqual(u, expected_u)
-        self.assertEquals(str(c), u.encode('utf-8'))
+        self.assertEqual(str(c), u.encode('utf-8'))
 
     def test_unicode_object_input(self):
         f = open('test_changelog_unicode')
diff --git a/tests/test_deb822.py b/tests/test_deb822.py
index 3176f3c..8cf2f7e 100755
--- a/tests/test_deb822.py
+++ b/tests/test_deb822.py
@@ -489,10 +489,10 @@ class TestDeb822(unittest.TestCase):
         wanted_fields = [ 'Package', 'MD5sum', 'Filename', 'Description' ]
         deb822_ = deb822.Deb822(UNPARSED_PACKAGE.splitlines(), wanted_fields)
 
-        self.assertEquals(sorted(wanted_fields), sorted(deb822_.keys()))
+        self.assertEqual(sorted(wanted_fields), sorted(deb822_.keys()))
 
         for key in wanted_fields:
-            self.assertEquals(PARSED_PACKAGE[key], deb822_[key])
+            self.assertEqual(PARSED_PACKAGE[key], deb822_[key])
 
     def test_iter_paragraphs_limit_fields(self):
         wanted_fields = [ 'Package', 'MD5sum', 'Filename', 'Tag' ]
@@ -500,10 +500,10 @@ class TestDeb822(unittest.TestCase):
         for deb822_ in deb822.Deb822.iter_paragraphs(
                 UNPARSED_PACKAGE.splitlines(), wanted_fields):
 
-            self.assertEquals(sorted(wanted_fields), sorted(deb822_.keys()))
+            self.assertEqual(sorted(wanted_fields), sorted(deb822_.keys()))
 
             for key in wanted_fields:
-                self.assertEquals(PARSED_PACKAGE[key], deb822_[key])
+                self.assertEqual(PARSED_PACKAGE[key], deb822_[key])
 
     def test_dont_assume_trailing_newline(self):
         deb822a = deb822.Deb822(['Package: foo'])
@@ -546,7 +546,7 @@ class TestDeb822(unittest.TestCase):
         dict_['Multiline-Field'] = 'a\n b\n c' # XXX should be 'a\nb\nc'?
 
         for k, v in deb822_.items():
-            self.assertEquals(dict_[k], v)
+            self.assertEqual(dict_[k], v)
     
     def test_case_insensitive(self):
         # PARSED_PACKAGE is a deb822.Deb822Dict object, so we can test
@@ -585,9 +585,10 @@ class TestDeb822(unittest.TestCase):
         for cls in deb822.Deb822, deb822.Changes:
             parsed = cls(CHANGES_FILE.splitlines())
             for line in parsed.dump().splitlines():
-                self.assert_(bad_re.match(line) is None,
-                            "There should not be trailing whitespace after the "
-                            "colon in a multiline field starting with a newline")
+                self.assertTrue(bad_re.match(line) is None,
+                                "There should not be trailing whitespace "
+                                "after the colon in a multiline field "
+                                "starting with a newline")
 
         
         control_paragraph = """Package: python-debian
@@ -612,10 +613,10 @@ Description: python modules to work with Debian-related data formats
         field_with_space_re = re.compile(r"^\S+: ")
         for line in parsed_control.dump().splitlines():
             if field_re.match(line):
-                self.assert_(field_with_space_re.match(line),
-                             "Multiline fields that do not start with newline "
-                             "should have a space between the colon and the "
-                             "beginning of the value")
+                self.assertTrue(field_with_space_re.match(line),
+                                "Multiline fields that do not start with "
+                                "newline should have a space between the "
+                                "colon and the beginning of the value")
 
     def test_blank_value(self):
         """Fields with blank values are parsable--so they should be dumpable"""
@@ -641,7 +642,7 @@ Description: python modules to work with Debian-related data formats
         d['Bar'] = 'baz'
         d_copy = d.copy()
 
-        self.assert_(isinstance(d_copy, deb822.Deb822))
+        self.assertTrue(isinstance(d_copy, deb822.Deb822))
         expected_dump = "Foo: bar\nBar: baz\n"
         self.assertEqual(d_copy.dump(), expected_dump)
 
@@ -693,7 +694,7 @@ Description: python modules to work with Debian-related data formats
                          open('test_Sources.iso8859-1'), encoding="iso8859-1"))
         for d in objects:
             for value in d.values():
-                self.assert_(isinstance(value, unicode))
+                self.assertTrue(isinstance(value, unicode))
 
         # The same should be true for Sources and Changes except for their
         # _multivalued fields
@@ -705,7 +706,7 @@ Description: python modules to work with Debian-related data formats
         for d in multi:
             for key, value in d.items():
                 if key.lower() not in d.__class__._multivalued_fields:
-                    self.assert_(isinstance(value, unicode))
+                    self.assertTrue(isinstance(value, unicode))
 
     def test_encoding_integrity(self):
         utf8 = list(deb822.Deb822.iter_paragraphs(open('test_Sources')))
diff --git a/tests/test_debian_support.py b/tests/test_debian_support.py
index ab60ba9..1884fc3 100755
--- a/tests/test_debian_support.py
+++ b/tests/test_debian_support.py
@@ -174,7 +174,7 @@ class ReleaseTests(unittest.TestCase):
     """Tests for debian_support.Release"""
 
     def test_comparison(self):
-        self.assert_(intern_release('sarge') < intern_release('etch'))
+        self.assertTrue(intern_release('sarge') < intern_release('etch'))
 
 
 class HelperRoutineTests(unittest.TestCase):
-- 
1.7.8.3

>From 63e56289cd4829110f7c0a54f16c593203837c2e Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 20 Jan 2012 23:57:39 +0000
Subject: [PATCH 13/31] Try to import pickle if importing cPickle fails. 
 Python 3 only has pickle.

---
 lib/debian/debtags.py |   14 +++++++++-----
 1 files changed, 9 insertions(+), 5 deletions(-)
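
The fallback import can be tried out on its own with the sketch below.
The db and rdb contents are invented for the example, and an in-memory
io.BytesIO stands in for the file that qwrite and qread receive.
Under Python 3 that file has to be opened in binary mode.

```python
import io

try:
    import cPickle as pickle  # Python 2: the C implementation
except ImportError:
    import pickle  # Python 3: the only pickle module

# Illustrative data only; loosely shaped like the debtags DB.
db = {'python-debian': set(['devel::lang:python'])}
rdb = {'devel::lang:python': set(['python-debian'])}

# Two consecutive dumps into one binary stream, read back in the same
# order, as the qwrite/qread pair does.
buf = io.BytesIO()
pickle.dump(db, buf)
pickle.dump(rdb, buf)
buf.seek(0)
db_copy = pickle.load(buf)
rdb_copy = pickle.load(buf)
```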

diff --git a/lib/debian/debtags.py b/lib/debian/debtags.py
index d3df6f7..f13bead 100644
--- a/lib/debian/debtags.py
+++ b/lib/debian/debtags.py
@@ -15,7 +15,11 @@
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
-import re, cPickle
+import re
+try:
+    import cPickle as pickle
+except ImportError:
+    import pickle
 
 from debian.deprecation import function_deprecated_by
 
@@ -154,13 +158,13 @@ class DB:
 
 	def qwrite(self, file):
 		"Quickly write the data to a pickled file"
-		cPickle.dump(self.db, file)
-		cPickle.dump(self.rdb, file)
+		pickle.dump(self.db, file)
+		pickle.dump(self.rdb, file)
 
 	def qread(self, file):
 		"Quickly read the data from a pickled file"
-		self.db = cPickle.load(file)
-		self.rdb = cPickle.load(file)
+		self.db = pickle.load(file)
+		self.rdb = pickle.load(file)
 
 	def insert(self, pkg, tags):
 		self.db[pkg] = tags.copy()
-- 
1.7.8.3

>From 3dd78443bdb3f7ec47997007a2c4681cf0fa2d2a Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 00:02:58 +0000
Subject: [PATCH 14/31] Use io.StringIO if StringIO.StringIO is absent (as in
 Python 3).

---
 lib/debian/deb822.py |   11 +++++++----
 tests/test_deb822.py |    5 ++++-
 2 files changed, 11 insertions(+), 5 deletions(-)
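
As a rough sketch of how the fallback behaves, the hypothetical helper
dump_fields below is loosely modelled on the "fd is None" branch of
Deb822.dump; it is not the real method.  StringIO.StringIO is tried
first because Python 2's io.StringIO accepts only unicode objects.

```python
try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3


def dump_fields(fields, fd=None):
    """Write 'Key: value' lines to fd; return a string if fd is None."""
    return_string = fd is None
    if return_string:
        fd = StringIO()
    for key, value in fields:
        fd.write('%s: %s\n' % (key, value))
    if return_string:
        return fd.getvalue()
```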

diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index c9f1c36..fe6026e 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -41,7 +41,10 @@ import subprocess
 import sys
 import warnings
 
-import StringIO
+try:
+    from StringIO import StringIO
+except ImportError:
+    from io import StringIO
 import UserDict
 
 
@@ -427,7 +430,7 @@ class Deb822(Deb822Dict):
         """
 
         if fd is None:
-            fd = StringIO.StringIO()
+            fd = StringIO()
             return_string = True
         else:
             return_string = False
@@ -997,7 +1000,7 @@ class _multivalued(Deb822):
     def get_as_string(self, key):
         keyl = key.lower()
         if keyl in self._multivalued_fields:
-            fd = StringIO.StringIO()
+            fd = StringIO()
             if hasattr(self[key], 'keys'): # single-line
                 array = [ self[key] ]
             else: # multi-line
@@ -1061,7 +1064,7 @@ class _gpg_multivalued(_multivalued):
                     # Empty input
                     gpg_pre_lines = lines = gpg_post_lines = []
                 if gpg_pre_lines and gpg_post_lines:
-                    raw_text = StringIO.StringIO()
+                    raw_text = StringIO()
                     raw_text.write("\n".join(gpg_pre_lines))
                     raw_text.write("\n\n")
                     raw_text.write("\n".join(lines))
diff --git a/tests/test_deb822.py b/tests/test_deb822.py
index 8cf2f7e..da90bea 100755
--- a/tests/test_deb822.py
+++ b/tests/test_deb822.py
@@ -23,7 +23,10 @@ import sys
 import tempfile
 import unittest
 import warnings
-from StringIO import StringIO
+try:
+    from StringIO import StringIO
+except ImportError:
+    from io import StringIO
 
 sys.path.insert(0, '../lib/')
 
-- 
1.7.8.3

>From fa95f3eeeb28d997c99843f9d67af9b2820b16df Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 00:07:50 +0000
Subject: [PATCH 15/31] Use collections.Mapping/collections.MutableMapping
 instead of UserDict.DictMixin if available.

---
 lib/debian/deb822.py |   47 +++++++++++++++++++++++++++++++++--------------
 1 files changed, 33 insertions(+), 14 deletions(-)
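
The sketch below shows what MutableMapping asks of a subclass: the
five methods __getitem__, __setitem__, __delitem__, __iter__ and
__len__, with keys(), items(), "in" and the rest derived from them.
CaseInsensitiveDict is a hypothetical, much-simplified cousin of
Deb822Dict.  One deliberate difference from the patch: the sketch
tries collections.abc first, since later Python 3 releases moved the
ABCs there and Python 3.10 dropped the aliases in collections.

```python
try:
    from collections.abc import MutableMapping  # Python 3.3 and later
except ImportError:
    from collections import MutableMapping  # Python 2.6 and 2.7


class CaseInsensitiveDict(MutableMapping):
    """Hypothetical, much-simplified cousin of Deb822Dict."""

    def __init__(self):
        self._keys = []    # original spellings, in insertion order
        self._values = {}  # lower-cased key -> value

    def __getitem__(self, key):
        return self._values[key.lower()]

    def __setitem__(self, key, value):
        if key.lower() not in self._values:
            self._keys.append(key)
        self._values[key.lower()] = value

    def __delitem__(self, key):
        del self._values[key.lower()]
        self._keys = [k for k in self._keys if k.lower() != key.lower()]

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)
```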

diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index fe6026e..b3af37f 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -45,14 +45,21 @@ try:
     from StringIO import StringIO
 except ImportError:
     from io import StringIO
-import UserDict
+try:
+    from collections import Mapping, MutableMapping
+    _mapping_mixin = Mapping
+    _mutable_mapping_mixin = MutableMapping
+except ImportError:
+    from UserDict import DictMixin
+    _mapping_mixin = DictMixin
+    _mutable_mapping_mixin = DictMixin
 
 
 GPGV_DEFAULT_KEYRINGS = frozenset(['/usr/share/keyrings/debian-keyring.gpg'])
 GPGV_EXECUTABLE = '/usr/bin/gpgv'
 
 
-class TagSectionWrapper(object, UserDict.DictMixin):
+class TagSectionWrapper(_mapping_mixin, object):
     """Wrap a TagSection object, using its find_raw method to get field values
 
     This allows us to pick which whitespace to strip off the beginning and end
@@ -62,9 +69,14 @@ class TagSectionWrapper(object, UserDict.DictMixin):
     def __init__(self, section):
         self.__section = section
 
-    def keys(self):
-        return [key for key in self.__section.keys()
-                if not key.startswith('#')]
+    def __iter__(self):
+        for key in self.__section.keys():
+            if not key.startswith('#'):
+                yield key
+
+    def __len__(self):
+        return len([key for key in self.__section.keys()
+                    if not key.startswith('#')])
 
     def __getitem__(self, key):
         s = self.__section.find_raw(key)
@@ -111,6 +123,9 @@ class OrderedSet(object):
         # Return an iterator of items in the order they were added
         return iter(self.__order)
 
+    def __len__(self):
+        return len(self.__order)
+
     def __contains__(self, item):
         # This is what makes OrderedSet faster than using a list to keep track
         # of keys.  Lookup in a set is O(1) instead of O(n) for a list.
@@ -125,10 +140,10 @@ class OrderedSet(object):
     ###
 
 
-class Deb822Dict(object, UserDict.DictMixin):
-    # Subclassing UserDict.DictMixin because we're overriding so much dict
-    # functionality that subclassing dict requires overriding many more than
-    # the four methods that DictMixin requires.
+class Deb822Dict(_mutable_mapping_mixin, object):
+    # Subclassing _mutable_mapping_mixin because we're overriding so much
+    # dict functionality that subclassing dict requires overriding many more
+    # than the methods that _mutable_mapping_mixin requires.
     """A dictionary-like object suitable for storing RFC822-like data.
 
     Deb822Dict behaves like a normal dict, except:
@@ -175,7 +190,14 @@ class Deb822Dict(object, UserDict.DictMixin):
             else:
                 self.__keys.extend([ _strI(f) for f in _fields if f in self.__parsed ])
         
-    ### BEGIN DictMixin methods
+    ### BEGIN _mutable_mapping_mixin methods
+
+    def __iter__(self):
+        for key in self.__keys:
+            yield str(key)
+
+    def __len__(self):
+        return len(self.__keys)
 
     def __setitem__(self, key, value):
         key = _strI(key)
@@ -229,10 +251,7 @@ class Deb822Dict(object, UserDict.DictMixin):
         key = _strI(key)
         return key in self.__keys
     
-    def keys(self):
-        return [str(key) for key in self.__keys]
-    
-    ### END DictMixin methods
+    ### END _mutable_mapping_mixin methods
 
     def __repr__(self):
         return '{%s}' % ', '.join(['%r: %r' % (k, v) for k, v in self.items()])
-- 
1.7.8.3

>From 2056f133abc0eb6b790f7bdae61b01103393103b Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 00:12:57 +0000
Subject: [PATCH 16/31] Use list comprehensions instead of map where a list is
 required.  In Python 3, map returns an iterator, not
 a list.

---
 lib/debian/deb822.py         |    2 +-
 tests/test_changelog.py      |    2 +-
 tests/test_debfile.py        |    9 ++++-----
 tests/test_debian_support.py |    2 +-
 4 files changed, 7 insertions(+), 8 deletions(-)
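
To illustrate the difference, here is a small sketch shaped like the
PkgRelation hunk.  The regexes and parse_rel are simplified stand-ins
invented for the example, not the private ones in deb822.py.

```python
import re

# Simplified stand-ins for the private separators in PkgRelation.
comma_sep = re.compile(r'\s*,\s*')
pipe_sep = re.compile(r'\s*\|\s*')


def parse_rel(text):
    # Hypothetical parser: keep only the package name.
    return text.split()[0]


raw = 'libc6 (>= 2.7), python | python3'
tl_deps = comma_sep.split(raw.strip())

# Under Python 3 this is a lazy iterator, not a list; that is fine here
# because it is consumed exactly once below.
cnf = map(pipe_sep.split, tl_deps)

# Nested list comprehension: a real list of lists on Python 2 and 3.
rels = [[parse_rel(or_dep) for or_dep in or_deps] for or_deps in cnf]
```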

diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index b3af37f..c66c49f 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -853,7 +853,7 @@ class PkgRelation(object):
 
         tl_deps = cls.__comma_sep_RE.split(raw.strip()) # top-level deps
         cnf = map(cls.__pipe_sep_RE.split, tl_deps)
-        return map(lambda or_deps: map(parse_rel, or_deps), cnf)
+        return [[parse_rel(or_dep) for or_dep in or_deps] for or_deps in cnf]
 
     @staticmethod
     def str(rels):
diff --git a/tests/test_changelog.py b/tests/test_changelog.py
index fe92817..361b0ac 100755
--- a/tests/test_changelog.py
+++ b/tests/test_changelog.py
@@ -228,7 +228,7 @@ haskell-src-exts (1.8.2-2) unstable; urgency=low
         f = open('test_changelog')
         c = changelog.Changelog(f)
         f.close()
-        self.assertEqual(map(str, c._blocks), map(str, c))
+        self.assertEqual([str(b) for b in c._blocks], [str(b) for b in c])
 
     def test_len(self):
         f = open('test_changelog')
diff --git a/tests/test_debfile.py b/tests/test_debfile.py
index c594cf0..c677275 100755
--- a/tests/test_debfile.py
+++ b/tests/test_debfile.py
@@ -142,11 +142,10 @@ class TestDebFile(unittest.TestCase):
     def test_data_names(self):
         """ test for file list equality """ 
         tgz = self.d.data.tgz()
-        dpkg_names = map(os.path.normpath,
-                [ x.strip() for x in
-                    os.popen("dpkg-deb --fsys-tarfile %s | tar t" %
-                        self.debname).readlines() ])
-        debfile_names = map(os.path.normpath, tgz.getnames())
+        with os.popen("dpkg-deb --fsys-tarfile %s | tar t" %
+                      self.debname) as tar:
+            dpkg_names = [os.path.normpath(x.strip()) for x in tar.readlines()]
+        debfile_names = [os.path.normpath(name) for name in tgz.getnames()]
         
         # skip the root
         self.assertEqual(debfile_names[1:], dpkg_names[1:])
diff --git a/tests/test_debian_support.py b/tests/test_debian_support.py
index 1884fc3..1da7b24 100755
--- a/tests/test_debian_support.py
+++ b/tests/test_debian_support.py
@@ -187,7 +187,7 @@ class HelperRoutineTests(unittest.TestCase):
                          '14293c9bd646a15dc656eaf8fba95124020dfada')
 
     def test_patch_lines(self):
-        file_a = map(lambda x: "%d\n" % x, range(1, 18))
+        file_a = ["%d\n" % x for x in range(1, 18)]
         file_b = ['0\n', '1\n', '<2>\n', '<3>\n', '4\n', '5\n', '7\n', '8\n',
                   '11\n', '12\n', '<13>\n', '14\n', '15\n', 'A\n', 'B\n',
                   'C\n', '16\n', '17\n',]
-- 
1.7.8.3

>From 3ef4053618254c7d1301fca905619ea591a05a84 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 00:16:05 +0000
Subject: [PATCH 17/31] If StandardError does not exist (as in Python 3),
 inherit changelog exception classes from Exception
 instead.

---
 lib/debian/changelog.py |   13 ++++++++++---
 1 files changed, 10 insertions(+), 3 deletions(-)
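
The name lookup can be exercised on its own as below.
ExampleParseError is a hypothetical class shaped like
ChangelogParseError, not the real one.

```python
# StandardError is a builtin in Python 2 only, so looking it up raises
# NameError under Python 3.
try:
    _base_exception_class = StandardError
except NameError:
    _base_exception_class = Exception


class ExampleParseError(_base_exception_class):
    """Hypothetical exception, shaped like ChangelogParseError."""

    def __init__(self, line):
        self._line = line

    def __str__(self):
        return "Could not parse: " + self._line
```

Either way the class remains a subclass of Exception, so existing
"except Exception" handlers keep working.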

diff --git a/lib/debian/changelog.py b/lib/debian/changelog.py
index d6a8f7a..df2ccdf 100644
--- a/lib/debian/changelog.py
+++ b/lib/debian/changelog.py
@@ -31,7 +31,14 @@ import warnings
 
 from debian import debian_support
 
-class ChangelogParseError(StandardError):
+# Python 3 doesn't have StandardError, but let's avoid changing our
+# exception inheritance hierarchy for Python 2.
+try:
+    _base_exception_class = StandardError
+except NameError:
+    _base_exception_class = Exception
+
+class ChangelogParseError(_base_exception_class):
     """Indicates that the changelog could not be parsed"""
     is_user_error = True
 
@@ -41,11 +48,11 @@ class ChangelogParseError(StandardError):
     def __str__(self):
         return "Could not parse changelog: "+self._line
 
-class ChangelogCreateError(StandardError):
+class ChangelogCreateError(_base_exception_class):
     """Indicates that changelog could not be created, as all the information
     required was not given"""
 
-class VersionError(StandardError):
+class VersionError(_base_exception_class):
     """Indicates that the version does not conform to the required format"""
 
     is_user_error = True
-- 
1.7.8.3

>From f64bd58c983619c1e598219bf88c5cd1d62285d2 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 00:22:18 +0000
Subject: [PATCH 18/31] Use six to paper over dict iteration differences
 between Python 2 and 3.

---
 debian/control               |    4 ++--
 examples/deb822/render-dctrl |    3 ++-
 examples/debfile/dpkg-info   |    4 +++-
 examples/debtags/pkgwalk     |    5 +++--
 lib/debian/changelog.py      |    6 ++++--
 lib/debian/deb822.py         |    9 +++++----
 lib/debian/debtags.py        |   22 ++++++++++++----------
 setup.py.in                  |    1 +
 tests/test_deb822.py         |    8 +++++---
 9 files changed, 37 insertions(+), 25 deletions(-)
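
For review, a rough sketch of what the six helpers do for callers. The fallback functions are stand-ins of my own for the case where six is not installed (they only cover Python 3), and the sample dict is made up.

```python
try:
    import six
    iteritems, iterkeys = six.iteritems, six.iterkeys
except ImportError:
    # Hypothetical stand-ins, only valid on Python 3, so that this
    # sketch still runs without six installed.
    def iteritems(d):
        return iter(d.items())

    def iterkeys(d):
        return iter(d.keys())

fields = {"Package": "python-debian", "Architecture": "all"}

# Same call on Python 2 (dict.iteritems) and Python 3 (dict.items).
pairs = sorted(iteritems(fields))
keys = sorted(iterkeys(fields))

# keys() returns a view on Python 3, so the tests wrap it in list()
# before comparing it with a list.
keys_as_list = list(fields.keys())
```

The helpers take the mapping as an argument, which is why every call site changes from d.iterkeys() to six.iterkeys(d).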

diff --git a/debian/control b/debian/control
index 9132d4a..072e3cc 100644
--- a/debian/control
+++ b/debian/control
@@ -8,7 +8,7 @@ Uploaders: Adeodato Simó <dato@net.com.org.es>,
  Reinhard Tartler <siretart@tauware.de>,
  Stefano Zacchiroli <zack@debian.org>,
  John Wright <jsw@debian.org>
-Build-Depends: debhelper (>= 5.0.37.2), python (>= 2.6.6-3~), python-setuptools, python-chardet
+Build-Depends: debhelper (>= 5.0.37.2), python (>= 2.6.6-3~), python-setuptools, python-chardet, python-six
 Standards-Version: 3.8.4
 Vcs-Browser: http://git.debian.org/?p=pkg-python-debian/python-debian.git
 Vcs-Git: git://git.debian.org/git/pkg-python-debian/python-debian.git
@@ -16,7 +16,7 @@ X-Python-Version: >= 2.5
 
 Package: python-debian
 Architecture: all
-Depends: ${python:Depends}, ${misc:Depends}, python-chardet
+Depends: ${python:Depends}, ${misc:Depends}, python-chardet, python-six
 Recommends: python-apt
 Suggests: gpgv
 Provides: python-deb822
diff --git a/examples/deb822/render-dctrl b/examples/deb822/render-dctrl
index a599339..523ce73 100755
--- a/examples/deb822/render-dctrl
+++ b/examples/deb822/render-dctrl
@@ -27,6 +27,7 @@ import sys
 from debian import deb822
 from markdown import markdown
 from optparse import OptionParser
+import six
 
 options = None		# global, for cmdline options
 
@@ -155,7 +156,7 @@ def render_file(f):
     for pkg in deb822.Packages.iter_paragraphs(f):
         print('<div class="package">')
         print('<dl class="fields">')
-        for (field, val) in pkg.iteritems():
+        for (field, val) in six.iteritems(pkg):
             render_field(field, val)
         print('</dl>')
         print('</div>\n')
diff --git a/examples/debfile/dpkg-info b/examples/debfile/dpkg-info
index ee23970..7321022 100755
--- a/examples/debfile/dpkg-info
+++ b/examples/debfile/dpkg-info
@@ -16,6 +16,8 @@ import os
 import stat
 import sys
 
+import six
+
 from debian import debfile
 
 if __name__ == '__main__':
@@ -42,7 +44,7 @@ if __name__ == '__main__':
             pass
         print('  %d bytes, %d lines, %s, %s' % (len(content), len(lines),
                 fname, ftype))
-    for n, v in deb.debcontrol().iteritems(): # print DEBIAN/control fields
+    for n, v in six.iteritems(deb.debcontrol()): # print DEBIAN/control fields
         if n.lower() == 'description':  # increase indentation of long dsc
             lines = v.split('\n')
             short_dsc = lines[0]
diff --git a/examples/debtags/pkgwalk b/examples/debtags/pkgwalk
index b6cb8bf..0a145e0 100755
--- a/examples/debtags/pkgwalk
+++ b/examples/debtags/pkgwalk
@@ -11,11 +11,12 @@
 
 import sys
 
-# Requires python-extractor, python-magic and python-debtags
+# Requires python-extractor, python-magic, python-debtags and python-six
 from debian import debtags
 import re
 from optparse import OptionParser
 import apt
+import six
 
 
 VERSION="0.1"
@@ -73,7 +74,7 @@ if __name__ == '__main__':
 		# Divide every tag score by the number of packages in the trail,
 		# obtaining a 'tag weight'.  A package can be later scored by summing
 		# the weight of all its tags.
-		for tag in tagscores.iterkeys():
+		for tag in six.iterkeys(tagscores):
 			tagscores[tag] = float(tagscores[tag]) / float(len(trail))
 
 		# Find the merged tagset of the packages in trail
diff --git a/lib/debian/changelog.py b/lib/debian/changelog.py
index df2ccdf..2bf6cc4 100644
--- a/lib/debian/changelog.py
+++ b/lib/debian/changelog.py
@@ -29,6 +29,8 @@ import re
 import socket
 import warnings
 
+import six
+
 from debian import debian_support
 
 # Python 3 doesn't have StandardError, but let's avoid changing our
@@ -101,7 +103,7 @@ class ChangeBlock(object):
 
     def other_keys_normalised(self):
         norm_dict = {}
-        for (key, value) in other_pairs.iteritems():
+        for (key, value) in six.iteritems(self.other_pairs):
             key = key[0].upper() + key[1:].lower()
             m = xbcs_re.match(key)
             if m is None:
@@ -150,7 +152,7 @@ class ChangeBlock(object):
         if self.urgency is None:
             raise ChangelogCreateError("Urgency not specified")
         block += "urgency=" + self.urgency + self.urgency_comment
-        for (key, value) in self.other_pairs.iteritems():
+        for (key, value) in six.iteritems(self.other_pairs):
             block += ", %s=%s" % (key, value)
         block += '\n'
         if self.changes() is None:
diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index c66c49f..ad715cd 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -54,6 +54,7 @@ except ImportError:
     _mapping_mixin = DictMixin
     _mutable_mapping_mixin = DictMixin
 
+import six
 
 GPGV_DEFAULT_KEYRINGS = frozenset(['/usr/share/keyrings/debian-keyring.gpg'])
 GPGV_EXECUTABLE = '/usr/bin/gpgv'
@@ -186,7 +187,7 @@ class Deb822Dict(_mutable_mapping_mixin, object):
         if _parsed is not None:
             self.__parsed = _parsed
             if _fields is None:
-                self.__keys.extend([ _strI(k) for k in self.__parsed.iterkeys() ])
+                self.__keys.extend([ _strI(k) for k in six.iterkeys(self.__parsed) ])
             else:
                 self.__keys.extend([ _strI(f) for f in _fields if f in self.__parsed ])
         
@@ -257,8 +258,8 @@ class Deb822Dict(_mutable_mapping_mixin, object):
         return '{%s}' % ', '.join(['%r: %r' % (k, v) for k, v in self.items()])
 
     def __eq__(self, other):
-        mykeys = sorted(self.iterkeys())
-        otherkeys = sorted(other.iterkeys())
+        mykeys = sorted(six.iterkeys(self))
+        otherkeys = sorted(six.iterkeys(other))
         if not mykeys == otherkeys:
             return False
 
@@ -459,7 +460,7 @@ class Deb822(Deb822Dict):
             # was explicitly specified
             encoding = self.encoding
 
-        for key in self.iterkeys():
+        for key in six.iterkeys(self):
             value = self.get_as_string(key)
             if not value or value[0] == '\n':
                 # Avoid trailing whitespace after "Field:" if it's on its own
diff --git a/lib/debian/debtags.py b/lib/debian/debtags.py
index f13bead..627228d 100644
--- a/lib/debian/debtags.py
+++ b/lib/debian/debtags.py
@@ -21,6 +21,8 @@ try:
 except ImportError:
     import pickle
 
+import six
+
 from debian.deprecation import function_deprecated_by
 
 def parse_tags(input):
@@ -263,7 +265,7 @@ class DB:
 		"""
 		res = DB()
 		db = {}
-		for pkg in filter(package_filter, self.db.iterkeys()):
+		for pkg in filter(package_filter, six.iterkeys(self.db)):
 			db[pkg] = self.db[pkg]
 		res.db = db
 		res.rdb = reverse(db)
@@ -279,7 +281,7 @@ class DB:
 		"""
 		res = DB()
 		db = {}
-		for pkg in filter(filter, self.db.iterkeys()):
+		for pkg in filter(filter, six.iterkeys(self.db)):
 			db[pkg] = self.db[pkg].copy()
 		res.db = db
 		res.rdb = reverse(db)
@@ -295,7 +297,7 @@ class DB:
 		"""
 		res = DB()
 		db = {}
-		for pkg, tags in filter(package_tag_filter, self.db.iteritems()):
+		for pkg, tags in filter(package_tag_filter, six.iteritems(self.db)):
 			db[pkg] = self.db[pkg]
 		res.db = db
 		res.rdb = reverse(db)
@@ -311,7 +313,7 @@ class DB:
 		"""
 		res = DB()
 		db = {}
-		for pkg, tags in filter(package_tag_filter, self.db.iteritems()):
+		for pkg, tags in filter(package_tag_filter, six.iteritems(self.db)):
 			db[pkg] = self.db[pkg].copy()
 		res.db = db
 		res.rdb = reverse(db)
@@ -327,7 +329,7 @@ class DB:
 		"""
 		res = DB()
 		rdb = {}
-		for tag in filter(tag_filter, self.rdb.iterkeys()):
+		for tag in filter(tag_filter, six.iterkeys(self.rdb)):
 			rdb[tag] = self.rdb[tag]
 		res.rdb = rdb
 		res.db = reverse(rdb)
@@ -343,7 +345,7 @@ class DB:
 		"""
 		res = DB()
 		rdb = {}
-		for tag in filter(tag_filter, self.rdb.iterkeys()):
+		for tag in filter(tag_filter, six.iterkeys(self.rdb)):
 			rdb[tag] = self.rdb[tag].copy()
 		res.rdb = rdb
 		res.db = reverse(rdb)
@@ -420,25 +422,25 @@ class DB:
 
 	def iter_packages(self):
 		"""Iterate over the packages"""
-		return self.db.iterkeys()
+		return six.iterkeys(self.db)
 
 	iterPackages = function_deprecated_by(iter_packages)
 
 	def iter_tags(self):
 		"""Iterate over the tags"""
-		return self.rdb.iterkeys()
+		return six.iterkeys(self.rdb)
 
 	iterTags = function_deprecated_by(iter_tags)
 
 	def iter_packages_tags(self):
 		"""Iterate over 2-tuples of (pkg, tags)"""
-		return self.db.iteritems()
+		return six.iteritems(self.db)
 
 	iterPackagesTags = function_deprecated_by(iter_packages_tags)
 
 	def iter_tags_packages(self):
 		"""Iterate over 2-tuples of (tag, pkgs)"""
-		return self.rdb.iteritems()
+		return six.iteritems(self.rdb)
 
 	iterTagsPackages = function_deprecated_by(iter_tags_packages)
 
diff --git a/setup.py.in b/setup.py.in
index 0bf6475..67423a3 100644
--- a/setup.py.in
+++ b/setup.py.in
@@ -27,4 +27,5 @@ setup(name='python-debian',
       py_modules=['deb822'],
       maintainer='Debian python-debian Maintainers',
       maintainer_email='pkg-python-debian-maint@lists.alioth.debian.org',
+      install_requires=['six'],
      )
diff --git a/tests/test_deb822.py b/tests/test_deb822.py
index da90bea..eae7418 100755
--- a/tests/test_deb822.py
+++ b/tests/test_deb822.py
@@ -28,6 +28,8 @@ try:
 except ImportError:
     from io import StringIO
 
+import six
+
 sys.path.insert(0, '../lib/')
 
 from debian import deb822
@@ -308,9 +310,9 @@ class TestDeb822Dict(unittest.TestCase):
 
         keys = ['TestKey', 'another_key', 'Third_key']
 
-        self.assertEqual(keys, d.keys())
-        self.assertEqual(keys, list(d.iterkeys()))
-        self.assertEqual(zip(keys, d.values()), d.items())
+        self.assertEqual(keys, list(d.keys()))
+        self.assertEqual(keys, list(six.iterkeys(d)))
+        self.assertEqual(list(zip(keys, d.values())), list(d.items()))
 
         keys2 = []
         for key in d:
-- 
1.7.8.3

>From 316712414048b791d6dc72146126726067f2e314 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 00:26:30 +0000
Subject: [PATCH 19/31] Use six to paper over int/long differences between
 Python 2 and 3.

---
 lib/debian/arfile.py  |    6 +++++-
 tests/test_debfile.py |    6 +++++-
 2 files changed, 10 insertions(+), 2 deletions(-)
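
For review, a small sketch of the aliasing approach. It checks sys.version_info instead of six.PY3 purely to stay self-contained, and relative_position is a hypothetical helper that only mirrors the shape of ArMember.tell().

```python
import sys

if sys.version_info[0] >= 3:
    long = int  # Python 3 has a single integer type


def relative_position(cur, offset):
    """Hypothetical helper shaped like ArMember.tell()."""
    if cur < offset:
        # "0L" is a syntax error on Python 3; long(0) parses everywhere.
        return long(0)
    return cur - offset
```

Since long(0) == 0 on both versions, assertions comparing against it behave the same either way.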

diff --git a/lib/debian/arfile.py b/lib/debian/arfile.py
index a9b132a..2861633 100644
--- a/lib/debian/arfile.py
+++ b/lib/debian/arfile.py
@@ -15,6 +15,10 @@
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
+import six
+if six.PY3:
+    long = int
+
 GLOBAL_HEADER = "!<arch>\n"
 GLOBAL_HEADER_LENGTH = len(GLOBAL_HEADER)
 
@@ -280,7 +284,7 @@ class ArMember(object):
         cur = self.__fp.tell()
         
         if cur < self.__offset:
-            return 0L
+            return long(0)
         else:
             return cur - self.__offset
 
diff --git a/tests/test_debfile.py b/tests/test_debfile.py
index c677275..f4b01ad 100755
--- a/tests/test_debfile.py
+++ b/tests/test_debfile.py
@@ -25,6 +25,10 @@ import sys
 import tempfile
 import uu
 
+import six
+if six.PY3:
+    long = int
+
 sys.path.insert(0, '../lib/')
 
 from debian import arfile
@@ -69,7 +73,7 @@ class TestArFile(unittest.TestCase):
             self.assertEqual(m.tell(), i, "failed tell()")
             
             m.seek(-i, 1)
-            self.assertEqual(m.tell(), 0L, "failed tell()")
+            self.assertEqual(m.tell(), long(0), "failed tell()")
 
         m.seek(0)
         self.assertRaises(IOError, m.seek, -1, 0)
-- 
1.7.8.3

>From 578bf8edb5fb7cd113d3f7465866cbb9f0bafc22 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 00:30:41 +0000
Subject: [PATCH 20/31] Cope with the absence of a file class in Python 3.

---
 lib/debian/deb822.py |   16 +++++++++++++++-
 1 files changed, 15 insertions(+), 1 deletions(-)
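
For review, a sketch of how the Python 3 branch of this check behaves, assuming Python 3. The function body follows the diff; the three probes (a StringIO, a plain list, and os.devnull) are my own choice of inputs.

```python
import io
import os


def is_real_file(f):
    """Return True if f is an io object backed by a file descriptor."""
    if not isinstance(f, io.IOBase):
        return False
    try:
        f.fileno()
        return True
    except (AttributeError, io.UnsupportedOperation):
        return False


# In-memory streams are io.IOBase instances but have no descriptor.
in_memory = is_real_file(io.StringIO(u"Package: foo\n"))
# A list of lines is a valid sequence for the parser, but not a file.
not_a_file = is_real_file(["Package: foo\n"])
# Anything opened from the filesystem has a usable fileno().
with open(os.devnull) as devnull:
    on_disk = is_real_file(devnull)
```

Only the last case would be handed to apt_pkg.TagFile; the other two fall through to the pure-Python parser.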

diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index ad715cd..0f1a102 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -55,6 +55,20 @@ except ImportError:
     _mutable_mapping_mixin = DictMixin
 
 import six
+if six.PY3:
+    import io
+    def _is_real_file(f):
+        if not isinstance(f, io.IOBase):
+            return False
+        try:
+            f.fileno()
+            return True
+        except (AttributeError, io.UnsupportedOperation):
+            return False
+else:
+    def _is_real_file(f):
+        return isinstance(f, file) and hasattr(f, 'fileno')
+
 
 GPGV_DEFAULT_KEYRINGS = frozenset(['/usr/share/keyrings/debian-keyring.gpg'])
 GPGV_EXECUTABLE = '/usr/bin/gpgv'
@@ -334,7 +348,7 @@ class Deb822(Deb822Dict):
             necessary in order to properly interpret the strings.)
         """
 
-        if _have_apt_pkg and use_apt_pkg and isinstance(sequence, file):
+        if _have_apt_pkg and use_apt_pkg and _is_real_file(sequence):
             parser = apt_pkg.TagFile(sequence)
             for section in parser:
                 paragraph = cls(fields=fields,
-- 
1.7.8.3

>From 7fe694b2534987125ca251d3ae631d039b1d28b8 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 00:32:36 +0000
Subject: [PATCH 21/31] Python 3 renamed raw_input to input.

---
 examples/debtags/pkgwalk |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)
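
An alternative shape for the same fix, sketched for comparison only and not what this patch does: bind the right function once, then call it everywhere. read_line and wants_quit are hypothetical names.

```python
# raw_input exists only on Python 2; on Python 3, input() has the
# same behaviour (it returns the line as a string without evaluating it).
try:
    read_line = raw_input
except NameError:
    read_line = input


def wants_quit(answer):
    """Hypothetical helper: True if the stripped answer starts with 'q'."""
    return answer.strip()[:1] == "q"
```

A call such as wants_quit(read_line("> ")) would then replace the version check at each prompt. Slicing with [:1] also avoids an IndexError on empty input.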

diff --git a/examples/debtags/pkgwalk b/examples/debtags/pkgwalk
index 0a145e0..7da9a93 100755
--- a/examples/debtags/pkgwalk
+++ b/examples/debtags/pkgwalk
@@ -104,7 +104,10 @@ if __name__ == '__main__':
 
 		# Ask the user to choose a new package
 		while True:
-			ans = raw_input("> ").strip()
+			if six.PY3:
+				ans = input("> ").strip()
+			else:
+				ans = raw_input("> ").strip()
 			if ans[0] == 'q':
 				done = True
 				break
-- 
1.7.8.3

>From 003cbd259086e3fcc1ea07449a47ee8eb6438581 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 00:44:09 +0000
Subject: [PATCH 22/31] Be much more careful about closing files in a timely
 fashion, avoiding ResourceWarnings with Python 3.2.

---
 lib/debian/debfile.py |    7 ++++++
 tests/test_deb822.py  |   50 +++++++++++++++++++++++++++++++-----------------
 tests/test_debfile.py |   11 ++++++---
 tests/test_debtags.py |    3 +-
 4 files changed, 48 insertions(+), 23 deletions(-)
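
For review, a minimal sketch of the pattern applied throughout this patch. The scratch file and its contents are made up for the illustration.

```python
import os
import tempfile

# Create a small scratch file to read back (contents are illustrative).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as out:
    out.write("Package: example\n\nPackage: other\n")

# The with block closes the file as soon as the body finishes, rather
# than whenever the garbage collector gets round to it.
with open(path) as f:
    lines = [line.rstrip() for line in f]
closed_after_block = f.closed

os.unlink(path)
```

Where a test consumes iter_paragraphs() lazily, the file has to stay open until the last paragraph is read, which is presumably why those tests call close() explicitly at the end instead.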

diff --git a/lib/debian/debfile.py b/lib/debian/debfile.py
index 02ab368..8fac890 100644
--- a/lib/debian/debfile.py
+++ b/lib/debian/debfile.py
@@ -139,6 +139,9 @@ class DebPart(object):
     def __getitem__(self, fname):
         return self.get_content(fname)
 
+    def close(self):
+        self.__member.close()
+
 
 class DebData(DebPart):
 
@@ -273,6 +276,10 @@ class DebFile(ArFile):
                 return Changelog(raw_changelog)
         return None
 
+    def close(self):
+        self.control.close()
+        self.data.close()
+
 
 if __name__ == '__main__':
     import sys
diff --git a/tests/test_deb822.py b/tests/test_deb822.py
index eae7418..d44477e 100755
--- a/tests/test_deb822.py
+++ b/tests/test_deb822.py
@@ -692,11 +692,15 @@ Description: python modules to work with Debian-related data formats
         objects = []
         objects.append(deb822.Deb822(UNPARSED_PACKAGE))
         objects.append(deb822.Deb822(CHANGES_FILE))
-        objects.extend(deb822.Deb822.iter_paragraphs(open('test_Packages')))
-        objects.extend(deb822.Packages.iter_paragraphs(open('test_Packages')))
-        objects.extend(deb822.Deb822.iter_paragraphs(open('test_Sources')))
-        objects.extend(deb822.Deb822.iter_paragraphs(
-                         open('test_Sources.iso8859-1'), encoding="iso8859-1"))
+        with open('test_Packages') as f:
+            objects.extend(deb822.Deb822.iter_paragraphs(f))
+        with open('test_Packages') as f:
+            objects.extend(deb822.Packages.iter_paragraphs(f))
+        with open('test_Sources') as f:
+            objects.extend(deb822.Deb822.iter_paragraphs(f))
+        with open('test_Sources.iso8859-1') as f:
+            objects.extend(deb822.Deb822.iter_paragraphs(
+                f, encoding="iso8859-1"))
         for d in objects:
             for value in d.values():
                 self.assertTrue(isinstance(value, unicode))
@@ -707,17 +711,19 @@ Description: python modules to work with Debian-related data formats
         multi.append(deb822.Changes(CHANGES_FILE))
         multi.append(deb822.Changes(SIGNED_CHECKSUM_CHANGES_FILE
                                     % CHECKSUM_CHANGES_FILE))
-        multi.extend(deb822.Sources.iter_paragraphs(open('test_Sources')))
+        with open('test_Sources') as f:
+            multi.extend(deb822.Sources.iter_paragraphs(f))
         for d in multi:
             for key, value in d.items():
                 if key.lower() not in d.__class__._multivalued_fields:
                     self.assertTrue(isinstance(value, unicode))
 
     def test_encoding_integrity(self):
-        utf8 = list(deb822.Deb822.iter_paragraphs(open('test_Sources')))
-        latin1 = list(deb822.Deb822.iter_paragraphs(
-                                                open('test_Sources.iso8859-1'),
-                                                encoding='iso8859-1'))
+        with open('test_Sources') as f:
+            utf8 = list(deb822.Deb822.iter_paragraphs(f))
+        with open('test_Sources.iso8859-1') as f:
+            latin1 = list(deb822.Deb822.iter_paragraphs(
+                f, encoding='iso8859-1'))
 
         # dump() with no fd returns a unicode object - both should be identical
         self.assertEqual(len(utf8), len(latin1))
@@ -726,10 +732,10 @@ Description: python modules to work with Debian-related data formats
 
         # XXX: The way multiline fields parsing works, we can't guarantee
         # that trailing whitespace is reproduced.
-        utf8_contents = "\n".join([line.rstrip() for line in
-                                   open('test_Sources')] + [''])
-        latin1_contents = "\n".join([line.rstrip() for line in
-                                     open('test_Sources.iso8859-1')] + [''])
+        with open('test_Sources') as f:
+            utf8_contents = "\n".join([line.rstrip() for line in f] + [''])
+        with open('test_Sources.iso8859-1') as f:
+            latin1_contents = "\n".join([line.rstrip() for line in f] + [''])
 
         utf8_to_latin1 = StringIO()
         for d in utf8:
@@ -757,8 +763,10 @@ Description: python modules to work with Debian-related data formats
         warnings.filterwarnings(action='ignore', category=UnicodeWarning)
 
         filename = 'test_Sources.mixed_encoding'
-        for paragraphs in [deb822.Sources.iter_paragraphs(open(filename)),
-                           deb822.Sources.iter_paragraphs(open(filename),
+        f1 = open(filename, 'rb')
+        f2 = open(filename, 'rb')
+        for paragraphs in [deb822.Sources.iter_paragraphs(f1),
+                           deb822.Sources.iter_paragraphs(f2,
                                                           use_apt_pkg=False)]:
             p1 = paragraphs.next()
             self.assertEqual(p1['maintainer'],
@@ -766,6 +774,8 @@ Description: python modules to work with Debian-related data formats
             p2 = paragraphs.next()
             self.assertEqual(p2['uploaders'],
                              u'Frank Küster <frank@debian.org>')
+        f2.close()
+        f1.close()
 
     def test_bug597249_colon_as_first_value_character(self):
         """Colon should be allowed as the first value character. See #597249.
@@ -822,7 +832,8 @@ Description: python modules to work with Debian-related data formats
 class TestPkgRelations(unittest.TestCase):
 
     def test_packages(self):
-        pkgs = deb822.Packages.iter_paragraphs(open('test_Packages'))
+        f = open('test_Packages')
+        pkgs = deb822.Packages.iter_paragraphs(f)
         pkg1 = pkgs.next()
         rel1 = {'breaks': [],
                 'conflicts': [],
@@ -882,6 +893,7 @@ class TestPkgRelations(unittest.TestCase):
             [{'arch': None, 'name': 'kwifimanager', 'version': ('>=', '4:3.5.9-2')}],
             [{'arch': None, 'name': 'librss1', 'version': ('>=', '4:3.5.9-2')}]]
         self.assertEqual(dep3, pkg3.relations['depends'])
+        f.close()
 
         bin_rels = ['file, libc6 (>= 2.7-1), libpaper1, psutils']
         src_rels = ['apache2-src (>= 2.2.9), libaprutil1-dev, ' \
@@ -897,7 +909,8 @@ class TestPkgRelations(unittest.TestCase):
                             src_rel)))
 
     def test_sources(self):
-        pkgs = deb822.Sources.iter_paragraphs(open('test_Sources'))
+        f = open('test_Sources')
+        pkgs = deb822.Sources.iter_paragraphs(f)
         pkg1 = pkgs.next()
         rel1 = {'build-conflicts': [],
                 'build-conflicts-indep': [],
@@ -936,6 +949,7 @@ class TestPkgRelations(unittest.TestCase):
                     [{'name': 'binutils-doc', 'version': None, 'arch': None}],
                     [{'name': 'binutils-source', 'version': None, 'arch': None}]]}
         self.assertEqual(rel2, pkg2.relations)
+        f.close()
 
 
 class TestGpgInfo(unittest.TestCase):
diff --git a/tests/test_debfile.py b/tests/test_debfile.py
index f4b01ad..2ed9cae 100755
--- a/tests/test_debfile.py
+++ b/tests/test_debfile.py
@@ -39,8 +39,8 @@ class TestArFile(unittest.TestCase):
     def setUp(self):
         os.system("ar r test.ar test_debfile.py test_changelog test_deb822.py >/dev/null 2>&1") 
         assert os.path.exists("test.ar")
-        self.testmembers = [ x.strip()
-                for x in os.popen("ar t test.ar").readlines() ]
+        with os.popen("ar t test.ar") as ar:
+            self.testmembers = [x.strip() for x in ar.readlines()]
         self.a = arfile.ArFile("test.ar")
 
     def tearDown(self):
@@ -79,6 +79,7 @@ class TestArFile(unittest.TestCase):
         self.assertRaises(IOError, m.seek, -1, 0)
         self.assertRaises(IOError, m.seek, -1, 1)
         m.seek(0)
+        m.close()
     
     def test_file_read(self):
         """ test for faked read """
@@ -128,6 +129,7 @@ class TestDebFile(unittest.TestCase):
         self.d = debfile.DebFile(self.debname)
 
     def tearDown(self):
+        self.d.close()
         os.unlink(self.debname)
         os.unlink(self.broken_debname)
         os.unlink(self.bz2_debname)
@@ -142,6 +144,7 @@ class TestDebFile(unittest.TestCase):
         # can access its content
         self.assertEqual(os.path.normpath(bz2_deb.data.tgz().getnames()[10]),
                          os.path.normpath('./usr/share/locale/bg/'))
+        bz2_deb.close()
 
     def test_data_names(self):
         """ test for file list equality """ 
@@ -156,8 +159,8 @@ class TestDebFile(unittest.TestCase):
 
     def test_control(self):
         """ test for control equality """
-        filecontrol = "".join(os.popen("dpkg-deb -f %s" %
-            self.debname).readlines())
+        with os.popen("dpkg-deb -f %s" % self.debname) as dpkg_deb:
+            filecontrol = "".join(dpkg_deb.readlines())
 
         self.assertEqual(self.d.control.get_content("control"), filecontrol)
 
diff --git a/tests/test_debtags.py b/tests/test_debtags.py
index 27de759..252d8d0 100755
--- a/tests/test_debtags.py
+++ b/tests/test_debtags.py
@@ -26,7 +26,8 @@ from debian import debtags
 class TestDebtags(unittest.TestCase):
     def mkdb(self):
         db = debtags.DB()
-        db.read(open("test_tagdb", "r"))
+        with open("test_tagdb", "r") as f:
+            db.read(f)
         return db
 
     def test_insert(self):
-- 
1.7.8.3

>From 3c86ac581770f2fe6f218c425312a32e9a93ade2 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 00:57:37 +0000
Subject: [PATCH 23/31] Use six to paper over iterator.next() vs.
 next(iterator) differences between Python 2 and 3.

---
 lib/debian/deb822.py |    2 +-
 tests/test_deb822.py |   16 ++++++++--------
 2 files changed, 9 insertions(+), 9 deletions(-)
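
For review, a sketch of the iterator protocol difference. It uses the built-in next(), available since Python 2.6, which as far as I know is what six.advance_iterator resolves to. The generator is a hypothetical stand-in for iter_paragraphs().

```python
def paragraphs():
    """Hypothetical generator standing in for iter_paragraphs()."""
    yield "first"
    yield "second"


gen = paragraphs()

# gen.next() only exists on Python 2; next(gen) works on 2.6+ and 3.
p1 = next(gen)
p2 = next(gen)

# Exhaustion is signalled the same way on both versions.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True
```

Passing the function and the iterator separately, as in assertRaises(StopIteration, six.advance_iterator, generator), replaces the old bound-method form generator.next.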

diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index 0f1a102..d183622 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -796,7 +796,7 @@ class GpgInfo(dict):
         # Peek at the first line to see if it's newline-terminated.
         sequence_iter = iter(sequence)
         try:
-            first_line = sequence_iter.next()
+            first_line = six.advance_iterator(sequence_iter)
         except StopIteration:
             return ""
         join_str = '\n'
diff --git a/tests/test_deb822.py b/tests/test_deb822.py
index d44477e..53804c5 100755
--- a/tests/test_deb822.py
+++ b/tests/test_deb822.py
@@ -488,7 +488,7 @@ class TestDeb822(unittest.TestCase):
 
     def test_iter_paragraphs_empty_input(self):
         generator = deb822.Deb822.iter_paragraphs([])
-        self.assertRaises(StopIteration, generator.next)
+        self.assertRaises(StopIteration, six.advance_iterator, generator)
 
     def test_parser_limit_fields(self):
         wanted_fields = [ 'Package', 'MD5sum', 'Filename', 'Description' ]
@@ -768,10 +768,10 @@ Description: python modules to work with Debian-related data formats
         for paragraphs in [deb822.Sources.iter_paragraphs(f1),
                            deb822.Sources.iter_paragraphs(f2,
                                                           use_apt_pkg=False)]:
-            p1 = paragraphs.next()
+            p1 = six.advance_iterator(paragraphs)
             self.assertEqual(p1['maintainer'],
                              u'Adeodato Simó <dato@net.com.org.es>')
-            p2 = paragraphs.next()
+            p2 = six.advance_iterator(paragraphs)
             self.assertEqual(p2['uploaders'],
                              u'Frank Küster <frank@debian.org>')
         f2.close()
@@ -834,7 +834,7 @@ class TestPkgRelations(unittest.TestCase):
     def test_packages(self):
         f = open('test_Packages')
         pkgs = deb822.Packages.iter_paragraphs(f)
-        pkg1 = pkgs.next()
+        pkg1 = six.advance_iterator(pkgs)
         rel1 = {'breaks': [],
                 'conflicts': [],
                 'depends': [[{'name': 'file', 'version': None, 'arch': None}],
@@ -860,7 +860,7 @@ class TestPkgRelations(unittest.TestCase):
                     [{'name': 't1-cyrillic', 'version': None, 'arch': None}],
                     [{'name': 'texlive-base-bin', 'version': None, 'arch': None}]]}
         self.assertEqual(rel1, pkg1.relations)
-        pkg2 = pkgs.next()
+        pkg2 = six.advance_iterator(pkgs)
         rel2 = {'breaks': [],
                 'conflicts': [],
                 'depends': [[{'name': 'lrzsz', 'version': None, 'arch': None}],
@@ -877,7 +877,7 @@ class TestPkgRelations(unittest.TestCase):
                 'replaces': [],
                 'suggests': []}
         self.assertEqual(rel2, pkg2.relations)
-        pkg3 = pkgs.next()
+        pkg3 = six.advance_iterator(pkgs)
         dep3 = [[{'arch': None, 'name': 'dcoprss', 'version': ('>=', '4:3.5.9-2')}],
             [{'arch': None, 'name': 'kdenetwork-kfile-plugins', 'version': ('>=', '4:3.5.9-2')}],
             [{'arch': None, 'name': 'kdict', 'version': ('>=', '4:3.5.9-2')}],
@@ -911,7 +911,7 @@ class TestPkgRelations(unittest.TestCase):
     def test_sources(self):
         f = open('test_Sources')
         pkgs = deb822.Sources.iter_paragraphs(f)
-        pkg1 = pkgs.next()
+        pkg1 = six.advance_iterator(pkgs)
         rel1 = {'build-conflicts': [],
                 'build-conflicts-indep': [],
                 'build-depends': [[{'name': 'apache2-src', 'version': ('>=', '2.2.9'), 'arch': None}],
@@ -924,7 +924,7 @@ class TestPkgRelations(unittest.TestCase):
                 'build-depends-indep': [],
                 'binary': [[{'name': 'apache2-mpm-itk', 'version': None, 'arch': None}]]}
         self.assertEqual(rel1, pkg1.relations)
-        pkg2 = pkgs.next()
+        pkg2 = six.advance_iterator(pkgs)
         rel2 = {'build-conflicts': [],
                 'build-conflicts-indep': [],
                 'build-depends': [[{'name': 'dpkg-dev', 'version': ('>=', '1.13.9'), 'arch': None}],
-- 
1.7.8.3

>From 53908952c9b5f7969473384cfd900e45d49b1c1b Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 00:59:13 +0000
Subject: [PATCH 24/31] Use string.ascii_letters rather than the deprecated
 string.letters.

---
 tests/test_deb822.py |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
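
For review, a standalone version of the helper after this change, written as a plain function rather than a staticmethod.

```python
import random
import string


def gen_random_string(length=20):
    # string.letters was locale-dependent and is gone in Python 3;
    # string.ascii_letters is available on both major versions.
    chars = string.ascii_letters + string.digits
    return "".join(random.choice(chars) for _ in range(length))


sample = gen_random_string()
```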

diff --git a/tests/test_deb822.py b/tests/test_deb822.py
index 53804c5..98e8a48 100755
--- a/tests/test_deb822.py
+++ b/tests/test_deb822.py
@@ -346,7 +346,7 @@ class TestDeb822(unittest.TestCase):
     def gen_random_string(length=20):
         from random import choice
         import string
-        chars = string.letters + string.digits
+        chars = string.ascii_letters + string.digits
         return ''.join([choice(chars) for i in range(length)])
     gen_random_string = staticmethod(gen_random_string)
 
-- 
1.7.8.3

>From 0d3af2ffa23a77d99fc67bb311cc6f60e5cc3488 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 01:05:17 +0000
Subject: [PATCH 25/31] In Python 3, encode Unicode strings before passing
 them to hashlib.sha1().

---
 lib/debian/debian_support.py |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)
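
For review, a sketch showing that text and pre-encoded input hash identically after this change. The function body follows the diff; the sample lines are made up.

```python
import hashlib


def read_lines_sha1(lines):
    m = hashlib.sha1()
    for line in lines:
        if isinstance(line, bytes):
            m.update(line)
        else:
            # hashlib only accepts bytes on Python 3, so encode text first.
            m.update(line.encode("UTF-8"))
    return m.hexdigest()


text_digest = read_lines_sha1([u"caf\u00e9\n", u"b\n"])
bytes_digest = read_lines_sha1([u"caf\u00e9\n".encode("UTF-8"), b"b\n"])
```

On Python 2, str is bytes, so byte strings take the first branch unchanged and only unicode objects are encoded.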

diff --git a/lib/debian/debian_support.py b/lib/debian/debian_support.py
index a51dbd8..2146a21 100644
--- a/lib/debian/debian_support.py
+++ b/lib/debian/debian_support.py
@@ -408,7 +408,10 @@ del list_releases
 def read_lines_sha1(lines):
     m = hashlib.sha1()
     for l in lines:
-        m.update(l)
+        if isinstance(l, bytes):
+            m.update(l)
+        else:
+            m.update(l.encode("UTF-8"))
     return m.hexdigest()
 
 readLinesSHA1 = function_deprecated_by(read_lines_sha1)
-- 
1.7.8.3

>From df7aba663141264e1f21f14dfe2bceddc09a50e1 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 01:08:23 +0000
Subject: [PATCH 26/31] Fix up debian.changelog for string handling changes in
 Python 3.

---
 lib/debian/changelog.py |   44 +++++++++++++++++++++++++++++++++-----------
 tests/test_changelog.py |   37 ++++++++++++++++++++-----------------
 2 files changed, 53 insertions(+), 28 deletions(-)

diff --git a/lib/debian/changelog.py b/lib/debian/changelog.py
index 2bf6cc4..ce80c6a 100644
--- a/lib/debian/changelog.py
+++ b/lib/debian/changelog.py
@@ -137,7 +137,7 @@ class ChangeBlock(object):
                 changes.append(change)
             self._changes = changes
 
-    def __unicode__(self):
+    def _format(self):
         # TODO(jsw): Switch to StringIO or a list to join at the end.
         block = ""
         if self.package is None:
@@ -170,8 +170,18 @@ class ChangeBlock(object):
             block += line + "\n"
         return block
 
-    def __str__(self):
-        return unicode(self).encode(self._encoding)
+    if six.PY3:
+        def __str__(self):
+            return self._format()
+
+        def __bytes__(self):
+            return str(self).encode(self._encoding)
+    else:
+        def __unicode__(self):
+            return self._format()
+
+        def __str__(self):
+            return unicode(self).encode(self._encoding)
 
 topline = re.compile(r'^(\w%(name_chars)s*) \(([^\(\) \t]+)\)'
                      '((\s+%(name_chars)s+)+)\;'
@@ -268,7 +278,9 @@ class Changelog(object):
         
         state = first_heading
         old_state = None
-        if isinstance(file, basestring):
+        if isinstance(file, bytes):
+            file = file.decode(encoding)
+        if isinstance(file, six.string_types):
             # Make sure the changelog file is not empty.
             if len(file.strip()) == 0:
                 self._parse_error('Empty changelog file.', strict)
@@ -276,7 +288,7 @@ class Changelog(object):
 
             file = file.splitlines()
         for line in file:
-            if not isinstance(line, unicode):
+            if not isinstance(line, six.text_type):
                 line = line.decode(encoding)
             # Support both lists of lines without the trailing newline and
             # those with trailing newlines (e.g. when given a file object
@@ -468,15 +480,25 @@ class Changelog(object):
     def _raw_versions(self):
         return [block._raw_version for block in self._blocks]
 
-    def __unicode__(self):
+    def _format(self):
         pieces = []
-        pieces.append(u'\n'.join(self.initial_blank_lines))
+        pieces.append(six.u('\n').join(self.initial_blank_lines))
         for block in self._blocks:
-            pieces.append(unicode(block))
-        return u''.join(pieces)
+            pieces.append(six.text_type(block))
+        return six.u('').join(pieces)
 
-    def __str__(self):
-        return unicode(self).encode(self._encoding)
+    if six.PY3:
+        def __str__(self):
+            return self._format()
+
+        def __bytes__(self):
+            return str(self).encode(self._encoding)
+    else:
+        def __unicode__(self):
+            return self._format()
+
+        def __str__(self):
+            return unicode(self).encode(self._encoding)
 
     def __iter__(self):
         return iter(self._blocks)
diff --git a/tests/test_changelog.py b/tests/test_changelog.py
index 361b0ac..b48ffb2 100755
--- a/tests/test_changelog.py
+++ b/tests/test_changelog.py
@@ -27,6 +27,8 @@
 import sys
 import unittest
 
+import six
+
 sys.path.insert(0, '../lib/')
 
 from debian import changelog
@@ -187,42 +189,43 @@ class ChangelogTests(unittest.TestCase):
         f = open('test_changelog_unicode')
         c = changelog.Changelog(f)
         f.close()
-        u = unicode(c)
-        expected_u = u"""haskell-src-exts (1.8.2-3) unstable; urgency=low
+        u = six.text_type(c)
+        expected_u = six.u("""haskell-src-exts (1.8.2-3) unstable; urgency=low
 
   * control: Use versioned Replaces: and Conflicts:
 
- -- Marco Túlio Gontijo e Silva <marcot@debian.org>  Wed, 05 May 2010 18:01:53 -0300
+ -- Marco T\xfalio Gontijo e Silva <marcot@debian.org>  Wed, 05 May 2010 18:01:53 -0300
 
 haskell-src-exts (1.8.2-2) unstable; urgency=low
 
   * debian/control: Rename -doc package.
 
- -- Marco Túlio Gontijo e Silva <marcot@debian.org>  Tue, 16 Mar 2010 10:59:48 -0300
-"""
+ -- Marco T\xfalio Gontijo e Silva <marcot@debian.org>  Tue, 16 Mar 2010 10:59:48 -0300
+""")
         self.assertEqual(u, expected_u)
-        self.assertEqual(str(c), u.encode('utf-8'))
+        self.assertEqual(bytes(c), u.encode('utf-8'))
 
     def test_unicode_object_input(self):
-        f = open('test_changelog_unicode')
-        c_str = f.read()
+        f = open('test_changelog_unicode', 'rb')
+        c_bytes = f.read()
         f.close()
-        c_unicode = c_str.decode('utf-8')
+        c_unicode = c_bytes.decode('utf-8')
         c = changelog.Changelog(c_unicode)
-        self.assertEqual(unicode(c), c_unicode)
-        self.assertEqual(str(c), c_str)
+        self.assertEqual(six.text_type(c), c_unicode)
+        self.assertEqual(bytes(c), c_bytes)
 
     def test_non_utf8_encoding(self):
-        f = open('test_changelog_unicode')
-        c_str = f.read()
+        f = open('test_changelog_unicode', 'rb')
+        c_bytes = f.read()
         f.close()
-        c_unicode = c_str.decode('utf-8')
+        c_unicode = c_bytes.decode('utf-8')
         c_latin1_str = c_unicode.encode('latin1')
         c = changelog.Changelog(c_latin1_str, encoding='latin1')
-        self.assertEqual(unicode(c), c_unicode)
-        self.assertEqual(str(c), c_latin1_str)
+        self.assertEqual(six.text_type(c), c_unicode)
+        self.assertEqual(bytes(c), c_latin1_str)
         for block in c:
-            self.assertEqual(str(block), unicode(block).encode('latin1'))
+            self.assertEqual(bytes(block),
+                             six.text_type(block).encode('latin1'))
 
     def test_block_iterator(self):
         f = open('test_changelog')
-- 
1.7.8.3
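
Review note on patch 26: a sketch of the Python 3 branch of the __str__/__bytes__ split, using a hypothetical Entry class in place of ChangeBlock and Changelog. Only the method layout mirrors the patch; the class name and constructor are assumptions for illustration.

```python
class Entry(object):
    """Hypothetical stand-in for ChangeBlock/Changelog (Python 3 branch)."""

    def __init__(self, text, encoding='utf-8'):
        self._text = text
        self._encoding = encoding

    def _format(self):
        # Shared formatter: always returns a Unicode string.
        return self._text

    def __str__(self):
        # In Python 3, str() must return text.
        return self._format()

    def __bytes__(self):
        # bytes() applies the encoding the object was created with.
        return str(self).encode(self._encoding)


entry = Entry(' -- Marco T\xfalio Gontijo e Silva\n', encoding='latin1')
as_text = str(entry)
as_bytes = bytes(entry)
```

Under Python 2 the same _format() would instead back __unicode__, with __str__ returning the encoded form, which is why the tests switch from str(c) to bytes(c).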

>From 465decf5ccc1ac2fc7629a38a328f9e769888a47 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 01:11:02 +0000
Subject: [PATCH 27/31] Only define DebPart.has_key method for Python 2.

---
 lib/debian/debfile.py |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/debian/debfile.py b/lib/debian/debfile.py
index 8fac890..68188ba 100644
--- a/lib/debian/debfile.py
+++ b/lib/debian/debfile.py
@@ -18,6 +18,8 @@
 import gzip
 import tarfile
 
+import six
+
 from debian.arfile import ArFile, ArError
 from debian.changelog import Changelog
 from debian.deb822 import Deb822
@@ -133,8 +135,9 @@ class DebPart(object):
     def __contains__(self, fname):
         return self.has_file(fname)
 
-    def has_key(self, fname):
-        return self.has_file(fname)
+    if not six.PY3:
+        def has_key(self, fname):
+            return self.has_file(fname)
 
     def __getitem__(self, fname):
         return self.get_content(fname)
-- 
1.7.8.3

>From 665813dc01c60d75f86015c760cbffcaab32d406 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 01:14:39 +0000
Subject: [PATCH 28/31] Fix up debian.arfile and debian.debfile for string
 handling changes in Python 3, involving adding
 encoding= and errors= parameters in a number of
 places.  Loosely inspired by tarfile.

---
 lib/debian/arfile.py  |   61 ++++++++++++++++++++++++++++++++++++------------
 lib/debian/debfile.py |   62 +++++++++++++++++++++++++++++++++++++------------
 tests/test_debfile.py |   27 ++++++++++++++-------
 3 files changed, 111 insertions(+), 39 deletions(-)

diff --git a/lib/debian/arfile.py b/lib/debian/arfile.py
index 2861633..5d5ce4e 100644
--- a/lib/debian/arfile.py
+++ b/lib/debian/arfile.py
@@ -15,15 +15,17 @@
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
+import sys
+
 import six
 if six.PY3:
     long = int
 
-GLOBAL_HEADER = "!<arch>\n"
+GLOBAL_HEADER = b"!<arch>\n"
 GLOBAL_HEADER_LENGTH = len(GLOBAL_HEADER)
 
 FILE_HEADER_LENGTH = 60
-FILE_MAGIC = "`\n"
+FILE_MAGIC = b"`\n"
 
 class ArError(Exception):
     pass
@@ -38,14 +40,30 @@ class ArFile(object):
         - members       same as getmembers()
     """
 
-    def __init__(self, filename=None, mode='r', fileobj=None):
+    def __init__(self, filename=None, mode='r', fileobj=None,
+                 encoding=None, errors=None):
         """ Build an ar file representation starting from either a filename or
-        an existing file object. The only supported mode is 'r' """
+        an existing file object. The only supported mode is 'r'.
+
+        In Python 3, the encoding and errors parameters control how member
+        names are decoded into Unicode strings. Like tarfile, the default
+        encoding is sys.getfilesystemencoding() and the default error handling
+        scheme is 'surrogateescape' (>= 3.2) or 'strict' (< 3.2).
+        """
 
         self.__members = [] 
         self.__members_dict = {}
         self.__fname = filename
         self.__fileobj = fileobj
+        if encoding is None:
+            encoding = sys.getfilesystemencoding()
+        self.__encoding = encoding
+        if errors is None:
+            if sys.version_info >= (3, 2):
+                errors = 'surrogateescape'
+            else:
+                errors = 'strict'
+        self.__errors = errors
         
         if mode == "r":
             self.__index_archive()
@@ -63,7 +81,9 @@ class ArFile(object):
             raise ArError("Unable to find global header")
 
         while True:
-            newmember = ArMember.from_file(fp, self.__fname)
+            newmember = ArMember.from_file(fp, self.__fname,
+                                           encoding=self.__encoding,
+                                           errors=self.__errors)
             if not newmember:
                 break
             self.__members.append(newmember)
@@ -164,7 +184,7 @@ class ArMember(object):
         self.__offset = None    # start-of-data offset
         self.__end = None       # end-of-data offset
 
-    def from_file(fp, fname):
+    def from_file(fp, fname, encoding=None, errors=None):
         """fp is an open File object positioned on a valid file header inside
         an ar archive. Return a new ArMember on success, None otherwise. """
 
@@ -180,6 +200,15 @@ class ArMember(object):
         if buf[58:60] != FILE_MAGIC:
             raise IOError("Incorrect file magic")
 
+        if six.PY3:
+            if encoding is None:
+                encoding = sys.getfilesystemencoding()
+            if errors is None:
+                if sys.version_info >= (3, 2):
+                    errors = 'surrogateescape'
+                else:
+                    errors = 'strict'
+
         # http://en.wikipedia.org/wiki/Ar_(Unix)    
         #from   to     Name                      Format
         #0      15     File name                 ASCII
@@ -192,7 +221,9 @@ class ArMember(object):
 
         # XXX struct.unpack can be used as well here
         f = ArMember()
-        f.__name = buf[0:16].split("/")[0].strip()
+        f.__name = buf[0:16].split(b"/")[0].strip()
+        if six.PY3:
+            f.__name = f.__name.decode(encoding, errors)
         f.__mtime = int(buf[16:28])
         f.__owner = int(buf[28:34])
         f.__group = int(buf[34:40])
@@ -212,7 +243,7 @@ class ArMember(object):
     # XXX this is not a sequence like file objects
     def read(self, size=0):
         if self.__fp is None:
-            self.__fp = open(self.__fname, "r")
+            self.__fp = open(self.__fname, "rb")
             self.__fp.seek(self.__offset)
 
         cur = self.__fp.tell()
@@ -221,31 +252,31 @@ class ArMember(object):
             return self.__fp.read(size)
 
         if cur >= self.__end or cur < self.__offset:
-            return ''
+            return b''
 
         return self.__fp.read(self.__end - cur)
 
     def readline(self, size=None):
         if self.__fp is None:
-            self.__fp = open(self.__fname, "r")
+            self.__fp = open(self.__fname, "rb")
             self.__fp.seek(self.__offset)
 
         if size is not None: 
             buf = self.__fp.readline(size)
             if self.__fp.tell() > self.__end:
-                return ''
+                return b''
 
             return buf
 
         buf = self.__fp.readline()
         if self.__fp.tell() > self.__end:
-            return ''
+            return b''
         else:
             return buf
 
     def readlines(self, sizehint=0):
         if self.__fp is None:
-            self.__fp = open(self.__fname, "r")
+            self.__fp = open(self.__fname, "rb")
             self.__fp.seek(self.__offset)
         
         buf = None
@@ -260,7 +291,7 @@ class ArMember(object):
 
     def seek(self, offset, whence=0):
         if self.__fp is None:
-            self.__fp = open(self.__fname, "r")
+            self.__fp = open(self.__fname, "rb")
             self.__fp.seek(self.__offset)
 
         if self.__fp.tell() < self.__offset:
@@ -278,7 +309,7 @@ class ArMember(object):
 
     def tell(self):
         if self.__fp is None:
-            self.__fp = open(self.__fname, "r")
+            self.__fp = open(self.__fname, "rb")
             self.__fp.seek(self.__offset)
 
         cur = self.__fp.tell()
diff --git a/lib/debian/debfile.py b/lib/debian/debfile.py
index 68188ba..5c38afc 100644
--- a/lib/debian/debfile.py
+++ b/lib/debian/debfile.py
@@ -107,20 +107,42 @@ class DebPart(object):
         return (('./' + fname in names) \
                 or (fname in names)) # XXX python << 2.5 TarFile compatibility
 
-    def get_file(self, fname):
-        """Return a file object corresponding to a given file name."""
+    def get_file(self, fname, encoding=None, errors=None):
+        """Return a file object corresponding to a given file name.
+
+        If encoding is given, then the file object will return Unicode data;
+        otherwise, it will return binary data.
+        """
 
         fname = DebPart.__normalize_member(fname)
         try:
-            return (self.tgz().extractfile('./' + fname))
+            fobj = self.tgz().extractfile('./' + fname)
         except KeyError:    # XXX python << 2.5 TarFile compatibility
-            return (self.tgz().extractfile(fname))
-
-    def get_content(self, fname):
+            fobj = self.tgz().extractfile(fname)
+        if encoding is not None:
+            if six.PY3:
+                import io
+                if not hasattr(fobj, 'flush'):
+                    # XXX http://bugs.python.org/issue13815
+                    fobj.flush = lambda: None
+                return io.TextIOWrapper(fobj, encoding=encoding, errors=errors)
+            else:
+                import codecs
+                if errors is None:
+                    errors = 'strict'
+                return codecs.EncodedFile(fobj, encoding, errors=errors)
+        else:
+            return fobj
+
+    def get_content(self, fname, encoding=None, errors=None):
         """Return the string content of a given file, or None (e.g. for
-        directories)."""
+        directories).
 
-        f = self.get_file(fname)
+        If encoding is given, then the content will be a Unicode object;
+        otherwise, it will contain binary data.
+        """
+
+        f = self.get_file(fname, encoding=encoding, errors=errors)
         content = None
         if f:   # can be None for non regular or link files
             content = f.read()
@@ -173,24 +195,34 @@ class DebControl(DebPart):
 
         return Deb822(self.get_content(CONTROL_FILE))
 
-    def md5sums(self):
+    def md5sums(self, encoding=None, errors=None):
         """ Return a dictionary mapping filenames (of the data part) to
         md5sums. Fails if the control part does not contain a 'md5sum' file.
 
         Keys of the returned dictionary are the left-hand side values of lines
         in the md5sums member of control.tar.gz, usually file names relative to
-        the file system root (without heading '/' or './'). """
+        the file system root (without heading '/' or './').
+
+        The returned keys are Unicode objects if an encoding is specified,
+        otherwise binary. The returned values are always Unicode."""
 
         if not self.has_file(MD5_FILE):
             raise DebError("'%s' file not found, can't list MD5 sums" %
                     MD5_FILE)
 
-        md5_file = self.get_file(MD5_FILE)
+        md5_file = self.get_file(MD5_FILE, encoding=encoding, errors=errors)
         sums = {}
+        if encoding is None:
+            newline = b'\r\n'
+        else:
+            newline = '\r\n'
         for line in md5_file.readlines():
             # we need to support spaces in filenames, .split() is not enough
-            md5, fname = line.rstrip('\r\n').split(None, 1)
-            sums[fname] = md5
+            md5, fname = line.rstrip(newline).split(None, 1)
+            if six.PY3 and isinstance(md5, bytes):
+                sums[fname] = md5.decode()
+            else:
+                sums[fname] = md5
         md5_file.close()
         return sums
 
@@ -259,9 +291,9 @@ class DebFile(ArFile):
         """ See .control.scripts() """
         return self.control.scripts()
 
-    def md5sums(self):
+    def md5sums(self, encoding=None, errors=None):
         """ See .control.md5sums() """
-        return self.control.md5sums()
+        return self.control.md5sums(encoding=encoding, errors=errors)
 
     def changelog(self):
         """ Return a Changelog object for the changelog.Debian.gz of the
diff --git a/tests/test_debfile.py b/tests/test_debfile.py
index 2ed9cae..21a6367 100755
--- a/tests/test_debfile.py
+++ b/tests/test_debfile.py
@@ -84,7 +84,7 @@ class TestArFile(unittest.TestCase):
     def test_file_read(self):
         """ test for faked read """
         for m in self.a.getmembers():
-            f = open(m.name)
+            f = open(m.name, 'rb')
         
             for i in [10, 100, 10000]:
                 self.assertEqual(m.read(i), f.read(i))
@@ -96,7 +96,7 @@ class TestArFile(unittest.TestCase):
         """ test for faked readlines """
 
         for m in self.a.getmembers():
-            f = open(m.name)
+            f = open(m.name, 'rb')
         
             self.assertEqual(m.readlines(), f.readlines())
             
@@ -107,8 +107,8 @@ class TestDebFile(unittest.TestCase):
 
     def setUp(self):
         def uudecode(infile, outfile):
-            uu_deb = open(infile, 'r')
-            bin_deb = open(outfile, 'w')
+            uu_deb = open(infile, 'rb')
+            bin_deb = open(outfile, 'wb')
             uu.decode(uu_deb, bin_deb)
             uu_deb.close()
             bin_deb.close()
@@ -121,8 +121,8 @@ class TestDebFile(unittest.TestCase):
         uudecode('test-bz2.deb.uu', self.bz2_debname)
 
         self.debname = 'test.deb'
-        uu_deb = open('test.deb.uu', 'r')
-        bin_deb = open(self.debname, 'w')
+        uu_deb = open('test.deb.uu', 'rb')
+        bin_deb = open(self.debname, 'wb')
         uu.decode(uu_deb, bin_deb)
         uu_deb.close()
         bin_deb.close()
@@ -162,14 +162,23 @@ class TestDebFile(unittest.TestCase):
         with os.popen("dpkg-deb -f %s" % self.debname) as dpkg_deb:
             filecontrol = "".join(dpkg_deb.readlines())
 
-        self.assertEqual(self.d.control.get_content("control"), filecontrol)
+        self.assertEqual(
+            self.d.control.get_content("control").decode("utf-8"), filecontrol)
+        self.assertEqual(
+            self.d.control.get_content("control", encoding="utf-8"),
+            filecontrol)
 
     def test_md5sums(self):
         """test md5 extraction from .debs"""
         md5 = self.d.md5sums()
-        self.assertEqual(md5['usr/bin/hello'],
+        self.assertEqual(md5[b'usr/bin/hello'],
                 '9c1a72a78f82216a0305b6c90ab71058')
-        self.assertEqual(md5['usr/share/locale/zh_TW/LC_MESSAGES/hello.mo'],
+        self.assertEqual(md5[b'usr/share/locale/zh_TW/LC_MESSAGES/hello.mo'],
+                'a7356e05bd420872d03cd3f5369de42f')
+        md5 = self.d.md5sums(encoding='UTF-8')
+        self.assertEqual(md5[six.u('usr/bin/hello')],
+                '9c1a72a78f82216a0305b6c90ab71058')
+        self.assertEqual(md5[six.u('usr/share/locale/zh_TW/LC_MESSAGES/hello.mo')],
                 'a7356e05bd420872d03cd3f5369de42f')
 
 if __name__ == '__main__':
-- 
1.7.8.3
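
Review note on patch 28: a sketch of how the encoding= parameter changes the key type returned by md5sums(), assuming Python 3. parse_md5sums is a hypothetical free function that takes an already-open binary file object; the real method obtains it from the control tarball via get_file().

```python
import io


def parse_md5sums(fobj, encoding=None, errors=None):
    """Hypothetical helper mirroring the md5sums() logic in this patch."""
    if encoding is not None:
        # Text mode: wrap the binary stream so lines come back as str.
        fobj = io.TextIOWrapper(fobj, encoding=encoding, errors=errors)
        newline = '\r\n'
    else:
        # Binary mode: lines stay as bytes.
        newline = b'\r\n'
    sums = {}
    for line in fobj.readlines():
        # split(None, 1) keeps spaces inside file names intact.
        md5, fname = line.rstrip(newline).split(None, 1)
        if isinstance(md5, bytes):
            md5 = md5.decode()  # checksums are always returned as text
        sums[fname] = md5
    return sums


data = b"9c1a72a78f82216a0305b6c90ab71058  usr/bin/hello\n"
binary_keys = parse_md5sums(io.BytesIO(data))
text_keys = parse_md5sums(io.BytesIO(data), encoding='UTF-8')
```

Leaving keys as bytes by default avoids guessing the encoding of file names, at the cost of callers having to write b'usr/bin/hello' as in the updated test.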

>From 3771296e1f9856abe42c70fd16f135b5305a5fb1 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sat, 21 Jan 2012 01:34:43 +0000
Subject: [PATCH 29/31] Fix up most of debian.deb822 for string handling
 changes in Python 3.  There are still a couple of
 difficult cases left.

---
 lib/debian/deb822.py |   30 +++++++++++++++++++-----------
 tests/test_deb822.py |   43 +++++++++++++++++++++++--------------------
 2 files changed, 42 insertions(+), 31 deletions(-)

diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index d183622..cf96c40 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -229,7 +229,7 @@ class Deb822Dict(_mutable_mapping_mixin, object):
             else:
                 raise
 
-        if isinstance(value, str):
+        if isinstance(value, bytes):
             # Always return unicode objects instead of strings
             try:
                 value = value.decode(self.encoding)
@@ -393,7 +393,7 @@ class Deb822(Deb822Dict):
 
         wanted_field = lambda f: fields is None or f in fields
 
-        if isinstance(sequence, basestring):
+        if isinstance(sequence, six.string_types):
             sequence = sequence.splitlines()
 
         curkey = None
@@ -441,6 +441,10 @@ class Deb822(Deb822Dict):
     def __unicode__(self):
         return self.dump()
 
+    if six.PY3:
+        def __bytes__(self):
+            return self.dump().encode(self.encoding)
+
     # __repr__ is handled by Deb822Dict
 
     def get_as_string(self, key):
@@ -450,7 +454,7 @@ class Deb822(Deb822Dict):
         this can be overridden in subclasses (e.g. _multivalued) that can take
         special values.
         """
-        return unicode(self[key])
+        return six.text_type(self[key])
 
     def dump(self, fd=None, encoding=None):
         """Dump the the contents in the original format
@@ -721,9 +725,9 @@ class GpgInfo(dict):
 
         n = cls()
 
-        if isinstance(out, basestring):
+        if isinstance(out, six.string_types):
             out = out.split('\n')
-        if isinstance(err, basestring):
+        if isinstance(err, six.string_types):
             err = err.split('\n')
 
         n.out = out
@@ -776,13 +780,17 @@ class GpgInfo(dict):
             raise IOError("cannot access any of the given keyrings")
 
         p = subprocess.Popen(args, stdin=subprocess.PIPE,
-                             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+                             stdout=subprocess.PIPE, stderr=subprocess.PIPE,
+                             universal_newlines=True)
         # XXX what to do with exit code?
 
-        if isinstance(sequence, basestring):
-            (out, err) = p.communicate(sequence)
+        if isinstance(sequence, six.string_types):
+            inp = sequence
         else:
-            (out, err) = p.communicate(cls._get_full_string(sequence))
+            inp = cls._get_full_string(sequence)
+        if six.PY3:
+            inp = inp.encode('UTF-8')
+        out, err = p.communicate(inp)
 
         return cls.from_output(out, err)
 
@@ -1048,7 +1056,7 @@ class _multivalued(Deb822):
                 field_lengths = {}
             for item in array:
                 for x in order:
-                    raw_value = unicode(item[x])
+                    raw_value = six.text_type(item[x])
                     try:
                         length = field_lengths[keyl][x]
                     except KeyError:
@@ -1084,7 +1092,7 @@ class _gpg_multivalued(_multivalued):
             sequence = kwargs.get("sequence", None)
 
         if sequence is not None:
-            if isinstance(sequence, basestring):
+            if isinstance(sequence, six.string_types):
                 self.raw_text = sequence
             elif hasattr(sequence, "items"):
                 # sequence is actually a dict(-like) object, so we don't have
diff --git a/tests/test_deb822.py b/tests/test_deb822.py
index 98e8a48..b65ff7b 100755
--- a/tests/test_deb822.py
+++ b/tests/test_deb822.py
@@ -25,8 +25,9 @@ import unittest
 import warnings
 try:
     from StringIO import StringIO
+    BytesIO = StringIO
 except ImportError:
-    from io import StringIO
+    from io import BytesIO, StringIO
 
 import six
 
@@ -144,7 +145,7 @@ CcYAoOLYDF5G1h3oR1iDNyeCI6hRW03S
     ]
 
 
-CHANGES_FILE = u'''\
+CHANGES_FILE = six.u('''\
 Format: 1.7
 Date: Fri, 28 Dec 2007 17:08:48 +0100
 Source: bzr-gtk
@@ -169,7 +170,7 @@ Files:
  0fd797f4138a9d4fdeb8c30597d46bc9 1003 python optional bzr-gtk_0.93.0-2.dsc
  d9523676ae75c4ced299689456f252f4 3860 python optional bzr-gtk_0.93.0-2.diff.gz
  8960459940314b21019dedd5519b47a5 168544 python optional bzr-gtk_0.93.0-2_all.deb
-'''
+''')
 
 CHECKSUM_CHANGES_FILE = '''\
 Format: 1.8
@@ -328,7 +329,7 @@ class TestDeb822Dict(unittest.TestCase):
 
     def test_unicode_key_access(self):
         d = self.make_dict()
-        self.assertEqual(1, d[u'testkey'])
+        self.assertEqual(1, d[six.u('testkey')])
 
 
 class TestDeb822(unittest.TestCase):
@@ -445,22 +446,24 @@ class TestDeb822(unittest.TestCase):
         packages_content = "\n".join([line.rstrip() for line in
                                       packages_content.splitlines()] + [''])
 
-        s = StringIO()
+        if six.PY3:
+            packages_content = packages_content.encode("UTF-8")
+        s = BytesIO()
         l = []
         f = open(filename)
         for p in cls.iter_paragraphs(f, **kwargs):
             p.dump(s)
-            s.write("\n")
+            s.write(b"\n")
             l.append(p)
         f.close()
         self.assertEqual(s.getvalue(), packages_content)
         if kwargs["shared_storage"] is False:
             # If shared_storage is False, data should be consistent across
             # iterations -- i.e. we can use "old" objects
-            s = StringIO()
+            s = BytesIO()
             for p in l:
                 p.dump(s)
-                s.write("\n")
+                s.write(b"\n")
             self.assertEqual(s.getvalue(), packages_content)
 
     def test_iter_paragraphs_apt_shared_storage_packages(self):
@@ -703,7 +706,7 @@ Description: python modules to work with Debian-related data formats
                 f, encoding="iso8859-1"))
         for d in objects:
             for value in d.values():
-                self.assertTrue(isinstance(value, unicode))
+                self.assertTrue(isinstance(value, six.text_type))
 
         # The same should be true for Sources and Changes except for their
         # _multivalued fields
@@ -716,7 +719,7 @@ Description: python modules to work with Debian-related data formats
         for d in multi:
             for key, value in d.items():
                 if key.lower() not in d.__class__._multivalued_fields:
-                    self.assertTrue(isinstance(value, unicode))
+                    self.assertTrue(isinstance(value, six.text_type))
 
     def test_encoding_integrity(self):
         with open('test_Sources') as f:
@@ -732,20 +735,20 @@ Description: python modules to work with Debian-related data formats
 
         # XXX: The way multiline fields parsing works, we can't guarantee
         # that trailing whitespace is reproduced.
-        with open('test_Sources') as f:
-            utf8_contents = "\n".join([line.rstrip() for line in f] + [''])
-        with open('test_Sources.iso8859-1') as f:
-            latin1_contents = "\n".join([line.rstrip() for line in f] + [''])
+        with open('test_Sources', 'rb') as f:
+            utf8_contents = b"\n".join([line.rstrip() for line in f] + [b''])
+        with open('test_Sources.iso8859-1', 'rb') as f:
+            latin1_contents = b"\n".join([line.rstrip() for line in f] + [b''])
 
-        utf8_to_latin1 = StringIO()
+        utf8_to_latin1 = BytesIO()
         for d in utf8:
             d.dump(fd=utf8_to_latin1, encoding='iso8859-1')
-            utf8_to_latin1.write("\n")
+            utf8_to_latin1.write(b"\n")
 
-        latin1_to_utf8 = StringIO()
+        latin1_to_utf8 = BytesIO()
         for d in latin1:
             d.dump(fd=latin1_to_utf8, encoding='utf-8')
-            latin1_to_utf8.write("\n")
+            latin1_to_utf8.write(b"\n")
 
         self.assertEqual(utf8_contents, latin1_to_utf8.getvalue())
         self.assertEqual(latin1_contents, utf8_to_latin1.getvalue())
@@ -770,10 +773,10 @@ Description: python modules to work with Debian-related data formats
                                                           use_apt_pkg=False)]:
             p1 = six.advance_iterator(paragraphs)
             self.assertEqual(p1['maintainer'],
-                             u'Adeodato Simó <dato@net.com.org.es>')
+                             six.u('Adeodato Sim\xf3 <dato@net.com.org.es>'))
             p2 = six.advance_iterator(paragraphs)
             self.assertEqual(p2['uploaders'],
-                             u'Frank Küster <frank@debian.org>')
+                             six.u('Frank K\xfcster <frank@debian.org>'))
         f2.close()
         f1.close()
 
-- 
1.7.8.3

>From 9e7622b89cf67235cc2e8888b3e81c8606f1701c Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Sun, 22 Jan 2012 14:13:17 +0000
Subject: [PATCH 30/31] Fix up the rest of debian.deb822 for Python 3 string
 handling.

This gets messy because there are some contexts where either bytes or
Unicode strings might be required:

 * GpgInfo requires byte strings so that it can check signatures without
   being broken by recoding.
 * In order to support mixed-encoding files, we have to ask
   apt_pkg.TagFile to just give us bytes so that we can do our own
   encoding detection without interference (requires a python-apt patch;
   http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=656288).
 * However, file objects given to us might have been opened either in
   binary or text mode, so we might get either bytes or str when reading
   from them.

With this patch, the tests pass with the minimum amount of invasive work
I could manage.
---
 lib/debian/deb822.py |  156 +++++++++++++++++++++++++++++++-------------------
 tests/test_deb822.py |    9 ++-
 2 files changed, 102 insertions(+), 63 deletions(-)
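
Review note: the third point in the commit message (file objects opened in binary or text mode) can be illustrated with the sketch below, assuming Python 3. to_text is a hypothetical helper, not part of this patch, and it omits the chardet fallback that _detect_encoding performs.

```python
import io


def to_text(line, encoding='utf-8'):
    """Hypothetical helper: return str whether the file gave bytes or str."""
    if isinstance(line, bytes):
        return line.decode(encoding)
    return line


# The same content as seen through a binary-mode and a text-mode file object.
binary_file = io.BytesIO('Uploaders: Frank K\xfcster\n'.encode('utf-8'))
text_file = io.StringIO('Uploaders: Frank K\xfcster\n')

from_binary = [to_text(line) for line in binary_file]
from_text = [to_text(line) for line in text_file]
```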

diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py
index cf96c40..34d1f6d 100644
--- a/lib/debian/deb822.py
+++ b/lib/debian/deb822.py
@@ -43,8 +43,9 @@ import warnings
 
 try:
     from StringIO import StringIO
+    BytesIO = StringIO
 except ImportError:
-    from io import StringIO
+    from io import BytesIO, StringIO
 try:
     from collections import Mapping, MutableMapping
     _mapping_mixin = Mapping
@@ -101,11 +102,11 @@ class TagSectionWrapper(_mapping_mixin, object):
 
         # Get just the stuff after the first ':'
         # Could use s.partition if we only supported python >= 2.5
-        data = s[s.find(':')+1:]
+        data = s[s.find(b':')+1:]
 
         # Get rid of spaces and tabs after the ':', but not newlines, and strip
         # off any newline at the end of the data.
-        return data.lstrip(' \t').rstrip('\n')
+        return data.lstrip(b' \t').rstrip(b'\n')
 
 
 class OrderedSet(object):
@@ -204,7 +205,31 @@ class Deb822Dict(_mutable_mapping_mixin, object):
                 self.__keys.extend([ _strI(k) for k in six.iterkeys(self.__parsed) ])
             else:
                 self.__keys.extend([ _strI(f) for f in _fields if f in self.__parsed ])
-        
+
+    def _detect_encoding(self, value):
+        """If value is not already Unicode, decode it intelligently."""
+        if isinstance(value, bytes):
+            try:
+                return value.decode(self.encoding)
+            except UnicodeDecodeError as e:
+                # Evidently, the value wasn't encoded with the encoding the
+                # user specified.  Try detecting it.
+                warnings.warn('decoding from %s failed; attempting to detect '
+                              'the true encoding' % self.encoding,
+                              UnicodeWarning)
+                result = chardet.detect(value)
+                try:
+                    decoded = value.decode(result['encoding'])
+                except UnicodeDecodeError:
+                    raise e
+                # Assume the rest of the paragraph is in this encoding as
+                # well (there's no sense in repeating this exercise for
+                # every field).
+                self.encoding = result['encoding']
+                return decoded
+        else:
+            return value
+
     ### BEGIN _mutable_mapping_mixin methods
 
     def __iter__(self):
@@ -229,28 +254,7 @@ class Deb822Dict(_mutable_mapping_mixin, object):
             else:
                 raise
 
-        if isinstance(value, bytes):
-            # Always return unicode objects instead of strings
-            try:
-                value = value.decode(self.encoding)
-            except UnicodeDecodeError as e:
-                # Evidently, the value wasn't encoded with the encoding the
-                # user specified.  Try detecting it.
-                warnings.warn('decoding from %s failed; attempting to detect '
-                              'the true encoding' % self.encoding,
-                              UnicodeWarning)
-                result = chardet.detect(value)
-                try:
-                    value = value.decode(result['encoding'])
-                except UnicodeDecodeError:
-                    raise e
-                else:
-                    # Assume the rest of the paragraph is in this encoding as
-                    # well (there's no sense in repeating this exercise for
-                    # every field).
-                    self.encoding = result['encoding']
-
-        return value
+        return self._detect_encoding(value)
 
     def __delitem__(self, key):
         key = _strI(key)
@@ -349,7 +353,15 @@ class Deb822(Deb822Dict):
         """
 
         if _have_apt_pkg and use_apt_pkg and _is_real_file(sequence):
-            parser = apt_pkg.TagFile(sequence)
+            kwargs = {}
+            if six.PY3:
+                # bytes=True is supported for both Python 2 and 3, but we
+                # only actually need it for Python 3, so this saves us from
+                # having to require a newer version of python-apt for Python
+                # 2 as well.  This allows us to apply our own encoding
+                # handling, which is more tolerant of mixed-encoding files.
+                kwargs['bytes'] = True
+            parser = apt_pkg.TagFile(sequence, **kwargs)
             for section in parser:
                 paragraph = cls(fields=fields,
                                 _parsed=TagSectionWrapper(section),
@@ -376,11 +388,24 @@ class Deb822(Deb822Dict):
         """
         at_beginning = True
         for line in sequence:
-            if line.startswith('#'):
-                continue
-            if at_beginning:
-                if not line.rstrip('\r\n'):
+            # The bytes/str polymorphism required here to support Python 3
+            # is unpleasant, but fortunately limited.  We need this because
+            # at this point we might have been given either bytes or
+            # Unicode, and we haven't yet got to the point where we can try
+            # to decode a whole paragraph and detect its encoding.
+            if isinstance(line, bytes):
+                if line.startswith(b'#'):
                     continue
+            else:
+                if line.startswith('#'):
+                    continue
+            if at_beginning:
+                if isinstance(line, bytes):
+                    if not line.rstrip(b'\r\n'):
+                        continue
+                else:
+                    if not line.rstrip('\r\n'):
+                        continue
                 at_beginning = False
             yield line
 
@@ -393,7 +418,7 @@ class Deb822(Deb822Dict):
 
         wanted_field = lambda f: fields is None or f in fields
 
-        if isinstance(sequence, six.string_types):
+        if isinstance(sequence, (six.string_types, bytes)):
             sequence = sequence.splitlines()
 
         curkey = None
@@ -401,6 +426,8 @@ class Deb822(Deb822Dict):
 
         for line in self.gpg_stripped_paragraph(
                 self._skip_useless_lines(sequence)):
+            line = self._detect_encoding(line)
+
             m = single.match(line)
             if m:
                 if curkey:
@@ -590,13 +617,20 @@ class Deb822(Deb822Dict):
         gpg_pre_lines = []
         lines = []
         gpg_post_lines = []
-        state = 'SAFE'
-        gpgre = re.compile(r'^-----(?P<action>BEGIN|END) PGP (?P<what>[^-]+)-----$')
-        blank_line = re.compile('^$')
+        state = b'SAFE'
+        gpgre = re.compile(br'^-----(?P<action>BEGIN|END) PGP (?P<what>[^-]+)-----$')
+        blank_line = re.compile(b'^$')
         first_line = True
 
         for line in sequence:
-            line = line.strip('\r\n')
+            # Some consumers of this method require bytes (encoding
+            # detection and signature checking).  However, we might have
+            # been given a file opened in text mode, in which case it's
+            # simplest to encode to bytes.
+            if six.PY3 and isinstance(line, str):
+                line = line.encode()
+
+            line = line.strip(b'\r\n')
 
             # skip initial blank lines, if any
             if first_line:
@@ -608,7 +642,7 @@ class Deb822(Deb822Dict):
             m = gpgre.match(line)
 
             if not m:
-                if state == 'SAFE':
+                if state == b'SAFE':
                     if not blank_line.match(line):
                         lines.append(line)
                     else:
@@ -616,17 +650,17 @@ class Deb822(Deb822Dict):
                             # There's no gpg signature, so we should stop at
                             # this blank line
                             break
-                elif state == 'SIGNED MESSAGE':
+                elif state == b'SIGNED MESSAGE':
                     if blank_line.match(line):
-                        state = 'SAFE'
+                        state = b'SAFE'
                     else:
                         gpg_pre_lines.append(line)
-                elif state == 'SIGNATURE':
+                elif state == b'SIGNATURE':
                     gpg_post_lines.append(line)
             else:
-                if m.group('action') == 'BEGIN':
+                if m.group('action') == b'BEGIN':
                     state = m.group('what')
-                elif m.group('action') == 'END':
+                elif m.group('action') == b'END':
                     gpg_post_lines.append(line)
                     break
                 if not blank_line.match(line):
@@ -757,7 +791,7 @@ class GpgInfo(dict):
     def from_sequence(cls, sequence, keyrings=None, executable=None):
         """Create a new GpgInfo object from the given sequence.
 
-        :param sequence: sequence of lines or a string
+        :param sequence: sequence of lines of bytes or a single byte string
 
         :param keyrings: list of keyrings to use (default:
             ['/usr/share/keyrings/debian-keyring.gpg'])
@@ -784,19 +818,17 @@ class GpgInfo(dict):
                              universal_newlines=True)
         # XXX what to do with exit code?
 
-        if isinstance(sequence, six.string_types):
+        if isinstance(sequence, bytes):
             inp = sequence
         else:
             inp = cls._get_full_string(sequence)
-        if six.PY3:
-            inp = inp.encode('UTF-8')
         out, err = p.communicate(inp)
 
         return cls.from_output(out, err)
 
     @staticmethod
     def _get_full_string(sequence):
-        """Return a string from a sequence of lines.
+        """Return a byte string from a sequence of lines of bytes.
 
         This method detects if the sequence's lines are newline-terminated, and
         constructs the string appropriately.
@@ -806,10 +838,10 @@ class GpgInfo(dict):
         try:
             first_line = six.advance_iterator(sequence_iter)
         except StopIteration:
-            return ""
-        join_str = '\n'
-        if first_line.endswith('\n'):
-            join_str = ''
+            return b""
+        join_str = b'\n'
+        if first_line.endswith(b'\n'):
+            join_str = b''
         return first_line + join_str + join_str.join(sequence_iter)
 
     @classmethod
@@ -818,7 +850,7 @@ class GpgInfo(dict):
 
         See GpgInfo.from_sequence.
         """
-        with open(target) as target_file:
+        with open(target, 'rb') as target_file:
             return cls.from_sequence(target_file, *args, **kwargs)
 
 
@@ -1092,8 +1124,14 @@ class _gpg_multivalued(_multivalued):
             sequence = kwargs.get("sequence", None)
 
         if sequence is not None:
-            if isinstance(sequence, six.string_types):
+            if isinstance(sequence, bytes):
                 self.raw_text = sequence
+            elif isinstance(sequence, six.string_types):
+                # If the file is really in some other encoding, then this
+                # probably won't verify correctly, but this is the best we
+                # can reasonably manage.  For accurate verification, the
+                # file should be opened in binary mode.
+                self.raw_text = sequence.encode('utf-8')
             elif hasattr(sequence, "items"):
                 # sequence is actually a dict(-like) object, so we don't have
                 # the raw text.
@@ -1106,12 +1144,12 @@ class _gpg_multivalued(_multivalued):
                     # Empty input
                     gpg_pre_lines = lines = gpg_post_lines = []
                 if gpg_pre_lines and gpg_post_lines:
-                    raw_text = StringIO()
-                    raw_text.write("\n".join(gpg_pre_lines))
-                    raw_text.write("\n\n")
-                    raw_text.write("\n".join(lines))
-                    raw_text.write("\n\n")
-                    raw_text.write("\n".join(gpg_post_lines))
+                    raw_text = BytesIO()
+                    raw_text.write(b"\n".join(gpg_pre_lines))
+                    raw_text.write(b"\n\n")
+                    raw_text.write(b"\n".join(lines))
+                    raw_text.write(b"\n\n")
+                    raw_text.write(b"\n".join(gpg_post_lines))
                     self.raw_text = raw_text.getvalue()
                 try:
                     args = list(args)
diff --git a/tests/test_deb822.py b/tests/test_deb822.py
index b65ff7b..258228f 100755
--- a/tests/test_deb822.py
+++ b/tests/test_deb822.py
@@ -701,7 +701,7 @@ Description: python modules to work with Debian-related data formats
             objects.extend(deb822.Packages.iter_paragraphs(f))
         with open('test_Sources') as f:
             objects.extend(deb822.Deb822.iter_paragraphs(f))
-        with open('test_Sources.iso8859-1') as f:
+        with open('test_Sources.iso8859-1', 'rb') as f:
             objects.extend(deb822.Deb822.iter_paragraphs(
                 f, encoding="iso8859-1"))
         for d in objects:
@@ -724,7 +724,7 @@ Description: python modules to work with Debian-related data formats
     def test_encoding_integrity(self):
         with open('test_Sources') as f:
             utf8 = list(deb822.Deb822.iter_paragraphs(f))
-        with open('test_Sources.iso8859-1') as f:
+        with open('test_Sources.iso8859-1', 'rb') as f:
             latin1 = list(deb822.Deb822.iter_paragraphs(
                 f, encoding='iso8859-1'))
 
@@ -966,6 +966,7 @@ class TestGpgInfo(unittest.TestCase):
             os.path.exists('/usr/share/keyrings/debian-keyring.gpg'))
 
         self.data = SIGNED_CHECKSUM_CHANGES_FILE % CHECKSUM_CHANGES_FILE
+        self.data = self.data.encode()
         self.valid = {
             'GOODSIG':
                 ['D14219877A786561', 'John Wright <john.wright@hp.com>'],
@@ -998,7 +999,7 @@ class TestGpgInfo(unittest.TestCase):
         if not self.should_run:
             return
 
-        sequence = StringIO(self.data)
+        sequence = BytesIO(self.data)
         gpg_info = deb822.GpgInfo.from_sequence(sequence)
         self._validate_gpg_info(gpg_info)
 
@@ -1015,7 +1016,7 @@ class TestGpgInfo(unittest.TestCase):
             return
 
         fd, filename = tempfile.mkstemp()
-        fp = os.fdopen(fd, 'w')
+        fp = os.fdopen(fd, 'wb')
         fp.write(self.data)
         fp.close()
 
-- 
1.7.8.3

From 751a857bd58c5e51db2ed985902cce4598afd4a3 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@canonical.com>
Date: Fri, 20 Jan 2012 17:43:31 +0000
Subject: [PATCH 31/31] Add a python3-debian package.

---
 debian/control |   19 ++++++++++++++++++-
 debian/rules   |    8 ++++++++
 2 files changed, 26 insertions(+), 1 deletions(-)

diff --git a/debian/control b/debian/control
index 072e3cc..c831038 100644
--- a/debian/control
+++ b/debian/control
@@ -8,7 +8,7 @@ Uploaders: Adeodato Simó <dato@net.com.org.es>,
  Reinhard Tartler <siretart@tauware.de>,
  Stefano Zacchiroli <zack@debian.org>,
  John Wright <jsw@debian.org>
-Build-Depends: debhelper (>= 5.0.37.2), python (>= 2.6.6-3~), python-setuptools, python-chardet, python-six
+Build-Depends: debhelper (>= 5.0.37.2), python (>= 2.6.6-3~), python3 (>= 3.1.2-8~), python-setuptools, python3-setuptools, python-chardet, python3-chardet, python-six, python3-six
 Standards-Version: 3.8.4
 Vcs-Browser: http://git.debian.org/?p=pkg-python-debian/python-debian.git
 Vcs-Git: git://git.debian.org/git/pkg-python-debian/python-debian.git
@@ -34,3 +34,20 @@ Description: Python modules to work with Debian-related data formats
   * Raw .deb and .ar files, with (read-only) access to contained
     files and meta-information
 
+Package: python3-debian
+Architecture: all
+Depends: ${python3:Depends}, ${misc:Depends}, python3-chardet, python3-six
+Recommends: python3-apt
+Suggests: gpgv
+Description: Python 3 modules to work with Debian-related data formats
+ This package provides Python 3 modules that abstract many formats of Debian
+ related files. Currently handled are:
+  * Debtags information (debian.debtags module)
+  * debian/changelog (debian.changelog module)
+  * Packages files, pdiffs (debian.debian_support module)
+  * Control files of single or multiple RFC822-style paragraphs, e.g.
+    debian/control, .changes, .dsc, Packages, Sources, Release, etc.
+    (debian.deb822 module)
+  * Raw .deb and .ar files, with (read-only) access to contained
+    files and meta-information
+
diff --git a/debian/rules b/debian/rules
index 6a11aeb..8769389 100755
--- a/debian/rules
+++ b/debian/rules
@@ -15,6 +15,7 @@ build-stamp: setup.py
 
 	# Add here commands to compile the package.
 	python setup.py build
+	python3 setup.py build
 	
 	# run the tests
 	cd tests && ./test_deb822.py
@@ -22,6 +23,11 @@ build-stamp: setup.py
 	cd tests && ./test_debtags.py
 	cd tests && ./test_changelog.py
 	cd tests && ./test_debian_support.py
+	cd tests && python3 ./test_deb822.py
+	cd tests && python3 ./test_debfile.py
+	cd tests && python3 ./test_debtags.py
+	cd tests && python3 ./test_changelog.py
+	cd tests && python3 ./test_debian_support.py
 
 	lib/debian/doc-debtags > README.debtags
 
@@ -34,6 +40,7 @@ clean: setup.py
 
 	# Add here commands to clean up after the build process.
 	python setup.py clean
+	python3 setup.py clean
 	rm -rf lib/python_debian.egg-info
 	rm -rf build/
 	rm -f README.debtags
@@ -50,6 +57,7 @@ install: build
 
 	# Add here commands to install the package into debian/tmp
 	python setup.py install --root="$(CURDIR)/debian/python-debian" --no-compile --install-layout=deb
+	python3 setup.py install --root="$(CURDIR)/debian/python3-debian" --no-compile --install-layout=deb
 
 
 # Build architecture-independent files here.
-- 
1.7.8.3
