
Bug#656288: python3-apt: difficulties with non-UTF-8-encoded TagFiles



On Wed, Jan 18, 2012 at 12:56:03AM +0000, Colin Watson wrote:
> python-debian's test suite also tests that it's possible to parse old
> Sources files in *mixed* encodings.  This is going to be harder because
> it basically means having apt_pkg.TagSection return bytes, which I don't
> think is desirable in general.  Maybe this could be optional somehow?

Thinking about it, this seems a reasonable thing to make switchable in
TagFile's constructor.  After all:

  >>> with open("test", encoding="iso-8859-1") as test:
  ...     print(test.read().__class__)
  ...
  <class 'str'>
  >>> with open("test", mode="rb") as test:
  ...     print(test.read().__class__)
  ...
  <class 'bytes'>

So there's clear precedent in the language for the same method returning
str or bytes depending on how the class was constructed.  Maybe a bytes=
keyword argument?
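
As a rough sketch of how such a switch might behave (SketchTagFile is a
hypothetical stand-in, not the real apt_pkg.TagFile, and its encoding=
parameter and fields() method are assumptions made up for illustration):

```python
import io


class SketchTagFile:
    """Hypothetical stand-in for a tag file reader; not the apt_pkg API."""

    def __init__(self, fileobj, bytes=False, encoding="utf-8"):
        # fileobj is assumed to be a binary stream, so raw octets stay available.
        self._fileobj = fileobj
        self._bytes = bytes
        self._encoding = encoding

    def fields(self):
        """Return the first stanza as a dict of field name -> value."""
        result = {}
        for raw in self._fileobj:
            raw = raw.rstrip(b"\n")
            if not raw:
                break  # a blank line ends the stanza
            key, _, value = raw.partition(b":")
            value = value.strip()
            if not self._bytes:
                # Text mode: decode with the encoding chosen at construction.
                value = value.decode(self._encoding)
            # Field names are assumed to be plain ASCII.
            result[key.decode("ascii")] = value
        return result


# One stanza with a Latin-1 encoded name, which is not valid UTF-8.
data = b"Package: foo\nMaintainer: Jos\xe9 <jose@example.org>\n\n"

as_bytes = SketchTagFile(io.BytesIO(data), bytes=True).fields()
as_text = SketchTagFile(io.BytesIO(data), encoding="iso-8859-1").fields()
```

With bytes=True the caller gets the raw octets back and can decide on a
per-field encoding itself, which is what the mixed-encoding case needs;
the default keeps returning str.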

-- 
Colin Watson                                       [cjwatson@debian.org]


