On Mon, May 18, 2020 at 08:35:33PM +0200, Stéphane Blondon wrote:
>
> Can you send me the file 'gatherer.${I_dont_know_the_command}' which
> raises the UnicodeDecodeError exception? I will try to write a working
> patch.

I simply added a debug line:

udd(python3) $ git diff
diff --git a/udd/ddtp_gatherer.py b/udd/ddtp_gatherer.py
index bbf041b..d32b85f 100644
--- a/udd/ddtp_gatherer.py
+++ b/udd/ddtp_gatherer.py
@@ -239,6 +239,7 @@ class ddtp_gatherer(gatherer):
             self.log.exception("Error reading %s%s", dir, filename)
 
 def _open_file(path):
+    print(path)
     with open(path, 'rb') as f:
         raw_content = f.read()
     encoding = chardet.detect(raw_content)["encoding"]

which leads to

udd(python3) $ ./update-and-run.sh ddtp
/srv/mirrors/debian/dists/squeeze-proposed-updates/main/i18n/Translation-en.bz2
/srv/mirrors/debian/dists/squeeze-proposed-updates/non-free/i18n/Translation-en.bz2
/srv/mirrors/debian/dists/squeeze-proposed-updates/contrib/i18n/Translation-en.bz2
/srv/mirrors/debian/dists/stretch-proposed-updates/main/i18n/Translation-en.bz2
Traceback (most recent call last):
  File "/srv/udd.debian.org/udd//udd.py", line 88, in <module>
    exec("gatherer.%s()" % command)
  File "<string>", line 1, in <module>
  File "/srv/udd.debian.org/udd/udd/ddtp_gatherer.py", line 127, in run
    h.update(f.read())
  File "/usr/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc5 in position 11: invalid continuation byte

While you can download the files from any Debian mirror I've attached
/srv/mirrors/debian/dists/stretch-proposed-updates/main/i18n/Translation-en.bz2
to this mail. My guess is that translations from stretch will not be
touched any more and thus we need to cope somehow with the existing
encoding.

Thanks a lot for your help

      Andreas.

-- 
http://fam-tille.de
Attachment:
Translation-en.bz2
Description: Binary data
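
A note on the traceback above: it ends in `h.update(f.read())` at line 127 of ddtp_gatherer.py, with the failure raised from codecs.py, which suggests the read happens on a file opened in text mode. The following is only a minimal sketch of that situation, not the actual UDD code. It assumes `h` is a hashlib object (sha256 is chosen here arbitrarily), and the names `sample`, `text_mode_read_fails` and `digest_binary` are hypothetical. The sample bytes are invented to mimic a bz2-style header with a 0xc5 byte at position 11; they are not taken from the attached file.

```python
import hashlib

# Hypothetical sample data: ten header bytes, one filler byte, then 0xc5 at
# position 11 followed by a byte that is not a UTF-8 continuation byte.
sample = b'BZh91AY&SY\x00\xc5\x28\x00'


def text_mode_read_fails(path):
    """Return True if reading the file as UTF-8 text raises UnicodeDecodeError."""
    try:
        with open(path, 'r', encoding='utf-8') as f:
            f.read()
    except UnicodeDecodeError:
        return True
    return False


def digest_binary(path):
    """Hash the raw bytes of a file; no text decoding is involved."""
    h = hashlib.sha256()  # assumption: the real code may use another algorithm
    with open(path, 'rb') as f:
        h.update(f.read())
    return h.hexdigest()
```

If the assumption holds, opening the compressed file with 'rb' before hashing would sidestep the decode step entirely, independent of how the decompressed translation text is later decoded.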