
Re: Started porting UDD to Python3 (Was: [UDD] Is there some effort to port UDD to Python3?)



Hi Stéphane,

Thanks for your patch, which I applied in the python3 branch.  Unfortunately
it does not solve the issue:


udd(python3) $ ./update-and-run.sh ddtp
Traceback (most recent call last):
  File "/srv/udd.debian.org/udd//udd.py", line 88, in <module>
    exec("gatherer.%s()" % command)
  File "<string>", line 1, in <module>
  File "/srv/udd.debian.org/udd/udd/ddtp_gatherer.py", line 127, in run
    h.update(f.read())
  File "/usr/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc5 in position 11: invalid continuation byte
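
Reading the traceback, the failure happens while f.read() decodes the file
as UTF-8, before h.update() ever sees the data. Since a hash only needs the
raw bytes, one possible workaround is to read the file in binary mode for
the hash check. The following is a minimal sketch, not tested against UDD:
file_digest() is a hypothetical helper, and the default algorithm is an
assumption (I have not checked which hash the index file records).

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file's raw bytes, with no text decoding.

    Hypothetical helper: the name and the default algorithm are assumptions,
    not taken from ddtp_gatherer.py.
    """
    h = hashlib.new(algorithm)
    # 'rb' yields bytes, so no codec is involved and no UnicodeDecodeError
    # can be raised while reading.
    with open(path, "rb") as f:
        # Read in chunks so large translation files are not held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

With this approach the encoding guess would only matter later, when the
file content is actually parsed as text.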


Thanks a lot anyway

      Andreas.

On Mon, May 18, 2020 at 01:15:11PM +0200, Stéphane Blondon wrote:
> Hello,
> 
> On 15/05/2020 21:10, Andreas Tille wrote:
> > Would you mind providing a patch with chardet?
> 
> There is a patch attached to this e-mail.
> 
> I used [1] as the base file. I don't think the patch is great (because
> there are two 'open()' calls), but it makes minimal modifications to the
> current source code. I think that is the better approach for a successful
> migration to Python 3 (because it avoids introducing bugs during the
> migration).
> 
> 
> Feel free to ask for more explanations or other stuff if you need.
> 
> 1: https://salsa.debian.org/qa/udd/-/blob/master/udd/ddtp_gatherer.py
> 
> -- 
> Stéphane

> --- ddtp_gatherer.py.orig	2020-05-17 22:54:21.793075000 +0200
> +++ ddtp_gatherer.py	2020-05-18 13:02:47.210764004 +0200
> @@ -25,6 +25,8 @@
>  import logging
>  import logging.handlers
>  
> +import chardet
> +
>  debug=0
>  
>  def get_gatherer(connection, config, source):
> @@ -117,7 +119,7 @@
>            trfile = trfilepath + file
>            # check whether hash recorded in index file fits real file
>            try:
> -            f = open(trfile)
> +            f = _open_file(trfile)
>            except IOError, err:
>              self.log.error("%s: %s.", str(err), trfile)
>              continue
> @@ -236,6 +238,13 @@
>          except IOError, err:
>            self.log.exception("Error reading %s%s", dir, filename)
>  
> +def _open_file(path):
> +    with open(path, 'rb') as f:
> +        raw_content = f.read()
> +        encoding = chardet.detect(raw_content)["encoding"]
> +    return open(path, encoding=encoding)
> +
> +
>  if __name__ == '__main__':
>    main()
>  
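
The two open() calls mentioned above could be avoided by reading the bytes
once and decoding them in memory. The sketch below is only an illustration
under stated assumptions: open_text_once(), its detect hook and the fallback
parameter are hypothetical names, not part of UDD. It also guards against
chardet.detect() reporting None as the encoding, in which case a plain
open(path, encoding=None) would fall back to the locale default.

```python
import io


def open_text_once(path, detect=None, fallback="utf-8"):
    """Read the file once as bytes, then decode it in memory.

    `detect` is a hypothetical hook: a callable that takes bytes and returns
    an encoding name or None. By default it wraps chardet.detect().
    """
    if detect is None:
        import chardet  # third-party; only needed for the default detector

        def detect(raw):
            return chardet.detect(raw)["encoding"]

    with open(path, "rb") as f:
        raw = f.read()
    # The detector may return None when it cannot make a guess.
    encoding = detect(raw) or fallback
    # errors="replace" keeps the caller running on undecodable bytes, at the
    # price of U+FFFD replacement characters in the resulting text.
    return io.StringIO(raw.decode(encoding, errors="replace"))
```

The returned object supports read() and line iteration like a text file, so
callers written for the result of open() should mostly work unchanged.
Whether silently replacing undecodable bytes is acceptable for the
translation data is a separate decision.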





-- 
http://fam-tille.de

