Hello,

On 15/05/2020 21:10, Andreas Tille wrote:
> Would you mind providing a patch with chardet?

There is a patch attached to this e-mail. I used [1] as the base file.

I don't think the patch is great (because it calls 'open()' twice), but it
makes minimal modifications to the current source code. I think that is the
better approach for a successful migration to python3, because it avoids
introducing bugs during the migration.

Feel free to ask for more explanations or anything else you need.

1: https://salsa.debian.org/qa/udd/-/blob/master/udd/ddtp_gatherer.py

-- 
Stéphane

--- ddtp_gatherer.py.orig	2020-05-17 22:54:21.793075000 +0200
+++ ddtp_gatherer.py	2020-05-18 13:02:47.210764004 +0200
@@ -25,6 +25,8 @@
 import logging
 import logging.handlers
 
+import chardet
+
 debug=0
 
 def get_gatherer(connection, config, source):
@@ -117,7 +119,7 @@
           trfile = trfilepath + file
           # check whether hash recorded in index file fits real file
           try:
-            f = open(trfile)
+            f = _open_file(trfile)
           except IOError, err:
             self.log.error("%s: %s.", str(err), trfile)
             continue
@@ -236,6 +238,13 @@
     except IOError, err:
       self.log.exception("Error reading %s%s", dir, filename)
 
+def _open_file(path):
+    with open(path, 'rb') as f:
+        raw_content = f.read()
+    encoding = chardet.detect(raw_content)["encoding"]
+    return open(path, encoding=encoding)
+
+
 if __name__ == '__main__':
   main()
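
For reference, the double 'open()' mentioned in the message could probably be avoided by decoding the bytes that were already read for detection. The following is only a sketch under assumptions, not part of the attached patch: the name `open_detected`, the `detect` parameter (there so the function can be tried without chardet installed) and the `fallback` encoding are all hypothetical, and it assumes Python 3. The only chardet call used is `chardet.detect()`, as in the patch.

```python
import io


def open_detected(path, detect=None, fallback="utf-8"):
    """Read `path` once as bytes, guess its encoding, return a text stream.

    Hypothetical helper: `detect` defaults to chardet.detect, and
    `fallback` is used when no encoding could be guessed.
    """
    if detect is None:
        import chardet  # third-party, assumed installed as in the patch
        detect = chardet.detect

    with open(path, "rb") as f:
        raw_content = f.read()

    # chardet.detect() returns a dict; its "encoding" entry can be None
    # (for example on empty input), hence the fallback.
    encoding = detect(raw_content).get("encoding") or fallback

    # Wrap the bytes already in memory instead of reopening the file.
    # TextIOWrapper applies the same newline translation as text-mode open().
    return io.TextIOWrapper(io.BytesIO(raw_content), encoding=encoding)
```

The returned object can be iterated line by line like the result of 'open()', but it holds the whole file in memory, so it is a trade-off rather than a strict improvement for large translation files.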