We seem to be moving to a de facto standard of UTF-8 for non-ASCII
characters in debian/control files. This is not specified in Policy [1],
but for hopefully obvious reasons, consistency is a Good Thing, and UTF-8
seems to be the best solution for this sort of thing.

In my sid control files, I see 841 lines with non-ASCII characters,
mostly (761 lines) in Maintainer and Uploaders fields:

    perl -ne 'print if m/[\x80-\xff]/' /var/lib/apt/lists/* | wc -l

Of these, 747 lines are UTF-8 and 94 lines are not.[2]

I hate to suggest a mass bug filing (33 source packages), since it's a
mere de facto standard. And I'm certainly not in the mood to campaign for
a Policy amendment. But it would be a Good Thing to aim for consistency
here. Current UI tools (dpkg, dselect, apt-cache, aptitude) seem to know
nothing about character sets, and just pass characters verbatim to the
terminal, but one can easily imagine a tool that would convert to a
user's local character set when possible.

I suggest that the affected source packages[3] be run through the command
'iconv -f ORIGINAL_CHARSET -t utf-8' as soon as convenient.

Would people support a mass bug at minor severity?

Peter

[1] Note that UTF-8 *is* recommended for debian/changelog.
    http://www.debian.org/doc/debian-policy/ap-pkg-sourcepkg.html#s-pkg-dpkgchangelog

[2] It is easy to tell if text is UTF-8 or not; I use the exit status of
    "iconv -f utf-8 -t utf-8". This gives very few false positives,
    because UTF-8 has a very strict format.

[3]
abcm2ps                     freecraft                   maint-guide
ap-utils                    gl-117                      movixmaker-2
appunti-informatica-libera  glade-perl                  mozilla-locale-hu
ayuda                       gnustep-icons               myspell-sv
boa                         gridlock                    ntfsdoc
boa-constructor             gtkdiskfree                 pdftohtml
bombermaze                  gtodo                       pdp
bonsai                      iris                        pyca
cadubi                      itcl3                       pyro
cantus                      kernel-patch-2.4.26-s390    pythoncad
coq-doc                     kernel-patch-2.4.27-s390    rat
crafted                     krb4                        strategoxt
darkstat                    lg-issue46                  sympa
ddclient                    libcgi-validate-perl        syslog-ng
doc-linux-html-pt           libconfig-general-perl      tuxeyes
doc-linux-text-pt           libexporter-lite-perl       unac
drpython                    libtext-unaccent-perl       wmblob
elmo                        libuniversal-exports-perl   wmnetmon
fcmp                        linux-ntfs                  wordtrans
fortunes-fr                 linux-tutorial-es           wprint
fortunes-it
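
For anyone who would rather script the check from footnote [2] than loop
over iconv, here is a rough Python sketch of the same idea: a strict
UTF-8 decode either succeeds or raises, much like iconv's exit status.
The function names (is_utf8, classify_lines, to_utf8) are made up for
illustration, and the original charset still has to be supplied by hand,
as with 'iconv -f ORIGINAL_CHARSET'; nothing here guesses it.

```python
def is_utf8(raw: bytes) -> bool:
    """Return True if raw is valid strict UTF-8 (analogous to the exit
    status of 'iconv -f utf-8 -t utf-8')."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def classify_lines(data: bytes):
    """Split the non-ASCII lines of data into (utf8_lines, other_lines).

    Pure-ASCII lines are skipped, mirroring the perl filter
    m/[\\x80-\\xff]/ used to find candidate lines.
    """
    utf8_lines, other_lines = [], []
    for line in data.split(b"\n"):
        if not any(byte >= 0x80 for byte in line):
            continue  # ASCII only, nothing to check
        if is_utf8(line):
            utf8_lines.append(line)
        else:
            other_lines.append(line)
    return utf8_lines, other_lines


def to_utf8(raw: bytes, original_charset: str) -> bytes:
    """Re-encode raw from a known charset to UTF-8. The caller must know
    the original charset; it is an assumption, not detected."""
    return raw.decode(original_charset).encode("utf-8")


if __name__ == "__main__":
    # Hypothetical sample: the same name in UTF-8 and in Latin-1.
    sample = (
        b"Maintainer: Jos\xc3\xa9 <a@example.org>\n"
        b"Maintainer: Jos\xe9 <b@example.org>\n"
        b"Package: foo\n"
    )
    good, bad = classify_lines(sample)
    print(len(good), "UTF-8 line(s),", len(bad), "non-UTF-8 line(s)")
    for line in bad:
        # Assuming Latin-1 here purely for the sake of the example.
        print(to_utf8(line, "latin-1").decode("utf-8"))
```

As with the iconv test, a line in a legacy charset can occasionally
happen to be valid UTF-8, so a clean result is strong evidence rather
than proof.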