Re: charsets in debian/control
Josselin Mouette <joss@debian.org> writes:
> On Sunday, 5 December 2004 at 11:43 +0100, Andreas Barth wrote:
>> I think most of us agree that non-UTF-8 characters are not a good idea
>> (please note that UTF-8 is a superset of ASCII). For some
>> places (like package names), I think most of us even agree that only
>> ASCII characters should be used. Also, there is the proposal that in
>> other fields (e.g. names), a translation should (also) be used if the
>> characters are not in some basic classes (more or less: ASCII plus
>> ASCII-similar letters).
>>
>> So, I personally consider non-UTF-8 characters a bug, and
>> non-ASCII UTF-8 characters on the way from bug to allowed.
>
> Many of us have names that can't be written using ASCII. Furthermore,
> the Debian tools need consistency between the developer name in the
> changelog and the Maintainer/Uploaders fields in the control file. The
> only way for these developers to have a policy-compliant changelog
> without having their uploads considered as NMUs is to encode the control
> file in UTF-8.
> --
> .''`. Josselin Mouette /\./\
> : :' : josselin.mouette@ens-lyon.org
> `. `' joss@debian.org
> `- Debian GNU/Linux -- The power of freedom
Which means all programs that parse control, changelog, changes,
Packages and Sources files have to be truly converted to UTF-8.
dpkg, apt, aptitude, dselect, apt-proxy, apt-cacher(?), debmirror,
debpartial-mirror, DAK, cdebootstrap, ... I guess most of them only
work by luck with the mixture of encodings we have now.
We already had cdebootstrap crashes because of it (its parser was a
bit stricter than the rest).
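To illustrate what "stricter" can mean in practice, here is a minimal Python sketch. It is not a reconstruction of the cdebootstrap bug; the Maintainer line and the two parse_field_* helpers are made up for the example. It only shows that a strict UTF-8 decode rejects a Latin-1 encoded name, while a lenient decode carries on with a replacement character.

```python
# Hypothetical Maintainer line, encoded as Latin-1 instead of UTF-8.
raw = "Maintainer: Jos\u00e9 Example <jose@example.org>\n".encode("latin-1")


def parse_field_strict(line: bytes):
    """Split 'Name: value', refusing anything that is not valid UTF-8."""
    name, _, value = line.decode("utf-8").partition(":")
    return name, value.strip()


def parse_field_lenient(line: bytes):
    """Same split, but invalid bytes become U+FFFD instead of an error."""
    name, _, value = line.decode("utf-8", errors="replace").partition(":")
    return name, value.strip()


try:
    parse_field_strict(raw)
    strict_ok = True
except UnicodeDecodeError:
    # The lone 0xE9 byte (Latin-1 'é') is not a valid UTF-8 sequence here.
    strict_ok = False

lenient = parse_field_lenient(raw)
print(strict_ok)
print(lenient)
```

Which of the two behaviours a tool should have is a policy question; the sketch only shows why the same file can pass one parser and break another.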
On that note, how likely is it to hit a multi-byte UTF-8 sequence that
contains a '\n' byte? Any parser that is not UTF-8 aware would assume a
new line has started and get parse errors.
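As far as I can tell from the UTF-8 definition, this cannot happen: every byte of a multi-byte sequence has the high bit set, so 0x0A only ever appears as the newline itself. The following brute-force check in Python is a sketch to confirm that; the function name is made up for the example, and UTF-16 is included only as a contrast, since nobody proposed it here.

```python
def codepoints_with_newline_byte(encoding="utf-8"):
    """Return code points other than U+000A whose encoding contains byte 0x0A."""
    hits = []
    for cp in range(0x110000):
        # Skip the newline itself and the surrogate range, which cannot
        # be encoded on its own.
        if cp == 0x0A or 0xD800 <= cp <= 0xDFFF:
            continue
        if b"\n" in chr(cp).encode(encoding):
            hits.append(cp)
    return hits


utf8_hits = codepoints_with_newline_byte("utf-8")
utf16_hits = codepoints_with_newline_byte("utf-16-le")

print(len(utf8_hits))       # code points that would confuse a byte-wise parser
print(len(utf16_hits) > 0)  # UTF-16 does have such code points, e.g. U+010A
```

So a parser that splits on '\n' bytes before looking at the encoding should still find the right line boundaries in UTF-8 input. That says nothing about what it does with the non-ASCII bytes inside a field afterwards.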
Regards,
Goswin