Re: charsets in debian/control

On Sunday 05 December 2004 03:32 pm, Jose Carlos Garcia Sogo wrote:
> > Would Peter permit me a mild dissent?  I prefer Latin-1.  Reason: I can
> > recognize and distinguish Latin-1 characters, even when I do not always
> > understand the words they spell.  Recognizing and distinguishing the
> > characters is important to me.  And not just to me.  Imagine the dismay
> > of a Korean user trying to read Arabic script in a control file.
>  But the only field in UTF8 should be Maintainer, and that field should
> have (IMHO) also a roman transliterate for the name, if you don't use a
> latin charset (Greek, Arabic, Japanese, Chinese...)

  Well, when aptitude gets UTF-8 support, it will decode every control field 
that is mainly meant for human consumption: at least Description in addition 
to Maintainer, and maybe Section as well.

  I don't see any reason to limit ourselves in the long term by sticking to 
Latin-1 (or ASCII) just because none of us can read every language that 
UTF-8 can encode.  If we want people to stick to certain subsets of UTF-8, 
that should be decided in Policy, not enforced by the packaging software.

  If you want a practical concern (aside from, say, a general suspicion of 
building policy into software tools), consider these cases:

  -> Someone wants to translate the Description fields of all packages in 
Debian into Chinese or Arabic.  What will they do if the package tools only 
support Latin-1?

  -> Someone wants to use the Debian packaging tools to create a new 
distribution for use in China.  Again, what will they do if the package tools 
only support Latin-1?
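
  To make both cases concrete, here is a minimal Python sketch.  The sample 
Description text and the can_encode helper are made up for illustration and 
are not part of any Debian tool.  It only shows that Chinese text has no 
Latin-1 representation at all, while UTF-8 handles it:

```python
# Hypothetical Description text, not taken from any real package.
description = "文本编辑器"  # roughly "text editor" in Chinese


def can_encode(text, charset):
    """Return True if every character of text is representable in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


utf8_ok = can_encode(description, "utf-8")
latin1_ok = can_encode(description, "latin-1")

print("UTF-8:", utf8_ok)
print("Latin-1:", latin1_ok)
```

  A Western European name such as "José" passes both checks, which is why 
the limitation is easy to overlook if you only test with Latin-script data.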


