On Thu, May 31, 2001 at 11:30:07PM -0400, Joey Hess wrote:
> Cesar Eduardo Barros wrote:
> > I think debconf should use UTF-8 for the templates and recode on the fly.
> Well, if you send in a patch, I will consider it.

libc probably ought to support this (once all the i18n stuff is built
in). Maybe it already does, who knows...

> > There's nothing worse than having gibberish in ten different charsets in the
> > same template file.
> This is why the template file is the "compiled" form. You're intended to
> merge together a bunch of files that just have one (or 2, with English)
> languages in them. See debconf-mergetemplate(1).

Even so...

To begin with, I would propose something like this to go into policy:

Documentation in Debian packages, if written in a language requiring
characters outside the 7-bit ASCII range, should use either the
well-established encoding for the given language (such as ISO-8859-2 for
some Central and Eastern European languages, KOI8-R for Russian, JIS for
Japanese, etc.) or the UTF-8 encoding. Maintainers are encouraged to use
UTF-8, bearing in mind the general tendency toward a unified character
encoding.

Original upstream documentation, if in an encoding other than UTF-8 _or_
the well-established encoding for the particular language, should be
converted either to UTF-8 or to the well-established encoding.
The choice between UTF-8 and the other encoding is left to the
maintainer's discretion; however, a package should (must?) have all its
documentation in one consistent encoding.
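
To make the conversion step concrete, here is a minimal sketch in Python.
It is only an illustration, not a proposed tool: the function names
recode_to_utf8 and is_valid_utf8 are made up for this example, and it
assumes the maintainer already knows the source encoding (nothing here
tries to guess it).

```python
def recode_to_utf8(data, source_encoding):
    """Decode bytes from a known legacy encoding and re-encode as UTF-8.

    Raises UnicodeDecodeError if the bytes are not valid in
    source_encoding, which is a useful hint that the guess was wrong.
    """
    return data.decode(source_encoding).encode("utf-8")


def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8.

    Pure 7-bit ASCII is valid UTF-8 as well, so ASCII-only files pass.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical sample: a Polish place name stored as ISO-8859-2.
legacy = "Łódź".encode("iso-8859-2")
converted = recode_to_utf8(legacy, "iso-8859-2")
```

Running is_valid_utf8 over every documentation file in a package would be
one cheap way to check the "one consistent encoding" requirement when the
maintainer has chosen UTF-8.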

Names of maintainers, upstream authors and other data in package
descriptions and related data files (such as debian/changelog,
debian/control), as well as in English-language documentation, should be
either transliterated or transcribed to ASCII, or given in UTF-8
encoding, again at the discretion of the maintainer. However, for names
in scripts based on non-Latin alphabets, an ASCII (or suitable
Latin-script) version should be provided along with the original name.
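
As a rough illustration of the ASCII-fallback idea for names, the sketch
below strips accents from Latin-script names and gives up on everything
else. The helper name ascii_fallback is invented for this example; it is
not a real transliteration scheme, and names in non-Latin scripts would
still need a version supplied by hand.

```python
import unicodedata


def ascii_fallback(name):
    """Return an accent-stripped ASCII form of name, or None.

    Works only where every character decomposes to an ASCII base letter
    plus combining marks. Anything else (Cyrillic, CJK, letters such as
    'Ł' that have no decomposition) returns None, meaning a manual
    transliteration is needed.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    try:
        return stripped.encode("ascii").decode("ascii")
    except UnicodeEncodeError:
        return None
```

Returning None rather than silently dropping characters is deliberate: a
half-stripped name is worse than asking the maintainer to provide one.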

