Re: Debian Metadata Standard -- proposed

On Jul 20, "Adam P. Harris" <apharris@burrito.onshore.com> wrote:
 >> On Jul 19, "Adam P. Harris" <apharris@burrito.onshore.com> wrote:
 >>> I'd appreciate it if someone in the know about character sets and
 >>> typical Debian application support for the same could comment on
 >>> this spec and offer ways to make it better for multiple character
 >>> sets, if
I read it quickly (I hope I did not miss anything important) and I think
there is not enough support for foreign languages: sticking to Latin-1
alone will cut out many, many people, mostly Russian and Asian.

I think there are two possible solutions:
- use Unicode (probably UTF-8 encoded like the kernel does for ext2)
and display it using the kernel console driver or a Unicode xterm, or
- simply be 8-bit clean and put in EVERY field information about the
encoding used (it could be Latin-1 for Western European languages, KOI-8
for Russian, BIG-5 for Chinese and so on) and program the user interface
to convert that encoding to the one used by the console (this is not
a trivial task).

(I think the first solution is the better one and the easier to implement.)
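
To make the second option concrete, here is a rough sketch in Python of
the conversion step the user interface would need. The function name
convert_field and the idea of passing the encoding name as a plain
string are my own assumptions for illustration, not part of the
proposed spec; only the standard codecs shipped with Python are used.

```python
def convert_field(raw: bytes, field_encoding: str, console_encoding: str) -> bytes:
    """Hypothetical helper: re-encode one metadata field for the console.

    raw is the field content exactly as stored (8-bit clean),
    field_encoding is the encoding declared alongside the field,
    console_encoding is whatever the user's console expects.
    """
    # Step 1: interpret the stored bytes using the declared encoding.
    text = raw.decode(field_encoding)
    # Step 2: re-encode for the console; characters the console cannot
    # show are replaced with '?' instead of raising an error.
    return text.encode(console_encoding, errors="replace")


# A Russian description stored as KOI8-R, shown on a UTF-8 console.
stored = "Привет".encode("koi8_r")
shown = convert_field(stored, "koi8_r", "utf-8")

# A Latin-1 character that a plain ASCII console cannot display.
degraded = convert_field("é".encode("latin-1"), "latin-1", "ascii")
```

The lossy '?' replacement in the last call is part of why this option is
not trivial: every pair of field and console encodings needs a sensible
fallback.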

Please also remember that there is a big difference between stating that
field X contains "ISO-8859-1 characters" and stating that field X
contains "ISO-8859-1 encoded text". The actual content is the same, but
in the first case the text cannot be interpreted unless the user
interface assumes something about the encoding (i.e. that those bytes
represent glyphs according to the ISO-8859-1 encoding).
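
A small illustration of why the declared encoding matters, again as a
Python sketch (the three byte values are just an example I picked): the
same stored bytes turn into completely different text depending on
which encoding the user interface assumes.

```python
# Three bytes as they might sit in a field, with no encoding declared.
raw = bytes([0xC1, 0xC2, 0xC3])

# Assumed to be ISO-8859-1: accented Latin capitals.
as_latin1 = raw.decode("latin-1")   # 'ÁÂÃ'

# Assumed to be KOI8-R: lowercase Cyrillic letters.
as_koi8 = raw.decode("koi8_r")      # 'абц'
```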


