
Re: support for multilingual Packages files?



On Sat, Jul 14, 2001 at 07:27:28AM +0900, Tomohiro KUBOTA wrote:
> Hi,
> 
> At Fri, 13 Jul 2001 15:18:33 -0500 (CDT),
> Steve Langasek <vorlon@netexpress.net> wrote:
> 
> > When we're able to provide users with tools that can handle UTF-8 effectively
> > at every turn, *then* we can discuss full-Unicode in the packages file.  But
> > *today*, we're not ready for that.  We should therefore focus on getting the
> > software working, first.
> 
> Well, UTF-8 will be merely one of many encodings which a future, perfect
> Debian will support.  Thus, default messages (Description:, Maintainer:,
> "MSGID" in message catalogs, and so on) have to be written in ASCII, which
> is the common part of all encodings.  If you want to use accented letters
> and so on for English, please use Description-en: and so on so that it is
> used only under English locales.

I'd prefer it the other way round: default messages would be in UTF-8,
which is a union of all the encodings, and Description-en would be used
for ASCII-only English locales.
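
To make the "common part" versus "union" distinction concrete, here is a
small Python sketch. The codec list and the sample strings are my own
illustrative choices, not anything taken from dpkg or apt. It checks that
pure ASCII text yields the same bytes in every listed encoding, while only
UTF-8 can represent both the German and the Japanese sample.

```python
# Illustrative codec list and sample strings (assumptions, not Debian data).
LEGACY_CODECS = ["ascii", "latin-1", "iso8859_2", "koi8_r", "euc_jp"]
ALL_CODECS = LEGACY_CODECS + ["utf-8"]


def encodable_in(text, codec):
    """Return True if every character of `text` exists in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def same_bytes_everywhere(text, codecs):
    """Return True if `text` encodes to identical bytes in all `codecs`."""
    return len({text.encode(codec) for codec in codecs}) == 1


ascii_desc = "Mueller's package"
german_desc = "Müllers Paket"
japanese_desc = "日本語の説明"

# ASCII is the common part: identical bytes in every encoding listed.
print("ASCII identical everywhere:",
      same_bytes_everywhere(ascii_desc, ALL_CODECS))

# Per-codec coverage of the two non-ASCII samples.
for codec in ALL_CODECS:
    print(codec,
          "german:", encodable_in(german_desc, codec),
          "japanese:", encodable_in(japanese_desc, codec))
```

The trade-off is visible in the output: an ASCII default is readable
everywhere but cannot carry the accented text, whereas a UTF-8 default can
carry everything but needs tools that understand it.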

> 
> Thus, even in future when UTF-8 support will be fully implemented, we
> should use ASCII for default messages.
> 

This is the main point where we disagree.
I am glad we have finally pinpointed it.


On Sat, Jul 14, 2001 at 07:49:04AM +0900, Tomohiro KUBOTA wrote:
> Hi,
> 
> At Fri, 13 Jul 2001 18:19:11 +0200,
> "Jürgen A. Erhard" <juergen.erhard@gmx.net> wrote:
> 
> > Even replacing ü with ue can be incorrect... because ue is not
> > necessarily ü (Goethe is Goethe, not Göthe, and when I see "Moeller" I
> > don't know whether that's "Möller" or in fact "Moeller")
> 
> I read many question marks...  Please don't use (I think) ISO-8859-1

He did include a proper Content-Type header, and he used quoted-printable
encoding.
So his message was in plain ASCII on the wire after all :-)
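
For anyone who wants to see what that means, here is a minimal Python
sketch using the standard quopri module. The ISO-8859-1 charset is an
assumption (it is what Tomohiro guessed); the point is only that
quoted-printable turns every non-ASCII byte into an =XX escape, so the
transmitted bytes are 7-bit clean and a MIME-aware reader can restore them.

```python
import quopri

name = "Jürgen"                     # sample text with one non-ASCII letter
raw = name.encode("iso-8859-1")     # assumed charset; here ü is the byte 0xFC
wire = quopri.encodestring(raw)     # non-ASCII bytes become =XX escapes


def is_ascii(data):
    """Return True if every byte in `data` is below 0x80."""
    return all(byte < 0x80 for byte in data)


print("raw bytes: ", raw, "ascii only:", is_ascii(raw))
print("wire bytes:", wire, "ascii only:", is_ascii(wire))

# A MIME-aware mail reader reverses both steps and gets the original back.
roundtrip = quopri.decodestring(wire).decode("iso-8859-1")
print("decoded:   ", roundtrip)
```

A reader that ignores the Content-Type and Content-Transfer-Encoding
headers cannot do the last step, which is presumably where the question
marks came from.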

And it is rather difficult to discuss the proper usage of
German umlauted letters without writing them...
That's why I am in favour of implementing full Unicode support:
people would be able to exchange mails like this one without
problems (wouldn't you like that?)

> characters here, just as I don't use Kanji here.  (I know most people
> in the world cannot read Kanji and, more importantly, many people in
> the world use Kanji-disabled mail and terminal software.)
> 
> 
> > This is completely separate from the Kanji->ASCII translation.  A
> > comparable example would be that you (yes, I mean you, Tomohiro ;-)
> > had to replace one Kanji with two other Kanji that were similar in
> > (combined) meaning to the Kanji you replaced.  And not ("just")
> > transliterating it in a completely different character set.
> 
> Yes, completely separate.  However, in a different sense.  For Japanese,
> we have a systematic way to write our names in ASCII characters.

Careful here.
Most languages do not have a systematic way to write names in ASCII.
Slovak (and Hungarian) certainly do not.
The most "semi-official" way of transcribing Russian names
(the one used by the US Library of Congress) uses diacritics over Latin
letters(!)
(and no, you cannot just strip them off; that changes the
pronunciation completely)
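
A small Python sketch of why neither shortcut is safe. It reuses the
German names from earlier in this thread; the digraph table is a
hypothetical one written just for this example. Stripping the marks and
replacing umlauts with digraphs both map two different names onto one
ASCII string, so the original cannot be recovered afterwards.

```python
import unicodedata


def strip_diacritics(text):
    """Lossy ASCII fold: decompose, then drop all combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical digraph table for German umlauts, for illustration only.
DIGRAPHS = {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"}


def to_digraphs(text):
    """Replace each umlaut with its two-letter ASCII spelling."""
    composed = unicodedata.normalize("NFC", text)
    return "".join(DIGRAPHS.get(ch, ch) for ch in composed)


# Two distinct names collapse into one under each scheme.
print(strip_diacritics("Müller"), strip_diacritics("Muller"))
print(to_digraphs("Möller"), to_digraphs("Moeller"))
```

Both functions are many-to-one, which is exactly the problem: the ASCII
form no longer tells you which name you started with.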

In a way, you Japanese are lucky :-)

> (Otherwise I cannot submit my scientific papers to international
> journals, which means I cannot do my work!)  I also know there are
> Debian developers from China, Korea, Thailand, Russia, and Israel who
> use ASCII characters for their names in the Maintainer: field, GPG key
> ID, and e-mail signatures.  Though we are not very happy,
> we accept this situation.
> 
> The cited opinion insists that Hungarian people cannot accept
> writing their names in ASCII characters.  This is the problem!

Well, of course, why should they? It is their language and their
writing system, so why should they be forced to write their names in a
different way? (And, unlike Romaji, in a way which looks like
an inferior subset of their own alphabet.)

For example, I would have nothing against transcribing my
name into katakana for some Japanese information system.
(Nor did I have anything against transcribing my name into
Cyrillic when I was in Russia.) But I do not like leaving
the diacritics out of my name when I am forced to write it
in plain ASCII. And I am sure many other people in my position
feel the same.
It is comparable to being forced to change some characters
of your name written in hiragana, so that it has a similar
but not identical pronunciation and fits into some
subset of proper hiragana, just because the computer system you are
using is limited to that subset.

-- 
 -----------------------------------------------------------
| Radovan Garabik http://melkor.dnp.fmph.uniba.sk/~garabik/ |
| __..--^^^--..__    garabik @ melkor.dnp.fmph.uniba.sk     |
 -----------------------------------------------------------
Antivirus alert: file .signature infected by signature virus.
Hi! I'm a signature virus! Copy me into your signature file to help me spread!


