Re: Status of UTF-8 Debian changelogs

On Sat, Jun 07, 2003 at 04:21:33PM +0100, Colin Watson wrote:
 DB>> I am using KOI8-R terminal which can not display Latin-1
 DB>> characters,
 CW> Where did Latin-1 come into this?

I said characters, not encoding: I mean that the KOI8-R character set
does not include the non-ASCII letters of Latin-1. Therefore, these
characters need to be replaced with '?', as you point out below.
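
To make the replacement concrete, here is a minimal Python sketch of
what a KOI8-R terminal can and cannot represent. The helper name
as_seen_on and the sample names are only illustrative assumptions;
"koi8_r" is Python's standard codec name for KOI8-R.

```python
def as_seen_on(text, encoding="koi8_r"):
    """Round-trip text through an encoding, replacing what does not fit."""
    # errors="replace" substitutes '?' for every unencodable character.
    raw = text.encode(encoding, errors="replace")
    return raw.decode(encoding)


print(as_seen_on("Jürgen"))   # J?rgen
print(as_seen_on("山田"))      # ??
print(as_seen_on("Дмитрий"))  # Дмитрий (Cyrillic fits in KOI8-R)
```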

 CW> What do you lose here? Those who have fonts that can display the
 CW> character in question will be able to do so; those who don't won't,
 CW> but will see some reasonably obvious indicator like a "?" or a
 CW> filled-in square to show that the character is one they can't
 CW> display. This is superior to the situation where those who don't
 CW> have such fonts just see some gibberish.

I don't see it as proper credit to your contributors if their names
appear as 'J?rg?n' (or even '????' in the case of Kanji) on my display.
Were a name transliterated, I would at least be able to pronounce it.
There are standard rules for such transliteration anyway, and I even
think iconv should have an option to do lossy transliteration for
characters outside the target character set.
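
As a rough sketch of what such lossy transliteration could look like,
the Python below strips accents via Unicode decomposition and falls
back to '?' for anything it cannot map. The function name to_ascii and
the tiny Cyrillic table are hypothetical and cover one example name
only; a real tool would follow a published transliteration standard.

```python
import unicodedata

# Hypothetical, deliberately tiny table: just enough for one example name.
CYRILLIC = {"Д": "D", "м": "m", "и": "i", "т": "t", "р": "r", "й": "y"}


def to_ascii(text):
    """Lossy transliteration of text into 7-bit ASCII."""
    out = []
    for ch in text:
        # Explicit table entries win over Unicode decomposition.
        if ch in CYRILLIC:
            out.append(CYRILLIC[ch])
            continue
        # Decompose accented letters and keep only the ASCII base letter.
        decomposed = unicodedata.normalize("NFKD", ch)
        base = decomposed.encode("ascii", "ignore").decode("ascii")
        # Anything still unmapped becomes '?', as on a plain terminal.
        out.append(base or "?")
    return "".join(out)


print(to_ascii("Jürgen"))   # Jurgen
print(to_ascii("Дмитрий"))  # Dmitriy
```

The output is at least pronounceable, which a row of question marks is
not.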

 DB>> I'd rather have 7-bit ASCII changelogs: why Latin-1 users are
 DB>> privileged to use native spelling of their names, while Cyrillic
 DB>> and Kanji and other users have to resort to transliteration?
 CW> They aren't so privileged. They may decide to do it anyway, but
 CW> since the encoding of changelogs is not yet specified you currently
 CW> take pot luck on anything outside 7-bit ASCII.

What I object to is that they may: I'd rather they could not. I'd rather
the encoding of changelogs were specified to be 7-bit ASCII.

 CW> I believe you've just contradicted yourself, anyway. Nobody wants
 CW> to have to transliterate their name.

Excuse the ad hominem, but how many foreign languages do you speak?
I ask because, in my observation, people from countries with a
completely non-ASCII writing system (as opposed to European Latin-based
languages) almost always transliterate their names when they
communicate with someone who speaks a different language. Do you
observe a different pattern?

You see, it is not only a technical issue, it is a communication issue.
If you can't read Cyrillic, the native spelling of my name won't help
you read it, even if it is displayed correctly.

 CW> Package maintainers who aren't set up for writing UTF-8 can always
 CW> resort to transliteration into ASCII if need be.

The biggest compromise that argument could convince me to accept is to
allow non-ASCII names in UTF-8 in changelogs, but only if each such
name is accompanied by an ASCII transliteration. But that solution is
substantially more complex than just limiting changelogs to 7-bit ASCII,
and there is no easy way to check for compliance.
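
For comparison, checking the plain 7-bit ASCII rule is mechanical. A
minimal Python sketch, assuming the changelog is available as raw
bytes; the function name non_ascii_lines and the sample entries are
hypothetical and not part of any existing Debian tool:

```python
def non_ascii_lines(data):
    """Return (line_number, line) pairs for lines with bytes above 0x7F."""
    bad = []
    for number, line in enumerate(data.splitlines(), start=1):
        # Iterating over bytes yields integers; 7-bit ASCII ends at 0x7F.
        if any(byte > 0x7F for byte in line):
            bad.append((number, line))
    return bad


sample = (
    b"  * Fix typo in manpage.\n"
    b" -- J\xc3\xbcrgen Example <juergen@example.org>\n"
)
for number, line in non_ascii_lines(sample):
    print("line %d is not 7-bit ASCII: %r" % (number, line))
```

No comparably simple test exists for "every non-ASCII name has a
transliteration next to it".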

Dmitry Borodaenko

