Re: support for multilingual Packages files?
On Mon, Jul 30, 2001 at 01:53:57PM -0500, Steve Langasek wrote:
> And what about the problems UTF-8 will cause for people who do not
> (or cannot) use UTF-8 consoles?
Just recode the characters to whatever charset is available on the
terminal in use. A clever recoding could even replace undisplayable
accented characters with ones displayable in ASCII (é -> e,
ö -> o). Some characters will have to be displayed as question marks,
such as Chinese characters on my terminal.
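
A rough sketch of that kind of fallback, in Python. The function name
to_ascii is made up for illustration, and it assumes that stripping
combining marks after NFKD decomposition is an acceptable way to get
from é to e; anything without an ASCII base letter becomes a question
mark.

```python
import unicodedata

def to_ascii(text: str) -> str:
    """Best-effort ASCII fallback for text the terminal cannot display."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
            continue
        # NFKD splits an accented letter into base letter + combining marks.
        decomposed = unicodedata.normalize("NFKD", ch)
        base = "".join(c for c in decomposed if ord(c) < 128)
        # Keep the ASCII base if there is one, otherwise fall back to '?'.
        out.append(base if base else "?")
    return "".join(out)

print(to_ascii("Crème brûlée 中文"))  # Creme brulee ??
```

A real implementation would target the terminal's actual charset rather
than plain ASCII, so fewer characters would need the question mark.
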
> I concede that it's useful to be able to represent Maintainer names
> in full Unicode; that is not in question. What I disagree with is
> the argument that such non-ASCII characters should be included in
> existing fields of the Package file.
>
> If all Unicode is limited to new fields that we introduce into
> Packages, there's a very simple mechanism that we can use to provide
> backwards compatibility with even the most rudimentary of ASCII-only
> tools:
>
> $ grep -vE '^(Description.+|Maintainer-utf8):' < Packages > Packages-ascii
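This might fail, though not because of UTF-8 itself: every byte of a
UTF-8 multibyte character has the high bit set, so the ASCII code for
newline can never be part of one. The problem is that a description
spans several lines and grep only drops the first, so part of a
description may be left.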
However, you should be able to:
$ recode utf-8..ascii < Packages > Packages-ascii
This would even work if the original fields were used. (But it may add
question marks. I would suggest that Chinese etc. maintainers add a
transliteration of their name in parentheses after the original name.)
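
If someone still wants the field-stripping approach quoted above, the
filter has to drop a field's continuation lines along with its first
line. A minimal sketch in Python, assuming the usual control-file
convention that continuation lines start with a space or tab, and
assuming the translated fields are named like Description-de. The name
strip_fields is hypothetical.

```python
import re

# Fields to drop: Description-<something> and Maintainer-utf8.
# Plain "Description:" is kept because [^:]+ needs at least one
# character between "Description" and the colon.
DROP = re.compile(r"^(Description[^:]+|Maintainer-utf8):")

def strip_fields(lines):
    """Yield lines, skipping dropped fields and their continuation lines."""
    skipping = False
    for line in lines:
        if line[:1] in (" ", "\t"):
            # Continuation line: shares the fate of the field before it.
            if not skipping:
                yield line
            continue
        skipping = bool(DROP.match(line))
        if not skipping:
            yield line
```

It could be run over the file with something like
out.writelines(strip_fields(open("Packages", encoding="utf-8"))),
where the path and encoding are again assumptions.
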
--
Niklas