Using UTF-8 (was Re: Debian Boot Floppies CVS: boot-floppies polish)

To: debian-boot@lists.debian.org
Subject: Using UTF-8 (was Re: Debian Boot Floppies CVS: boot-floppies polish)
From: Michael Sobolev <mss@transas.com>
Date: Sat, 20 Nov 1999 01:58:49 +0300
Message-id: <19991120015849.A2791@transas.com>
In-reply-to: <199911191226.NAA16245@ezili.sis.pasteur.fr>; from bortzmeyer@pasteur.fr on Fri, Nov 19, 1999 at 01:26:44PM +0100
References: <199911190942.KAA21485@ezili.sis.pasteur.fr> <199911191226.NAA16245@ezili.sis.pasteur.fr>

On Fri, Nov 19, 1999 at 01:26:44PM +0100, Stephane Bortzmeyer wrote:
> OK, here is my suggested way, using only packages found in potato:
> 
> - all the XML files in UTF-8 (by law, every XML processor must recognize it, 
> and it has upward compatibility with ASCII). If you cannot or don't want to 
> edit UTF-8, use recode (here, I assume you edited in Latin-2):
> 
>  recode latin-2..utf-8 polish.xml
(I'd prefer iconv.)  This seems reasonable, and it is what I have already done
for Russian (and by the time I finish this message, it will also be done for
Polish :).

> - langs.c must be in UTF-8 for another reason: it mixes characters from many 
> languages.
Sorry, I do not quite understand your reasoning.

> - conversion from UTF-8 to the chosen charset needs to be done dynamically in 
> dbootstrap. librecode could help, but it would mean adding it to the rescue 
> disk.
Hmm...  This means that we need to add to the rescue disk the following items:

    -- messages for dbootstrap
    -- all necessary fonts
    -- acms for these fonts (actually these are tables for converting from local
       charset to utf-8 in case LatArCyrHeb is used)
    -- keymaps (in case we do want to enter anything localized in dbootstrap)
    -- librecode0 for translating from utf-8...

Correct?

I do not think that the conversion should be performed by dbootstrap.  The
reason is quite simple: the data is static (wrt dbootstrap) and the charset is
known at compile time.  Why not make use of this knowledge?  Also, UTF-8 has a
big drawback: size.  For example, the UTF-8 version of the russian.xml file is
300 bytes larger than the KOI8-R one.  I would not mind if everything we are to
display in dbootstrap were in UTF-8.  But!  The messages in the *.po files use
their own local charsets, and s-lang does not easily support UTF-8 (well, I do
not know for sure how well s-lang copes with UTF-8 :).
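
To make the size argument concrete, here is a minimal sketch in Python 3.  The
sample string is made up for illustration and is not taken from russian.xml.
Each Cyrillic letter takes two bytes in UTF-8 and one byte in KOI8-R, while
ASCII characters take one byte in both, so the overhead should equal the number
of Cyrillic letters.  The same encode() call is roughly what a build-time
conversion step would do.

```python
# Hypothetical sample message; NOT taken from russian.xml.
message = "Привет, мир"

utf8_bytes = message.encode("utf-8")
koi8_bytes = message.encode("koi8_r")

# Cyrillic letters cost 2 bytes in UTF-8 and 1 byte in KOI8-R;
# ASCII characters (the comma and the space) cost 1 byte in both.
overhead = len(utf8_bytes) - len(koi8_bytes)
cyrillic_letters = sum(1 for ch in message if "\u0400" <= ch <= "\u04ff")

print(len(utf8_bytes), len(koi8_bytes), overhead, cyrillic_letters)
```

If the messages are converted once at build time like this, the rescue disk
would only need to carry the single-byte version.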

Hmm (I just ran into this): I do not know how to do character set conversions
correctly in Python...  Does anybody have any suggestions?
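
One possible approach, as a minimal sketch that assumes Python 3 and its
built-in codecs: decode the bytes from the source charset, then encode them into
the target one.  The names convert() and convert_file() are my own, not part of
any library, and the default charsets are only an example matching the
Latin-2 case above.

```python
def convert(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from charset `src` to charset `dst`.

    Raises UnicodeError if the input is not valid in `src` or contains
    a character that `dst` cannot represent.
    """
    return data.decode(src).encode(dst)


def convert_file(in_path, out_path, src="iso8859_2", dst="utf-8"):
    # Work on raw bytes so no locale-dependent default encoding is involved.
    with open(in_path, "rb") as f:
        data = f.read()
    with open(out_path, "wb") as f:
        f.write(convert(data, src, dst))
```

Something like convert_file("polish.xml", "polish.utf8.xml") would then play the
role of the recode call; the output file name here is hypothetical.  Going the
other way (UTF-8 to a local charset) fails loudly when a character has no
equivalent, which is probably what we want at build time.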

--
Mike
