Always use UTF-8 when running base-config?

To: debian-boot@lists.debian.org
Subject: Always use UTF-8 when running base-config?
From: Petter Reinholdtsen <pere@hungry.com>
Date: Sun, 07 Sep 2003 16:38:19 +0200
Message-id: <E19w0gF-0004fO-00@minerva.hungry.com>

I've been thinking about how we should handle languages that do not
use the default charset of the Linux console when base-config runs as
the second-stage installer.

The current approach is to add support for each new charset in
/usr/sbin/termwrap, and try to make sure this works for all the
different charsets in use.  This might work, but it is quite a bit of
work, and the testing each charset receives is very limited.

Instead, I suggest we always use UTF-8 while running base-config.
This way we make sure base-config works for all the languages we need
to display, instead of only the ones that received most of the testing.

If we keep the requested locale in LANG, but use a UTF-8 locale in
LC_CTYPE, I believe the correct locale settings will be used (for
sorting, translations and such), while the charset printed will be
UTF-8.  I'm not sure if this is how it is supposed to work, but I
believe so.  Testing with date indicates that I might be mistaken, but
testing with the locale program indicates that I am not.  In the tests
below, se_NO uses the UTF-8 charset, while fr_FR uses ISO-8859-1.
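
A minimal sketch of how the UTF-8 locale for LC_CTYPE could be derived
from the requested locale, assuming glibc-style locale names of the
form language_territory.codeset@modifier.  The function name
utf8_ctype_for is hypothetical and is not part of base-config or
termwrap, and the sketch does not check that the resulting locale has
actually been generated on the system.

```shell
# Hypothetical helper: map a requested locale name to the UTF-8 variant
# that would go into LC_CTYPE.  Assumes glibc-style names:
#   language[_territory][.codeset][@modifier]
utf8_ctype_for() {
    requested=$1
    modifier=
    # Keep a trailing @modifier (for example @euro) if there is one.
    case $requested in
        *@*) modifier=@${requested#*@} ;;
    esac
    # Strip any existing codeset and modifier to get the base name.
    base=${requested%%[.@]*}
    printf '%s.UTF-8%s\n' "$base" "$modifier"
}
```

With this, something like
LANG=fr_FR LC_CTYPE=$(utf8_ctype_for fr_FR) locale charmap
should report UTF-8, provided the fr_FR.UTF-8 locale is installed.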

First I check the charset that should be printed:

  % LANG=se_NO locale charmap
  UTF-8
  % LC_CTYPE=fr_FR LANG=se_NO locale charmap
  ISO-8859-1
  %

Then I test with date, comparing the output under UTF-8 with the
output under what should be ISO-8859-1:

  % LANG=se_NO date
  sotnabeaivi, čakčamánu 07. b. 2003 16:32:46 CEST
  % LC_CTYPE=fr_FR LANG=se_NO date
  sotnabeaivi, čakčamánu 07. b. 2003 16:32:33 CEST
  %

As you can see, the charset of the printed string is not adjusted to
match the current charset.  Is this a bug in date/glibc, or is this
expected behaviour?
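
Comparing the output by eye depends on how the terminal renders the
bytes.  A rough sketch of a more mechanical check, assuming an iconv
command is available: it tests whether the bytes a command prints are
well-formed UTF-8.  The helper name is_valid_utf8 is made up for this
example.

```shell
# Hypothetical helper: succeed if standard input is well-formed UTF-8.
# iconv exits with a non-zero status when it meets an invalid input
# sequence, so a UTF-8 to UTF-8 conversion works as a validity check.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}
```

It could be used as
LC_CTYPE=fr_FR LANG=se_NO date | is_valid_utf8 && echo "still UTF-8".
Pure ASCII output also passes, so the check only says something when
the string contains non-ASCII characters, as the se_NO date does.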

Anyway, if we can use UTF-8 as the output charset while keeping the
other locale values from the locale we want to use, we can install
bogl-bterm and use it for all translations, instead of trying to
adjust the console charset or start the appropriate terminal wrapper
based on the charset used by a given locale.  This would make it a lot
easier to make sure base-config always displays the translations.
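
As a sketch of what a single wrapper could look like under this
scheme: run a command with the requested locale in LANG and a UTF-8
locale in LC_CTYPE.  The function name run_with_utf8_ctype is
hypothetical, locale modifiers such as @euro are dropped for brevity,
and starting the UTF-8 terminal itself is left out because its
invocation is not covered here.

```shell
# Hypothetical wrapper: run a command with the requested locale in LANG
# and the matching UTF-8 locale in LC_CTYPE.
run_with_utf8_ctype() {
    requested=$1
    shift
    # Strip any codeset or modifier from the requested locale name.
    base=${requested%%[.@]*}
    (
        # LC_ALL would override both LANG and LC_CTYPE, so clear it.
        unset LC_ALL
        exec env LANG="$requested" LC_CTYPE="$base.UTF-8" "$@"
    )
}
```

For example, run_with_utf8_ctype fr_FR locale charmap should print
UTF-8 if the fr_FR.UTF-8 locale exists on the system.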
