Re: debconf w/ charset encoding support
Hi,
At Tue, 17 Sep 2002 14:30:14 -0400,
Joey Hess wrote:
> Debconf 1.2.0 has experimental support for encoded character sets. We
> will try to use UTF-8 encoding for everything in the templates files
> shipped with packages, but if that should not prove to be practical for
> some languages, it supports other encodings as well.
I tested this. First, I configured debconf to use the readline
frontend, so that Debconf's text handling could be observed clearly.
1. I manually edited /var/lib/dpkg/debconf.templates,
changing "Description-ja" to "Description-ja_JP.EUC-JP".
2. In my everyday EUC-JP terminal with LANG=ja_JP.eucJP,
"dpkg-reconfigure debconf" works fine.
3. I then set up a UTF-8 terminal with LANG=ja_JP.UTF-8.
There, "dpkg-reconfigure debconf" outputs a garbled string.
4. I checked Debconf::Encoding::convert. It seems to work well.
(Note: in the ja_JP.EUC-JP locale, the converter converts
from "euc-jp" to "EUC-JP". Obviously this conversion could be
skipped, but the filtering code ($input_charmap ne
$old_input_charmap) does not catch this case.)
Thus, $ret in Debconf::Template::AUTOLOAD is also fine.
5. However, when I inserted "print $ret;" in
/usr/share/perl5/Debconf/Template.pm, it output a garbled
string in the ja_JP.UTF-8 locale on the UTF-8 terminal. I read
the garbled string carefully and found that it consists of
ISO-8859-1 characters!
That is, if $ret="\xe8\xa8\xad" (the first Kanji in Description-ja),
three characters are displayed: e with grave (0xe8 in ISO-8859-1),
diaeresis (0xa8 in ISO-8859-1), and soft hyphen (0xad in ISO-8859-1).
Note that ISO-8859-1 and the first 256 Unicode code points
(U+0000 to U+00FF) are exactly the same.
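
To make the byte-level picture concrete, here is a small illustration.
It is written in Python rather than Perl and has nothing to do with
Debconf's actual code; it only shows how the same three bytes read
under the two interpretations. The variable names are mine.

```python
import unicodedata

# The three bytes discussed above.
raw = b"\xe8\xa8\xad"

# Read as UTF-8, they form a single character (U+8A2D).
as_utf8 = raw.decode("utf-8")

# Read as ISO-8859-1, each byte becomes one character whose
# code point equals the byte value.
as_latin1 = raw.decode("iso-8859-1")

code_points = [ord(c) for c in as_latin1]
names = [unicodedata.name(c) for c in as_latin1]

print(len(as_utf8), hex(ord(as_utf8)))
for cp, name in zip(code_points, names):
    print("U+%04X %s" % (cp, name))
```

The second loop should list an e with grave, a diaeresis, and a soft
hyphen, which matches what I saw on the terminal.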
I imagine some automatic Unicode conversion facility of Perl might
be at work. Thus, $ret="\xe8\xa8\xad" might be interpreted as
U+00E8 U+00A8 U+00AD. However, "use utf8;" does not appear anywhere
in the Debconf code, and I have no idea why this occurs....
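
As a sketch of what I suspect happens (an assumption, not something I
have confirmed in Perl or Debconf), the following Python snippet
simulates the suspected conversion: the bytes are taken as ISO-8859-1
characters and then encoded as UTF-8 on output. The variable names
are only for illustration.

```python
# The UTF-8 bytes that should reach the terminal unchanged.
raw = b"\xe8\xa8\xad"

# Suspected step 1: each byte is treated as one ISO-8859-1 character,
# i.e. U+00E8 U+00A8 U+00AD.
misread = raw.decode("iso-8859-1")

# Suspected step 2: those characters are encoded as UTF-8 on output,
# so the terminal receives six bytes instead of three.
sent_to_terminal = misread.encode("utf-8")
print(sent_to_terminal.hex())

# The damage is reversible: undoing both steps restores the original.
recovered = sent_to_terminal.decode("utf-8").encode("iso-8859-1")
print(recovered == raw)
```

If this is what happens, a UTF-8 terminal would show exactly the
three ISO-8859-1 characters instead of the Kanji.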
---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/
"Introduction to I18N" http://www.debian.org/doc/manuals/intro-i18n/