
Design of new debconf system (Re: Accepted po-debconf 0.2.2 (all source))



Hi,

At Mon, 16 Sep 2002 22:01:32 +0900,
Junichi Uekawa wrote:

> > Agreed, that's why adding an Encoding-xx: header field in all templates fits
> > our needs without breakage.
> 
> I think it doesn't.
> 
> Description-XX_XX.XXX: is a better way to put it.

I don't understand the difference (i.e., the merits and demerits)
between an Encoding-xx: header and a Description-XX_YY.ZZZ: header.
First of all, where are they written?  In the translated templates in
source packages?  *OR* (/and) in /var/lib/dpkg/info/*.templates files?

(I think "Encoding-xx:" is a typo for "Encoding: xx".)


Now, I think of this as a two-level conversion.


    translators' templates in source packages
       |
       |   compile-time conversion
       v
    /var/lib/dpkg/info/*.templates
       |
       |   run-time conversion
       v
    texts used for output


Now, requirements are:

1. /var/lib/dpkg/info/*.templates should stay compatible with the
   current version of the Debconf system, because it is impossible for
   all debconf-using software to migrate to a new
   /var/lib/dpkg/info/*.templates format.  I.e., a future debconf
   package should treat the currently existing
   /var/lib/dpkg/info/*.templates as being written in the "popular
   encoding for each language".

2. Translators must be able to write their texts in the "popular
   encoding for each language", just as they can now.

3. Translators should be able to write their texts in UTF-8 or whatever
   encoding they prefer.  Thus, the format of translators' templates
   must have a way to specify the encoding of the text, like .po files.

4. Various encodings should not be mixed in /var/lib/dpkg/info/*.templates
   files.  This conflicts with 1, because the currently existing
   /var/lib/dpkg/info/*.templates are a mixture of encodings.

5. The result of run-time conversion must obey users' LC_CTYPE locale.

6. Any other requirements?

* "popular encoding for each language" means the encoding which the
  current version of the Debconf system (or, precisely, the translators
  for each language) implicitly uses for translated templates.



Roughly classified, there are three possibilities.

A. No compile-time conversion.  Run-time conversion is from translators'
   encodings to users' LC_CTYPE encodings.  

B. Compile-time conversion to one specific encoding.  UTF-8 is the only
   candidate.  Then, run-time conversion is always from UTF-8.

C. Don't change /var/lib/dpkg/info/*.templates format.  This means that
   /var/lib/dpkg/info/*.templates are always written in "popular encoding
   for each language".  Compile-time conversion is to "popular encoding
   for each language" and run-time conversion is from "popular encoding
   for each language".


C is the most compatibility-biased way.  However, I don't think it
is a future-proof method, because it limits the range of characters
which can be used.  Imagine that, in the future, many people use UTF-8,
including users and translators.  A French translator wants to use a
Greek character, because the translator knows well that most French
users' Debian boxes can display it.  However, the UTF-8 -> ISO-8859-15
-> UTF-8 conversion (compile-time, then run-time) will prevent it.
(Thus, the compile-time conversion must be either no conversion or
conversion to UTF-8.)
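
The loss can be shown with a small sketch.  It is written in Python
purely for illustration (it is not debconf code), and the function name
round_trip is hypothetical.  It pushes a string through ISO-8859-15 and
back, replacing whatever the intermediate encoding cannot represent:

```python
def round_trip(text: str, intermediate: str = "iso-8859-15") -> str:
    """Encode text to the intermediate encoding and decode it back.

    Characters the intermediate encoding cannot represent are replaced
    by '?', which is what a lossy conversion chain amounts to.
    """
    return text.encode(intermediate, errors="replace").decode(intermediate)


french = "café à 3 €"   # every character exists in ISO-8859-15
greek = "angle α"       # Greek alpha does not exist in ISO-8859-15

print(round_trip(french) == french)   # True: nothing is lost
print(round_trip(greek) == greek)     # False: alpha became '?'
```

With the default strict error handling the encode step would raise
UnicodeEncodeError instead of substituting '?'; either way the Greek
character does not survive.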

B seems simpler, but it means that 1 (compatibility) is not
achieved.  However, A means that /var/lib/dpkg/info/*.templates will
never stop being a mixture of various encodings.  This is the 1 vs 4
problem.  I think compatibility (1) is an absolute requirement.
However, provided that we keep compatibility, we can work toward
solving 4.  It can be done as follows, which is a mixture of A and B:

(a) Determine a new format for UTF-8 /var/lib/dpkg/info/*.templates .
    It can be "Description-XX_YY.ZZZ" field, "Encoding: xx" field,
    or any other.  It can even be gettext .mo format.

(b) Improve debconf (run-time) to be able to handle *both* the current
    /var/lib/dpkg/info/*.templates in the "popular encoding for each
    language" and the new UTF-8 /var/lib/dpkg/info/*.templates .  Assume
    the version of the improved debconf is 3.1416 .

(c) Modify dh_installdebconf to generate the new UTF-8
    /var/lib/dpkg/info/*.templates .  A package built with this
    version of dh_installdebconf will have a Depends: on
    debconf (>=3.1416).

(d) Determine a new format for debconf templates in source packages
    to specify encodings.  A "Description-XX_YY.ZZZ" field, an
    "Encoding: xx" field, the .po file format, or anything else can be a
    candidate.  It should be documented, along with the "popular
    encoding for each language" used as the default encodings.

(e) Modify dh_installdebconf to recognize the (d) format.


(a), (b), and (c) must be done in this sequence.  (d) and (e) also must
be done in this sequence.  However, (e) cannot come before (c).
Please read carefully so as not to confuse (a) and (d).
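
These ordering constraints can be written down and checked
mechanically.  The following is only an illustrative sketch in Python
(3.9 or later, for the standard graphlib module); the dictionary simply
restates the constraints above, and any order it prints is one valid
schedule among several.

```python
from graphlib import TopologicalSorter

# Each step maps to the steps that must be finished before it.
prerequisites = {
    "a": set(),
    "b": {"a"},
    "c": {"b"},
    "d": set(),
    "e": {"d", "c"},   # (e) needs (d), and cannot come before (c)
}

# static_order() yields the steps so that prerequisites always come first.
order = list(TopologicalSorter(prerequisites).static_order())
print(order)
```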

In (a), "ZZZ" or "xx" must be "UTF-8".  If we don't limit the encoding
to UTF-8 (to stop the mixture in /var/lib/dpkg/info/*.templates), there
is no merit in modifying the format of /var/lib/dpkg/info/*.templates .

(b) assures compatibility (1).  Under this condition, (c) promotes
migration to UTF-8-based, non-mixed /var/lib/dpkg/info/*.templates.
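
Step (b) can be sketched as follows, assuming for the sake of example
that (a) settles on the "Description-XX_YY.ZZZ" field form.  This is
Python for illustration only, not debconf's actual code.  The names
LEGACY_DEFAULTS, field_encoding, and render are hypothetical, and the
table of default encodings is an invented sample, not an agreed list.

```python
import codecs

# Hypothetical sample of "popular encoding for each language".
# The real table would have to be documented as part of step (d).
LEGACY_DEFAULTS = {"ja": "euc-jp", "fr": "iso-8859-15", "ru": "koi8-r"}


def field_encoding(field_name: str) -> str:
    """Return the encoding implied by a translated Description field name.

    "Description-ja.UTF-8" -> "utf-8"  (new format: charset is explicit)
    "Description-ja"       -> legacy default for "ja" (current format)
    """
    _, _, suffix = field_name.partition("-")
    locale_part, dot, charset = suffix.partition(".")
    if dot:
        # Normalize the spelling, e.g. "UTF-8" -> "utf-8".
        return codecs.lookup(charset).name
    lang = locale_part.split("_")[0].lower()
    # Raises KeyError for a language that has no documented default.
    return LEGACY_DEFAULTS[lang]


def render(field_name: str, raw: bytes, output_encoding: str) -> bytes:
    """Run-time conversion: stored bytes -> bytes in the user's encoding."""
    text = raw.decode(field_encoding(field_name))
    # Characters the user's encoding cannot show are replaced by '?'.
    return text.encode(output_encoding, errors="replace")


old_style = render("Description-ja", "日本語".encode("euc-jp"), "utf-8")
new_style = render("Description-ja.UTF-8", "日本語".encode("utf-8"), "utf-8")
print(old_style == new_style)
```

Here output_encoding stands for the encoding of the user's LC_CTYPE
locale (requirement 5); in Python it could be obtained with
locale.getpreferredencoding(False), but it is left as a parameter so
that the sketch does not depend on the environment.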

How about this sequence for migrating to the new Debconf system?




Thus, I don't understand the difference between the
"Description-XX_YY.ZZZ" idea and the "Encoding: xx" idea.  Or am I
misunderstanding something?

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/


