
Using XML for boot-floppies (was Re: Debian Boot Floppies CVS: boot-floppies polish)



On Fri, Nov 19, 1999 at 12:49:16PM +0300, Michael Sobolev wrote:
> So for preparing Russian file I would need to convert it from KOI8-R (as I
> use this character set) into UTF-8 and only then try to validate the
> resulting file.  I have no problems with that.  Should I do it that way?
Well, I've got another problem.

Let me explain (part of this text will go into bf-localization.sgml).

I wanted every language to have two names:

    name attribute -- the English name of the language
        should go into the menu (yes, yes, at the moment the implementation
        of lc is a bit different, but this will be changed)
    name child of the language element -- the language's name in that language
        this name should be displayed using the font and acm that are
        attributes of the language element.

Then I wanted every item (list/item) of the top-level list (the immediate child
of the language element) to be in one language only.  For example, for French
(BTW, I do not understand why french.xml is valid: it uses ISO-8859-1, which, I
believe, is not a subset of UTF-8), we have:

 0: <language name="French" font="LatArCyrHeb-14" acm="iso-8859-1">
 1:     <name>Français</name>
 2:     <hint>Vous avez choisi le Français.  Tapez sur Entrée pour continuer</hint>
 3:     <list>
 4:         <name>Choisissez une variété</name>
 5: 
 6:         <item locale="fr_CH" acm="iso-8859-1" font="LatArCyrHeb-14" keymap="fr" msgcat="/etc/messages.fr.trm">
 7:             <name>Français (Suisse)</name>
 8:         </item>
 9:         <list>
10:             <name>Français (France)</name>
11: 
12:             <item locale="fr_FR.ISO8859-1" acm="iso-8859-1" font="LatArCyrHeb-14" keymap="fr" msgcat="/etc/messages.fr.trm">
13:                 <name>Français (France) -- ISO-8859-1 (Latin-1)</name>
14:             </item>
15:             <item locale="fr_FR.UTF-8" acm="utf-8" font="LatArCyrHeb-14" keymap="fr-utf" msgcat="/etc/messages.fr.utf.trm">
16:                 <name>Français (France) -- UTF-8</name>
17:             </item>
18:         </list>
19:         <item locale="fr_CA" acm="iso-8859-1" font="LatArCyrHeb-14" keymap="fr" msgcat="/etc/messages.fr.trm">
20:             <name>Français (Canada)</name>
21:         </item>
22:     </list>
23: </language>

Line 0 has the font and acm attributes to be used for all textual information in
descendant elements (name in 1, hint in 2, name in 4, etc.).  The font, acm,
locale, keymap, and msgcat attributes of an item are attributes of the *result*.
That means that while the user is running lc (or a similar program), everything
she is shown is displayed using the font and acm attributes of the enclosing
language element (line 0 in our case).
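
To make the distinction concrete, here is a rough Python sketch of the lookup,
not the actual lc code.  The function names display_settings and
result_settings are made up for illustration, and the sample is a shortened
copy of the French example without the line numbers.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the French example above (illustrative only).
SAMPLE = """\
<language name="French" font="LatArCyrHeb-14" acm="iso-8859-1">
    <name>Français</name>
    <list>
        <name>Choisissez une variété</name>
        <item locale="fr_FR.UTF-8" acm="utf-8" font="LatArCyrHeb-14" keymap="fr-utf" msgcat="/etc/messages.fr.utf.trm">
            <name>Français (France) -- UTF-8</name>
        </item>
    </list>
</language>
"""


def display_settings(language):
    """Yield (text, font, acm) for every name/hint below a language element.

    Hypothetical helper: the font and acm always come from the language
    element itself, never from an item, because they describe how the menu
    is shown while the user is still choosing.
    """
    font = language.get("font")
    acm = language.get("acm")
    for elem in language.iter():
        if elem.tag in ("name", "hint"):
            yield elem.text, font, acm


def result_settings(item):
    """Return the attributes of an item: the settings applied *after* selection."""
    return dict(item.attrib)


language = ET.fromstring(SAMPLE)
for text, font, acm in display_settings(language):
    print(font, acm, text)
print(result_settings(language.find("./list/item")))
```

Note that the name inside the UTF-8 item is still displayed with the
iso-8859-1 acm of the language element; the item's own acm="utf-8" only
applies once that item has been chosen.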

Using UTF-8 means one of two possibilities:
    -- the default charset for dbootstrap becomes UTF-8
    -- somewhere UTF-8 should be converted back to what is suitable for
       this language.

I do not like the first possibility: I am not sure s-lang works fine with
UTF-8.  (Yes, I have heard that there is a patch for s-lang to make it work
with UTF-8, but it is not in the main line yet.)  So it looks like the pl.py
program should take the source and convert it according to the specified
charset.  Does that sound OK?  If yes, I will implement it that way...
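
A minimal sketch of that conversion step, under my own assumptions: the source
file is UTF-8, the target charset is whatever the acm attribute of the language
element says, and only name and hint texts need converting.  The function name
convert_texts and the Russian sample are hypothetical; this is not the existing
pl.py code.

```python
import xml.etree.ElementTree as ET


def convert_texts(xml_bytes):
    """Parse a UTF-8 language file and encode its texts in the acm charset.

    Returns (charset, list_of_encoded_byte_strings).  Raises
    UnicodeEncodeError if a text cannot be represented in the target
    charset, which seems a reasonable way to catch a wrong acm value.
    """
    root = ET.fromstring(xml_bytes)      # bytes without a declaration are read as UTF-8
    charset = root.get("acm")            # e.g. "iso-8859-1" or "koi8-r"
    encoded = []
    for elem in root.iter():
        if elem.tag in ("name", "hint") and elem.text:
            encoded.append(elem.text.encode(charset))
    return charset, encoded


# Illustrative input: a Russian entry stored as UTF-8, to be shown in KOI8-R.
source = (
    '<language name="Russian" font="LatArCyrHeb-14" acm="koi8-r">'
    "<name>Русский</name>"
    "</language>"
).encode("utf-8")

charset, texts = convert_texts(source)
print(charset, texts)
```

With this approach the XML sources can all stay in UTF-8 for validation, and
dbootstrap itself never has to deal with UTF-8.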

--
Mike

