
Re: support for multilingual Packages files?



On Mon, Jul 30, 2001 at 11:17:41AM +0200, Radovan Garabik wrote:
> 1) if foreign language descriptions are permitted in Packages
> Take for example lyskom-tty-client, which is interesting only
> for swedish speaking people, why should it not have description
> in Swedish then?
> 
> There was not clear consensus about the whole issue, only about one
> thing: Packages must have a description in English, so that administrators
> know what they are about. It was undecided if the description can have
> also other language part (maybe even more detailed than the english one).

For now you can put a full translation of the English description at the
bottom of the description.

We are starting to translate all descriptions now.

Once we have a translation of a package's description, you can remove
the translation from the description itself.

> 2) Localized fields in debian/control, such as Description-fr etc.
> This is a different issue than 1), and has not been much discussed.
> Probably the same way as debconf follows could be adopted.
> Notice that even in English, there is an occasional need for
> diacritics.

IMHO the packages, the control file and the Packages files get too big
with this. We need a better system!

We now translate the descriptions and save the translations in a
database (it is not a real database, but that is not your problem).
From this db and a Packages file we build new translated Packages
files daily.

See i18n@lists.debian.org, auric.debian.org/~grisu/ddts,
http://www.laespiral.org/proyectos/debian-es, and other sites.

> 3) Most controversial part: what encodings are permitted in Packages
> (and related files, such as debian/changelog...)

debian/changelog is not the problem. This file will never be translated.

> There are these main possibilities:
>   a) mandate ASCII only.
>      advantages: 
>      - it works as intended everywhere, no matter what your
>        locale is (since ASCII is an intersection of all other locales)
>      - is easy to maintain
>      disadvantages:
>      - if 1) is accepted, then ASCII is clearly insufficient
>      - there is no way you can have proper maintainers' names
>        if the encoding is ASCII. Some languages (Japanese, German) have 
>        standardized way of transcribing names into ASCII, some
>        others (Russian, Slovak, Hungarian) have not. Some other 
>        non-latin-script-based languages (Serbian, Chinese) have standardized
>        way of transcribing names into latin script with diacritics.      [1]
>      - for 2), one can assume there will be one original Packages file,
>        and a groups of people will be translating it into a target language.
>        Once the names are incorrect (ASCII only) in original Packages,
>        there is no easy way to put them correct into translated Packages,
>        even if the required diacritics or script is present in target 
>        language.

We do this. We make a translated Packages file, but we only change
the 'Description:' field, not the other fields.

But you can translate 'names' the same way. Make a table with the
correct spelling and translate these too.
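
Such a table could be as simple as the following Python sketch. The name
in the table and the function name are made up for illustration; the
ASCII form is what stands in the normal Packages file:

```python
# Hypothetical table: ASCII spelling from the Packages file -> correct spelling.
NAME_TABLE = {
    "Jozef Novak": "Jozef Novák",
}


def localize_maintainer(line, table=NAME_TABLE):
    """Rewrite a 'Maintainer: Name <email>' line using the name table.

    Lines that are not Maintainer fields, and names that are not in the
    table, are returned unchanged.
    """
    if not line.startswith("Maintainer:"):
        return line
    value = line.split(":", 1)[1].strip()
    # Split off the e-mail part so only the name is looked up.
    name, sep, email = value.partition(" <")
    return "Maintainer: " + table.get(name, name) + sep + email
```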

>   c) leave the situation as it is today: no encoding is specified,
>      maintainers who feel the need to put non-ASCII characters there
>      just put them there in an encoding which seems natural to them.
>      advantages: 
>      - if the administrator has by a chance the same locale charset as
>        the maintainer, he will see the text as intended
>      - easy work for the maintainers
>      disadvantages:
>      - it is a mess. In order to see the name (or description) properly,
>        you have to guess the encoding and set up your console for it
>      - nobody in the world can see Packages properly, since there is
>        no common console setting.
>    d) this was just briefly suggested and not discussed:
>       use mime headers to specify charset
>       Ugh... please, we really do not want to put this into dselect, do we?       
>       (and besides, once that recoding is put into dselect, we may as well
>        stick to utf-8 and save some additional effort)

I am now looking only at the Packages file. In the Packages file we have
only two important fields that need translation/encoding:
 - Maintainer
 - Description

I propose: 
 - use only ASCII in the (normal) Packages file
 - make a database with translations of the Maintainer and
   Description fields (the Description is imho more important). 

With the database you can: 
 - make Packages-XX files (only one language each)
 - make a big Packages file with all translations (with Description-XX
   fields)
 - change the encoding 
   (we can now use one encoding per translation (latin1 or others), and
   if we ever switch to utf-8, we convert the translations with a script)
One Packages-XX file is already working. If you add 
	deb http://gluck.debian.org/~grisu/ddts/aptable de/sid main
to sources.list you get (some) translated descriptions. 


Gruss
Grisu
-- 
Michael Bramer  -  a Debian Linux Developer http://www.debian.org
PGP: finger grisu@db.debian.org  -- Linux Sysadmin   -- Use Debian Linux
"Verletzungen der Kausalität ermöglichen Voodoo und Windows." 
                                         -- Lutz  Donnerhacke in dasr


