
Re: charsets in debian/control



Daniel Burrows <dburrows@debian.org> writes:

> On Sunday 05 December 2004 03:32 pm, Jose Carlos Garcia Sogo wrote:
>> > Would Peter permit me a mild dissent?  I prefer Latin-1.  Reason: I can
>> > recognize and distinguish Latin-1 characters, even when I do not always
>> > understand the words they spell.  Recognizing and distinguishing the
>> > characters is important to me.  And not just to me.  Imagine the dismay
>> > of a Korean user trying to read Arabic script in a control file.
>>
>>  But the only field in UTF8 should be Maintainer, and that field should
>> have (IMHO) also a roman transliteration of the name, if you don't use a
>> latin charset (Greek, Arabic, Japanese, Chinese...)
>
>   Well, when aptitude gets UTF8 support, it'll decode all the control fields 
> that are mainly meant for human consumption: that means at least Description 
> in addition to the Maintainer field, and maybe also Section.

I think the only field in UTF-8 in the main (English) Packages file
should be the Maintainer field. There might be some discussion about
allowing the package's name in the description to be native too, but I
wouldn't like that.

Now, for translated Packages files, like a Chinese one, only the
description should change.

>   I don't see any reason to limit ourselves in the long term by sticking to 
> Latin1 (or ASCII) just because none of us can read all of the languages that 
> are available in the extended UTF8 namespace.  If we want people to stick to 
> certain subsets of UTF8, that should be determined in Policy, not the 
> packaging software.

The software has to be able to work with translated Packages files. It
would be quite unacceptable for aptitude to show gibberish in the
description for a Chinese user with a translated Packages file. So
there really should be no limit there.

But limiting each Packages file to the subset of characters
recognisable in that language sounds like a good idea. Chinese users
probably don't want Japanese in their Packages file and vice versa.

Seeing that English is the common language in Debian, I would also say
that an English description is a must.

>   If you want a practical concern (aside from, say, a general suspicion of 
> building policy into software tools), consider these cases:
>
>   -> Someone wants to translate the Description fields of all packages in 
> Debian into Chinese or Arabic.  What will they do if the package tools only 
> support Latin-1?
>
>   -> Someone wants to use the Debian packaging tools to create a new 
> distribution for use in China.  Again, what will they do if the package tools 
> only support Latin-1?
>
>   Daniel

You are absolutely right, the tools should cope with everything, with
the possible exception of warning about or rejecting policy violations
on upload.
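
To make the upload-time check concrete, here is a rough sketch of what
I have in mind for the main (English) Packages file: flag any field
other than Maintainer that contains non-ASCII text. The function name
and the allowed-field set are made up for illustration; nothing like
this exists in dpkg or the archive tools as far as I know, and the
actual rule would have to come from Policy.

```python
from typing import List

# Hypothetical rule: only these fields may carry non-ASCII (UTF-8) text
# in the main English Packages file. This set is an assumption.
UTF8_ALLOWED_FIELDS = {"Maintainer"}


def find_charset_violations(stanza: str) -> List[str]:
    """Return the names of fields that hold non-ASCII text but are not allowed to."""
    violations: List[str] = []
    field = None
    for line in stanza.splitlines():
        if line[:1] in (" ", "\t"):
            # Continuation line: it belongs to the most recent field.
            pass
        elif ":" in line:
            # New "Field: value" line.
            field = line.split(":", 1)[0]
        else:
            # Blank or malformed line: nothing to check.
            continue
        if (
            field
            and not line.isascii()
            and field not in UTF8_ALLOWED_FIELDS
            and field not in violations
        ):
            violations.append(field)
    return violations
```

A tool would only warn or reject on a non-empty result; displaying the
stanza must still work whatever it contains.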

MfG
        Goswin


