
Re: How much utf-8 do we accept in control files?



Siggy Brentrup <bsb@debian.org> writes:

> On Tue, Apr 06, 2004 at 12:21:57PM +0100, Roger Leigh wrote:
>> Siggy Brentrup <bsb@debian.org> writes:
>> 
>> > On Mon, Apr 05, 2004 at 10:43:59PM +0200, Andreas Barth wrote:
>> >
>> >> Furthermore, some maintainers already use UTF-8 for their name (that
>> >> is, in debian/control and in the changelog).
>> >
>> > While understandable from the maintainer's point of view, luckily to
>> > my knowledge no (e.g.) asian maintainer has done it yet.  If we allow
>> > non ascii in control fields, I see no valid argument to prohibit any
>> > character set.
>> 
>> The whole point of UCS/UTF-8 is to eliminate all other character sets
>> by providing a *universal* character set, so allowing *any* charset
>> would not provide a usable system, since you could not know which
>> charset a given control file/changelog would be encoded with.
>> Standardising on UTF-8 is the only way to go, other than sticking with
>> ASCII/ISO-8859-1 (which are obviously inadequate).
>
> Fine, that's the encoding side but how do you expect e.g. dpkg to present
> UTF-8 encoded control fields to the user?

dpkg wouldn't need to do anything.

> If we allow (invented name) »Jörg Müller« to see his name correctly
> displayed we must also allow Hebrew, Cyrillic, Arabic, Asian
> etc. names to be correctly displayed.

Agreed.  This won't require special handling, since if you're using a
UTF-8 aware console or terminal emulator then this will "just work".
For me, this is a framebuffer console, gnome-terminal or uxterm.  All
these can display UCS characters if the font has glyphs for all the
corresponding code points.
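
To illustrate why a tool like dpkg would not need special handling, here
is a minimal Python sketch.  It is not dpkg code; the function names and
the sample field are made up for illustration.  The point is that the
tool can hand the bytes through untouched and leave decoding to the
terminal, and that strict UTF-8 decoding rejects a name that was
actually written in ISO-8859-1, so a UTF-8-only policy can be checked.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def passthrough(field: bytes) -> bytes:
    """Hypothetical stand-in for a tool that prints a control field.

    It does no conversion at all; a UTF-8 aware terminal does the rest.
    """
    return field


name = "Jörg Müller"                      # the invented name from this thread
utf8_field = name.encode("utf-8")         # what a UTF-8 policy would require
latin1_field = name.encode("iso-8859-1")  # same name in a legacy charset

for label, raw in (("utf-8", utf8_field), ("iso-8859-1", latin1_field)):
    print(label, is_valid_utf8(passthrough(raw)))
```

The check only shows that the bytes are well-formed UTF-8; whether the
name is then displayed correctly still depends on the terminal and font.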

I believe the issues are the availability of working UTF-8-aware
terminals (not a problem now, apart from the framebuffer console only
supporting 512 glyphs, IIRC), and the availability of decent fonts with
support for big chunks of the BMP.  The font issue is now the main
problem, IMHO, but surely existing fonts can be recoded?

Some fonts currently have very poor coverage.  For example, the
Bitstream Vera family doesn't even provide a HYPHEN!  This makes most
manpages quite hard to read.
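
As a rough sketch of the coverage problem, the Python below lists which
code points in a string a font could not display.  The coverage set here
is a made-up stand-in (printable ASCII only), not the real glyph table of
any font, and I am assuming the missing glyph is U+2010 HYPHEN, which is
a different code point from the ASCII HYPHEN-MINUS (U+002D).

```python
import unicodedata


def describe(ch: str) -> str:
    """Format a character as 'U+XXXX NAME' using the Unicode database."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


def in_bmp(ch: str) -> bool:
    """True if the character lies in the Basic Multilingual Plane."""
    return ord(ch) <= 0xFFFF


def missing_code_points(text: str, coverage: set[int]) -> list[str]:
    """List each distinct character in text that the coverage set lacks."""
    seen: list[str] = []
    for ch in text:
        if ord(ch) not in coverage and describe(ch) not in seen:
            seen.append(describe(ch))
    return seen


# Hypothetical font that only covers printable ASCII.
ascii_only_font = set(range(0x20, 0x7F))

sample = "long\u2010running option -v"
print(missing_code_points(sample, ascii_only_font))
```

A real check would read the coverage from the font itself; the set above
only stands in for that.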


Regards,
Roger

-- 
Roger Leigh

                Printing on GNU/Linux?  http://gimp-print.sourceforge.net/
                GPG Public Key: 0x25BFB848.  Please sign and encrypt your mail.


