
Re: How much utf-8 do we accept in control files?



On Tue, Apr 06, 2004 at 12:21:57PM +0100, Roger Leigh wrote:
> Siggy Brentrup <bsb@debian.org> writes:
> 
> > On Mon, Apr 05, 2004 at 10:43:59PM +0200, Andreas Barth wrote:
> >
> >> Furthermore, some maintainers already use UTF-8 for their name (this
> >> means in debian/control and in the changelog).
> >
> > While understandable from the maintainer's point of view, luckily, to
> > my knowledge, no (e.g.) Asian maintainer has done it yet.  If we allow
> > non-ASCII in control fields, I see no valid argument to prohibit any
> > character set.
> 
> The whole point of UCS/UTF-8 is to eliminate all other character sets
> by providing a *universal* character set, so allowing *any* charset
> would not provide a usable system, since you could not know which
> charset a given control file/changelog would be encoded with.
> Standardising on UTF-8 is the only way to go, other than sticking with
> ASCII/ISO-8859-1 (which are obviously inadequate).

Fine, that's the encoding side, but how do you expect e.g. dpkg to present
UTF-8 encoded control fields to the user?  If we allow (invented name)
»Jörg Müller« to see his name correctly displayed, we must also allow
Hebrew, Cyrillic, Arabic, Asian etc. names to be correctly displayed.
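
To make the encoding point quoted above concrete, here is a rough sketch in
Python (not something dpkg does, as far as I know). It shows why a reader
that does not know the charset cannot display a name reliably, and how a
strict UTF-8 check would reject other encodings. The helper name
is_valid_utf8 and the sample values are made up for illustration.

```python
# Invented Maintainer name, as in the text above; not taken from a real package.
name = "Jörg Müller"

# The same name as raw bytes under two different charsets.
utf8_bytes = name.encode("utf-8")
latin1_bytes = name.encode("iso-8859-1")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes as strict UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")  # decoding is strict by default
        return True
    except UnicodeDecodeError:
        return False


# Reading UTF-8 bytes with the wrong charset succeeds silently,
# but produces a different (garbled) string.
misread = utf8_bytes.decode("iso-8859-1")

print(is_valid_utf8(utf8_bytes))         # UTF-8 input passes the check
print(is_valid_utf8(latin1_bytes))       # ISO-8859-1 input is rejected
print(is_valid_utf8(b"Siggy Brentrup"))  # plain ASCII is a subset of UTF-8
print(misread == name)                   # wrong charset changes the text
```

Note that this only settles which characters the bytes stand for; whether
the terminal has the fonts and locale to show them is a separate problem.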

Regards
 . Siggy

ps: maybe I'd argue otherwise if my name contained »Umlauts« :)
