Re: Bug#99933: second attempt at more comprehensive unicode policy

To: debian-policy@lists.debian.org
Cc: 99933-maintonly@bugs.debian.org
Subject: Re: Bug#99933: second attempt at more comprehensive unicode policy
From: Colin Walters <walters@debian.org>
Date: 17 Jan 2003 20:27:40 -0500
Message-id: <1042853259.5819.27.camel@space-ghost>
In-reply-to: <87u1g74cj3.fsf@glaurung.green-gryphon.com>
References: <1041476827.25298.32.camel@space-ghost> <20030102181206.GA24191@atlas15.dnp.fmph.uniba.sk> <1041533855.15063.19.camel@space-ghost> <1041546314.22038.9.camel@space-ghost> <87u1g74cj3.fsf@glaurung.green-gryphon.com>

On Fri, 2003-01-17 at 17:49, Manoj Srivastava wrote:
> Hi,
> 
>         Sorry for the late entry into the discussion. I am
>  comfortable with making the changelog UTF-8 only, but file names in
>  pure UTF-8 perhaps is premature. (मनोज्.conf, anyone?). 

Please see my second proposal (the third in #99933), which drops the
recommendation for programs to create and read filenames in UTF-8.

Of course, this doesn't make the problem go away; we will still have
some programs creating filenames in UTF-8, and others in the locale
charset.
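
To illustrate the mismatch, here is a minimal sketch in Python.  The
filename and the choice of ISO-8859-1 as the "locale charset" are
assumptions made up for the example, not anything taken from the
proposal.  It shows that the same name becomes different bytes on disk
depending on which encoding the creating program used, and what a
reader expecting the other encoding sees.

```python
# Hypothetical filename with one non-ASCII character (U+00E9, "e" with acute).
name = "caf\u00e9.conf"

# Bytes written by a program that always uses UTF-8.
utf8_bytes = name.encode("utf-8")
# Bytes written by a program that uses an (assumed) ISO-8859-1 locale charset.
latin1_bytes = name.encode("iso-8859-1")

print(utf8_bytes)    # b'caf\xc3\xa9.conf'
print(latin1_bytes)  # b'caf\xe9.conf'

# A Latin-1 program listing the UTF-8 name decodes it into two wrong characters.
misread = utf8_bytes.decode("iso-8859-1")

# A strict UTF-8 program cannot decode the Latin-1 name at all.
try:
    latin1_bytes.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False

print(repr(misread), decodable)
```

Neither program is wrong by its own rules; they just disagree, which is
why the problem persists as long as both conventions are in use.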

> Indeed,
>  until we have a wider deployment of a font that has a decent
>  coverage of UTF-8 glyphs (how many of y'all can read  ሰማይ አይታረስ ንጉሥ
>  አይከሰስ። ?), 

I admittedly can't; Evolution will have somewhat poor support for
non-Latin Unicode until it's ported to GNOME 2.  But note that I think
UTF-8 will work quite well for users of Latin and East Asian languages,
because we do have good, widely available free fonts for those.

> perhaps we should stick to pure ascii file names, if we
>  must have policy take a stance about file names at all?

First of all, I strongly believe policy should have a stance about file
names.  People will want packages to include files whose names contain
non-ASCII characters.  There are something like 15-20 in Debian
now, and that number is probably small because of this encoding mess.
And if packages are going to do that, we need a defined encoding for it.
I think it is pretty obvious that UTF-8 is the only sane choice.
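
As a rough sketch of what a defined encoding would make checkable, the
Python below classifies a raw filename as plain ASCII, valid UTF-8, or
neither.  The function name classify_filename is hypothetical and this
is not an existing Debian tool; it only illustrates that, once UTF-8 is
the stated encoding, a non-conforming name can be detected mechanically.

```python
def classify_filename(raw: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'unknown' for a raw filename (hypothetical helper)."""
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Strict decoding rejects byte sequences that are not well-formed UTF-8.
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"


# Example inputs; the names are made up for illustration.
print(classify_filename(b"README"))
print(classify_filename("मनोज्.conf".encode("utf-8")))
print(classify_filename(b"caf\xe9.conf"))  # ISO-8859-1 bytes, not valid UTF-8
```

Note that the check cannot say which legacy charset an "unknown" name
is in; it can only say the name is not UTF-8.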

Second, people will want to create files with non-ASCII names on their
own computers; it would be bad if policy specified one charset while
users were creating files in another.  But we can leave this issue aside
for now.

> 	That is not saying anything about programs that deal with
>  file names having widechar and encoding support, etc. I feel, as
>  integrators, we must follow, rather than lead, the majority of the
>  producers of the software components we integrate. 

I understand your position.  In my latest proposal, policy is silent on
the encoding for file names to be used by programs in general.

We can fill that in later (and I think we will be filling it in with
UTF-8), but I'd really like to set up the Unicode infrastructure in
policy now.  This will also have the effect of letting people know our
intentions now, and hopefully spark a few upstream authors into adding
Unicode support.

Follow-Ups:
- Re: Bug#99933: second attempt at more comprehensive unicode policy
  - From: Manoj Srivastava <srivasta@debian.org>

References:
- Bug#174982: [PROPOSAL]: Debian changelogs should be UTF-8 encoded
  - From: Colin Walters <walters@debian.org>
- Bug#174982: [PROPOSAL]: Debian changelogs should be UTF-8 encoded
  - From: Radovan Garabik <garabik@melkor.dnp.fmph.uniba.sk>
- Re: Bug#174982: [PROPOSAL]: Debian changelogs should be UTF-8 encoded
  - From: Colin Walters <walters@debian.org>
- Bug#99933: second attempt at more comprehensive unicode policy
  - From: Colin Walters <walters@debian.org>
- Re: Bug#99933: second attempt at more comprehensive unicode policy
  - From: Manoj Srivastava <srivasta@debian.org>
