Bug#99933: second attempt at more comprehensive unicode policy

To: 99933@bugs.debian.org
Subject: Bug#99933: second attempt at more comprehensive unicode policy
From: Colin Walters <walters@debian.org>
Date: 06 Jan 2003 01:45:15 -0500
Message-id: <1041835514.15912.4.camel@space-ghost>
Reply-to: Colin Walters <walters@debian.org>, 99933@bugs.debian.org
In-reply-to: <20030106030032.GA1754@night>
References: <1041533855.15063.19.camel@space-ghost> <1041546314.22038.9.camel@space-ghost> <20030103231158.GB8502@tatonka.pfalz.de> <1041648625.21808.28.camel@space-ghost> <87isx4q588.fsf@orcus.priv.at> <1041700241.32717.35.camel@space-ghost> <20030105142317.GB1699@zobe.linuxfr.org> <1041786548.9879.8.camel@space-ghost> <20030105201303.GA23475@zobe.linuxfr.org> <1041819155.14620.9.camel@space-ghost> <20030106030032.GA1754@night>

On Sun, 2003-01-05 at 22:00, Richard Braakman wrote:

> Hmm.  Remember the far more common case of a program that takes a
> filename on the command line and then tries to open it.  The user
> would have typed it in the local encoding, so it needs conversion.
> On the other hand, if the program was invoked by another program
> then the filename is likely to already be in UTF-8.
> 
> I guess this conversion should be done by the user's shell, and all
> filename arguments on the command line should be encoded in UTF-8.
> Umm, except that the shell doesn't know which arguments are filenames.
> How should this be done?

Just to answer this a bit more directly: no, I think the shell should do
no conversion.  It should just pass its input on to programs in the
encoding in which it received it.

So for people using legacy encodings, yes, programs will receive
filenames in those encodings, not UTF-8.  Hopefully programs will
handle that, convert the filenames to UTF-8 internally, and write them
out as UTF-8.  If they don't, then they don't (unless we fix the program).
There's not much we can do about it until users are switched to UTF-8
locales and terminals.
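
To make the "convert them to UTF-8 internally" step concrete, here is a
minimal sketch in Python.  It is only an illustration, not anything from
the policy proposal: the function name filename_to_utf8 and its
legacy_encoding parameter are made up for this example, and the "is it
already UTF-8?" check is a heuristic I am assuming, since some legacy
byte sequences also happen to be valid UTF-8.

```python
def filename_to_utf8(raw: bytes, legacy_encoding: str) -> bytes:
    """Return the filename bytes re-encoded as UTF-8.

    Hypothetical helper for illustration only.  `raw` is the argument
    exactly as the shell passed it; `legacy_encoding` is the user's
    locale encoding (an assumption supplied by the caller).
    """
    try:
        # Heuristic: if the bytes already decode as strict UTF-8,
        # assume another program handed us UTF-8 and leave them alone.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Otherwise treat them as legacy-encoded text typed by the user
        # and convert.  This raises if the bytes are not valid in
        # `legacy_encoding` either, or if the encoding name is unknown.
        return raw.decode(legacy_encoding).encode("utf-8")


if __name__ == "__main__":
    # "café.txt" typed in an ISO-8859-1 terminal arrives with byte 0xE9.
    print(filename_to_utf8(b"caf\xe9.txt", "iso-8859-1"))
```

In a real program the encoding would more likely come from the locale,
for example via locale.getpreferredencoding(False), rather than being
hard-coded as it is here.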
