
Bug#99933: second attempt at more comprehensive unicode policy


On Thu, Jan 02, 2003 at 05:25:15PM -0500, Colin Walters wrote:
> +	    Programs should expect filenames in general (whether from
> +	    a Debian package or created by the user) to be encoded
> +	    with UTF-8, although it is recommended for programs to try
> +	    gracefully falling back to the current locale's encoding
> +	    if this fails.  Programs included in Debian packages
> +	    should, when creating new files, encode their names in
> +	    UTF-8 by default.
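
To make the proposed fallback concrete, here is a minimal sketch of
what "try UTF-8, then fall back to the locale's encoding" could look
like.  The function name decode_filename and its fallback_encoding
parameter are made up for illustration; nothing in the proposal
prescribes this particular implementation.

```python
import locale
from typing import Optional


def decode_filename(raw: bytes, fallback_encoding: Optional[str] = None) -> str:
    """Decode a raw file name: strict UTF-8 first, then a fallback encoding.

    Hypothetical helper; the fallback defaults to the current locale's
    preferred encoding, which is an assumption about what "the current
    locale's encoding" would mean in practice.
    """
    try:
        # Strict decoding raises on byte sequences that are not valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        encoding = fallback_encoding or locale.getpreferredencoding(False)
        # errors="replace" keeps the name displayable even if the
        # fallback encoding cannot represent every byte.
        return raw.decode(encoding, errors="replace")
```

Note that this is only a heuristic: some legacy-encoded names are
also valid UTF-8 (the ISO-8859-1 bytes for "Ã©" are exactly the
UTF-8 bytes for "é"), so such a name would be silently read the
wrong way.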

Is this meant to apply to programs like "ls", "bash", "touch", and
"emacs"?  I imagine that the transition period could be a hard time
for users who (like me) use non-ASCII characters in file names.

As I see it, the current (broken?) behaviour is to use the user's
locale setting (LC_CTYPE) to encode file names.  During the
transition period, non-ASCII file names will have two possible
representations in the file system (LC_CTYPE vs. UTF-8).  I think
we should clarify the following points before introducing the above
into policy:

    1) Should interpretation of existing files' names as UTF-8
       be implemented before the encoding of newly created files'
       names is switched?

    2) How should already existing files with non-ASCII names
       be converted?
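
To illustrate point 2, a conversion could be sketched roughly as
follows.  This is only a sketch under assumptions: the names
utf8_name and plan_renames are hypothetical, the legacy encoding has
to be supplied by the user, and names that already decode as UTF-8
are left untouched (which is the same heuristic the fallback rule
relies on, with the same risk of misdetection).

```python
import os
from typing import List, Tuple


def utf8_name(raw: bytes, legacy_encoding: str) -> bytes:
    """Return the UTF-8 form of a raw file name.

    Names that are already valid UTF-8 (including pure ASCII) are
    returned unchanged; anything else is assumed to be in
    legacy_encoding and is re-encoded.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(legacy_encoding).encode("utf-8")


def plan_renames(directory: bytes, legacy_encoding: str) -> List[Tuple[bytes, bytes]]:
    """List (old, new) name pairs for one directory without renaming anything."""
    plan = []
    # Passing a bytes path makes os.listdir return raw bytes names.
    for old in os.listdir(directory):
        new = utf8_name(old, legacy_encoding)
        if new != old:
            plan.append((old, new))
    return plan
```

For example, the ISO-8859-1 name b"na\xefve.txt" would become
b"na\xc3\xafve.txt".  Whether the renames are then carried out by a
tool, by the user, or not at all is exactly the open question.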

What do you think?
