Bug#701081: debian-policy: mandate an encoding for filenames in binary packages
On Sat, Mar 02, 2013 at 04:38:49PM +0100, Guillem Jover wrote:
>
> I'd second something like this, but I'd first like us to consider if
> we really want any non-ASCII characters in filenames. Currently on sid
> there does not appear to be many such filenames (64 from my check, if
> that's not bogus):
>
> $ LC_ALL=C zgrep '[^[:print:]]' \
> ftp.debian.org_debian_dists_sid_*_Contents-amd64.gz | wc -l
Hi Guillem and everybody,
I had a closer look at these files.
* There are dictionaries where the filename is the native name of the
language, like català, español, bokmål, etc. In all these cases the
characters are valid Unicode. I think that it would be fair to allow
such cases.
* There are names that look rather arbitrary and replaceable
with ASCII alternatives if needed. For instance in python-pyramid,
usr/lib/python2.6/dist-packages/pyramid/tests/fixtures/static/héhé.html
* There are CA certificates with names like Certinomis_-_Autorité_Racine.crt.
Since I do not know how these certificates work, I do not know if they
can be renamed.
* There is a file that needs to be in non-ASCII Unicode to fit its purpose:
usr/share/doc/console-tools/examples/♪♬ in console-tools. The package
also distributes a file called README.strange-name in the same directory.
* There are some more dubious names like 6Sze¶æ_Jab³ek.png in lletters-media,
or Miroir_Sphérique in optgeo. However, they do not cause much inconvenience
with a Unicode locale.
* The pitivi package gives entries with no obvious non-ASCII characters, like
usr/share/gnome/help/pitivi/C/figures/codecscontainers.jpg.
I think that we should at least strongly recommend that if a name looks ASCII
then it should be ASCII.
* Lastly, there seems to be only a single package that ships non-Unicode filenames,
non-free/ooohg, with for instance 13_Afr d<U+0082>col.gif.
Requiring that all file and directory names be encoded in Unicode, and
preferably in ASCII, would therefore make only one package RC-buggy. Requiring
all-ASCII would also be possible with a bit more work, but I am not sure that it
would be worth the effort, as most of the current examples above do not require
specialised fonts. Altogether, there seems to be good self-discipline.
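As a rough illustration of the distinction drawn above, here is a minimal
Python sketch that sorts raw file names into three groups: plain ASCII, valid
UTF-8 with non-ASCII characters, and not valid UTF-8. The function name
classify_name is hypothetical, and the byte 0x82 used for the ooohg example is
an assumption about what the <U+0082> placeholder stands for, not something
checked against the package.

```python
def classify_name(raw: bytes) -> str:
    """Classify a raw file name as 'ascii', 'utf-8' or 'invalid'."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not decodable as UTF-8 at all (e.g. a legacy 8-bit encoding).
        return "invalid"
    # Every byte below 0x80 means the name is pure ASCII.
    if all(byte < 0x80 for byte in raw):
        return "ascii"
    return "utf-8"


if __name__ == "__main__":
    samples = [
        b"README.strange-name",
        "català".encode("utf-8"),
        b"13_Afr d\x82col.gif",  # assumed raw byte, for illustration only
    ]
    for raw in samples:
        print(classify_name(raw), raw)
```

Under a rule like the one proposed, only names in the "invalid" group would be
release-critical; the "utf-8" group would merely be discouraged.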
However, if there are ways to test this automatically, maybe we should
consider requesting that what is displayed as ASCII should be ASCII.
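One possible heuristic for such a test, sketched in Python: flag a name when
it contains non-ASCII characters and all of them are invisible or
space-like (Unicode general categories Cf, Cc and Zs). The function name and
the choice of categories are my assumptions, and the soft hyphen in the
example is hypothetical; I have not verified which character the pitivi file
actually contains. Look-alike letters from other scripts are not covered.

```python
import unicodedata

# General categories treated as "not visibly different from ASCII":
# Cf = format (soft hyphen, zero-width space), Cc = control,
# Zs = space separator (no-break space). This set is an assumption.
INVISIBLE_CATEGORIES = {"Cf", "Cc", "Zs"}


def looks_ascii_but_is_not(name: str) -> bool:
    """Return True if every non-ASCII character in name is invisible."""
    non_ascii = [ch for ch in name if ord(ch) > 0x7F]
    if not non_ascii:
        return False  # genuinely ASCII, nothing to report
    return all(
        unicodedata.category(ch) in INVISIBLE_CATEGORIES for ch in non_ascii
    )


if __name__ == "__main__":
    # Hypothetical name with a soft hyphen (U+00AD) between the two words.
    print(looks_ascii_but_is_not("codecs\u00adcontainers.jpg"))
    print(looks_ascii_but_is_not("héhé.html"))
```

A check of this kind could run over the Contents files and would leave names
like català or héhé.html alone, since their non-ASCII characters are visible.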
Have a nice day,
--
Charles Plessy
Tsurumi, Kanagawa, Japan