Bug#701081: debian-policy: mandate an encoding for filenames in binary packages
On Sun, 2013-02-24 at 11:54:01 +0900, Charles Plessy wrote:
> This could be done by an addition like the following, after section 10.9
> (Permissions and owners). The wording is still a bit clumsy also, I am not
> sure if "installed" includes files created by maintainer scripts (which would
> be the intent here). I named the section "File names", and not "File name
> character set", in case we would add other restrictions (such as length) in the
To make clear what "installed" covers, it might make sense to say
something along the lines of: «the files that have been created after
the binary package is "Installed"».
> + <sec id="filenames">
> + <heading>File names</heading>
> + <p>
> + The name of the files installed by binary packages must be encoded in
> + UTF-8 and should be restricted to ASCII unless there is a justified
> + need for using other characters.
> + </p>
> + </sec>
> Some packages do not comply with the above. Given the pace of the releases
> of the Policy, I am not sure that it is worth having first a should and then
> a must, if you or somebody else would have the time to tackle the issue
> after the Wheezy release.
I'd second something like this, but I'd first like us to consider
whether we really want any non-ASCII characters in filenames. Currently
on sid there do not appear to be many such filenames (64 by my check,
if that's not bogus):
$ LC_ALL=C zgrep '[^[:print:]]' \
ftp.debian.org_debian_dists_sid_*_Contents-amd64.gz | wc -l
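The grep above only counts lines containing bytes outside printable
ASCII; it does not say whether those names are valid UTF-8. A rough
sketch of a finer check follows, splitting names into ASCII-only, valid
UTF-8 with non-ASCII characters, and invalid UTF-8. The function names
are made up for illustration, and the line layout (path first, package
list as the last whitespace-separated field) is my assumption about the
Contents files. Note that ASCII control characters land in the "ascii"
bucket here, unlike with the [:print:] test.

```python
from collections import Counter


def classify_filename(raw: bytes) -> str:
    """Return 'ascii', 'utf8' or 'invalid' for a raw filename."""
    if all(byte < 0x80 for byte in raw):
        return "ascii"
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return "invalid"
    return "utf8"


def path_from_contents_line(line: bytes) -> bytes:
    """Extract the path column from one Contents line.

    Assumed layout: "<path><whitespace><package list>". Paths may
    contain spaces, so split once from the right.
    """
    fields = line.rsplit(None, 1)
    return fields[0] if fields else b""


def summarize(lines) -> Counter:
    """Count filenames per class over an iterable of raw lines."""
    counts = Counter()
    for line in lines:
        path = path_from_contents_line(line)
        if path:
            counts[classify_filename(path)] += 1
    return counts
```

Feeding it the file object returned by gzip.open(..., "rb") on a
Contents file should give per-class totals, as long as any header
lines are skipped first.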
> By the way, how about directories ?
This is a matter of terminology: directory names are also filenames,
and part of pathnames, they just point to a directory instead of a
file. I don't see why we'd want to exclude directories from filenames.