Bug#701081: debian-policy: mandate an encoding for filenames in binary packages

Package: debian-policy
Severity: wishlist

Apparently debian-policy currently says nothing about which characters
may be used in filenames contained in binary packages. Most packages use
common sense and stick to a small subset of US-ASCII. In Debian sid
main, most filenames consist only of the following US-ASCII characters
(written as a regular expression character class):

	[][a-zA-Z0-9{}<>() ^/,=:&!*%#$~@+._-]

There are about 200 exceptions, spread over about 50 binary packages.
In those packages, some filenames are not valid UTF-8 (for example in
aspell-is) and others don't make any sense when read as ISO-8859-15
(for example in ca-certificates).
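
A check along these lines can be sketched in Python. This is only an
illustration, not the script that produced the numbers above: the
function names are made up, filenames are assumed to be available as raw
byte strings (for example read from a package's file list), and the
character class is re-spelled with escaped brackets to suit Python's re
syntax.

```python
import re

# The character class from above, with the brackets escaped for Python's re.
# It is applied to bytes, because filenames in a package are byte strings.
SUBSET = re.compile(rb"[\]\[a-zA-Z0-9{}<>() ^/,=:&!*%#$~@+._-]*")


def in_subset(name: bytes) -> bool:
    """Return True if every byte of name is in the US-ASCII subset."""
    return SUBSET.fullmatch(name) is not None


def is_utf8(name: bytes) -> bool:
    """Return True if name is a well-formed UTF-8 byte sequence."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def exceptions(names):
    """Yield (name, is_utf8(name)) for every name outside the subset."""
    for name in names:
        if not in_subset(name):
            yield name, is_utf8(name)
```

Feeding every filename of a package through exceptions() would list the
names outside the subset and tell apart those that are at least valid
UTF-8 from those that are not.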

It would be nice if some common ground concerning filename encoding
could be reached. The options range from a rather restrictive definition
of acceptable characters, via requiring filenames to be representable in
US-ASCII, to mandating a particular encoding (such as UTF-8). This could
first be introduced as a SHOULD and later turned into a MUST.

Personally, I do not really care what the precise restriction is, as
long as it permits a mechanical transformation to Unicode.
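
To illustrate what "mechanical" means here: once an encoding is
mandated, the transformation is a single strict decode that either
succeeds or points at a policy violation. A rough Python sketch, where
to_unicode is a hypothetical helper and UTF-8 is just one of the options
mentioned above:

```python
def to_unicode(name: bytes, encoding: str = "utf-8") -> str:
    """Decode a filename strictly; raise ValueError if it does not conform."""
    try:
        return name.decode(encoding, errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(
            f"filename {name!r} is not valid {encoding}"
        ) from exc
```

Passing "ascii" as the encoding would correspond to the more restrictive
US-ASCII option. A restricted character set would need an additional
check on top of the decode.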


