
Bug#704446: lintian: warn about filenames containing invalid UTF-8 sequences in binary packages



Package: lintian
Version: 2.5.10.4
Severity: wishlist

It would be nice if lintian could warn about byte sequences that are not
valid UTF-8 in filenames contained in binary packages. There is currently
a policy bug, #701081, whose likely outcome is to make UTF-8 filenames
mandatory. Even if the policy bug does not reach that conclusion, this
behaviour is already a de facto standard, violated at present only by
aspell-is and jpilot (looking at sid main amd64). Since the vast majority
of packages use a small subset of printable ASCII, lintian could go even
further and additionally check for such a subset in a pedantic or
experimental tag.

Note that filenames that are not valid UTF-8 currently cannot be counted
easily, but filenames containing anything other than printable ASCII can,
using the Contents files cached by apt-file:
LC_ALL=C zgrep '[^[:print:]]' /var/cache/apt/apt-file/*_Contents-*.gz
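
For illustration only, the distinction between "printable ASCII", "valid
UTF-8" and "invalid" could be sketched in Python roughly as follows. This
is not how lintian implements or would implement the check; the function
names classify() and scan_contents() are made up for this example, and the
code assumes that each Contents line consists of a path, whitespace, and a
package list in the last column.

```python
import gzip


def classify(name: bytes) -> str:
    """Return 'ascii', 'utf-8' or 'invalid' for a raw filename."""
    # Printable ASCII only: bytes 0x20 (space) through 0x7E (~).
    if all(0x20 <= b <= 0x7E for b in name):
        return "ascii"
    try:
        # Strict decoding rejects malformed and overlong sequences.
        name.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return "invalid"
    return "utf-8"


def scan_contents(path):
    """Yield (filename, packages) for filenames that are not valid UTF-8.

    Assumption: every line is "<path><whitespace><package list>", where
    the path itself may contain spaces. Lines without two columns are
    skipped.
    """
    with gzip.open(path, "rb") as fh:
        for line in fh:
            # Split off the last whitespace-separated column only.
            parts = line.rstrip(b"\n").rsplit(None, 1)
            if len(parts) != 2:
                continue
            name, packages = parts
            if classify(name) == "invalid":
                yield name, packages
```

Calling scan_contents() on each file matched by
/var/cache/apt/apt-file/*_Contents-*.gz would then list the offending
filenames together with their packages. Working on bytes rather than
decoded strings is deliberate, since the point is to find names that
cannot be decoded.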

Note that source packages may legitimately contain such sequences, for
example as part of test cases. Given that we have little control over
source packages, they should not be subject to such a check (at least
not at warning level).

Thanks for considering

Helmut

