
Names of Fields in Control Files



Dear Debian dpkg Maintainers:

I believe that all control field names currently in use are restricted
to the ASCII character set.

Debian Policy currently specifies that the files are to be UTF-8
encoded, but does not say whether control field names may ever
contain anything other than plain 7-bit ASCII characters.

Russ Allbery mentioned:

22:02:40 < rra> jawnsy: I don't think we say that explicitly, but RFC
5322 requires it and I can't imagine ever not enforcing that.
Although you should check with the dpkg maintainers to be sure.

Could we/should we make the Debian Policy more restrictive, and
specify explicitly that field names must only be ASCII-encoded?

I am inexperienced here, but I believe that limiting field names to
ASCII might help prevent bugs in which look-alike or invisible Unicode
characters keep a comparison such as:

   strcmp("Build-Depends", "Build-Depends")

from reporting a match (e.g. the - is a different hyphen, or one of
the names contains invisible characters, or something like that).
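
To make the concern concrete, here is a rough sketch in C of the kind
of check I have in mind. The names field_name_is_ascii and
lookalike_name are made up for illustration and are not taken from
dpkg. The accepted byte range (33-57 and 59-126, i.e. printable ASCII
without the colon) is my assumption, borrowed from the field-name
definition in RFC 5322; Policy might reasonably choose something
stricter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of dpkg): returns true if name is
 * non-empty and every byte is printable ASCII other than ':'.
 * The range 33-57 / 59-126 is an assumption taken from RFC 5322.
 */
static bool field_name_is_ascii(const char *name)
{
    if (name == NULL || *name == '\0')
        return false;

    for (const unsigned char *p = (const unsigned char *)name; *p != '\0'; p++) {
        if (*p < 33 || *p > 126 || *p == ':')
            return false;
    }
    return true;
}

/*
 * "Build-Depends" written with U+2011 NON-BREAKING HYPHEN
 * (UTF-8 bytes E2 80 91) instead of the ASCII hyphen-minus.
 * The literal is split so that 'D' is not read as a hex digit.
 */
static const char lookalike_name[] = "Build\xE2\x80\x91" "Depends";
```

With these definitions, strcmp("Build-Depends", lookalike_name) is
nonzero even though the two names can look identical on screen, and
field_name_is_ascii(lookalike_name) is false, so a parser could reject
the field name outright instead of silently failing to recognise it.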

Your comments and feedback on this would be much appreciated.

Cheers,

Jonathan
(please Cc me on replies, I'm not subscribed to debian-dpkg. I think
it'd also be worthwhile to keep debian-policy in the loop on this)

