Deprecation of non-normalized Dpkg::Vendor::<vendor> modules
Some days ago Niels Thykier pointed out that the handling of origin
files and vendor modules was not consistent. When looking into the
Dpkg::Vendor::<vendor> loading code, I realized it was not properly
handling vendor names with problematic special characters such as
[\s:;/], and that it was not capitalizing them to conform to the
existing perl module naming convention. The origin file handling was
also only mapping the _first_ space into a «-».
So I've added code to remap anything that is not alphanumeric into
«-» for origin files, and to treat those characters as word separators
and capitalization points for module names. I've also added
deprecation warnings for origin filenames and vendor modules that
contain non-alphanumeric characters, and, for vendor modules, also for
names starting with a lower-case letter. I've updated the
documentation in git HEAD to make this clear.
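
To make the remapping concrete, here is a rough sketch in Python. It is
only an illustration of the behaviour described above, not the actual
Perl code in dpkg. The function names are made up, and collapsing a run
of several special characters into a single separator is an assumption.

```python
import re

# Anything that is not an ASCII letter or digit counts as a separator.
# Assumption: a run of several such characters is treated as one.
NON_ALNUM = re.compile(r"[^A-Za-z0-9]+")


def origin_filename(vendor: str) -> str:
    """Remap non-alphanumeric characters to '-' (hypothetical helper).

    Casing is left untouched here, since the lookup tries several
    casings on its own.
    """
    return NON_ALNUM.sub("-", vendor)


def module_name(vendor: str) -> str:
    """Use non-alphanumeric characters as word separators and upper-case
    the first letter of each word (hypothetical helper)."""
    words = [w for w in NON_ALNUM.split(vendor) if w]
    return "".join(w[:1].upper() + w[1:] for w in words)


if __name__ == "__main__":
    for name in ("Some Vendor OS", "some vendor:os"):
        print(name, origin_filename(name), module_name(name), sep=" | ")
```
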
I've gone through the derivatives census, and it's not clear from
there which derivatives have a Dpkg::Vendor module, but the change
seemed safe given the currently listed vendor names. And in any case
there's going to be a transition period.
If this causes any issues, please let me know and we can talk about
possible solutions or alternatives, although the easy way out would be
to provide both module names.
But there is still some vague handling, as there are multiple casing
tries for file and module lookups (lower-cased, as-is, lower-cased
then capitalized, capitalized), which can make this overly confusing.
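
As an illustration of why this gets confusing, the following Python
sketch lists the casing variants in the order given above. The function
name is hypothetical, and the real dpkg code may use a different order
or apply different subsets to files and to modules.

```python
def lookup_candidates(name: str) -> list[str]:
    """Return the casing variants to try, in order, without duplicates."""

    def ucfirst(text: str) -> str:
        # Upper-case only the first character, like Perl's ucfirst.
        return text[:1].upper() + text[1:]

    variants = [
        name.lower(),           # lower-cased
        name,                   # as-is
        ucfirst(name.lower()),  # lower-cased then capitalized
        ucfirst(name),          # capitalized
    ]

    # Drop repeated variants while keeping the first occurrence.
    unique: list[str] = []
    for variant in variants:
        if variant not in unique:
            unique.append(variant)
    return unique


if __name__ == "__main__":
    print(lookup_candidates("someVendor"))
```

A mixed-case name such as "someVendor" yields four different
candidates, so up to four distinct files or modules could match the
same vendor name.
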
I'm pondering whether to restrict these names further to have extremely
clear rules, although I'm afraid this could cause issues. A simple rule
would be that a vendor name can only contain alphanumeric characters in
any casing, and dashes. No spaces or other special characters would be
allowed, as those can also be problematic in, say, debian/rules when
using $(filter ...). For origin filenames the name would be mapped to
lowercase; for the perl module name, dashes would mark capitalization
boundaries and then be removed, so we'd have:

  Vendor          origin-filename  vendor-module
  ------          ---------------  -------------
  Some-Vendor-OS  some-vendor-os   SomeVendorOs
  SOME-vendor-os  some-vendor-os   SomeVendorOs
  SomeVendorOS    somevendoros     Somevendoros

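
The proposed rule can be sketched as follows in Python. This is just an
illustration of the mapping in the table, with made-up function names,
and not code that exists in dpkg.

```python
import re

# Proposed strict rule: only ASCII alphanumerics and dashes are allowed.
VALID_VENDOR = re.compile(r"[A-Za-z0-9-]+")


def check_vendor(vendor: str) -> list[str]:
    """Validate the name and return its dash-separated words."""
    if not VALID_VENDOR.fullmatch(vendor):
        raise ValueError(f"invalid vendor name: {vendor!r}")
    words = [w for w in vendor.split("-") if w]
    if not words:
        raise ValueError(f"vendor name has no alphanumerics: {vendor!r}")
    return words


def strict_origin_filename(vendor: str) -> str:
    """Origin filename: the validated name mapped to lowercase."""
    check_vendor(vendor)
    return vendor.lower()


def strict_module_name(vendor: str) -> str:
    """Module name: dashes mark capitalization boundaries and are removed.

    str.capitalize() upper-cases the first letter and lower-cases the
    rest, which is what turns "OS" into "Os".
    """
    return "".join(w.capitalize() for w in check_vendor(vendor))


if __name__ == "__main__":
    for name in ("Some-Vendor-OS", "SOME-vendor-os", "SomeVendorOS"):
        print(name, strict_origin_filename(name), strict_module_name(name))
```
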
My worry with the above proposal of further tightening the rules is
that it might affect code expecting specific vendor names, so it
might not be feasible or desirable even with a transition period.
A more lenient but still safe rule could be to allow other
non-alphanumeric characters (such as [-.:;/%]) as separators, except
for spaces, and have them follow the same rule as the aforementioned
«-». Apart from that, the casing rules could still apply to both the
origin-filename and the vendor-module, as that would only affect those
two files, with no further interface fallout. Let me know what you
think!
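
For comparison, here is how the more lenient variant might look in
Python. Again this is only a sketch with hypothetical names. It assumes
that every separator is written as «-» in the origin filename and that
a run of separators counts as one, neither of which is settled above.

```python
import re

# Lenient rule: any non-alphanumeric character is a separator...
SEPARATORS = re.compile(r"[^A-Za-z0-9]+")
# ...except whitespace, which stays forbidden.
WHITESPACE = re.compile(r"\s")


def split_vendor(vendor: str) -> list[str]:
    """Reject names with spaces and split the rest into words."""
    if WHITESPACE.search(vendor):
        raise ValueError(f"vendor name contains spaces: {vendor!r}")
    words = [w for w in SEPARATORS.split(vendor) if w]
    if not words:
        raise ValueError(f"vendor name has no alphanumerics: {vendor!r}")
    return words


def lenient_origin_filename(vendor: str) -> str:
    """Lowercase words joined by '-' (assumed mapping for separators)."""
    return "-".join(w.lower() for w in split_vendor(vendor))


def lenient_module_name(vendor: str) -> str:
    """Separators mark capitalization boundaries and are removed."""
    return "".join(w.capitalize() for w in split_vendor(vendor))


if __name__ == "__main__":
    name = "Some.Vendor/OS"
    print(name, lenient_origin_filename(name), lenient_module_name(name))
```
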