DEP-5: Files field and filename patterns
The Files field selects files by filename pattern, and we need to
settle what syntax those patterns use.
The spec draft currently has this text (plus some examples):
### Files
#### Format
The **`Files`** field contains a comma-separated list of patterns:
Files: foo.c, bar.*, baz.[ch]
File names containing spaces or commas should be put within double
quotes. The backslash character is an escaping character, be it inside
or outside double quotes:
Files: "Program Files/*", manual[english].txt
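To make the quoting rules above concrete, here is a minimal Python sketch of a splitter for the comma-separated form. The names `split_files_field` and `_finish` are hypothetical. Several behaviours are assumptions that the draft does not settle: a backslash makes the next character literal and is then dropped, the double quotes themselves are dropped, and unquoted whitespace around a pattern is trimmed.

```python
def _finish(token):
    """Join a token, trimming only unprotected whitespace at both ends."""
    start, end = 0, len(token)
    while start < end and token[start][0].isspace() and not token[start][1]:
        start += 1
    while end > start and token[end - 1][0].isspace() and not token[end - 1][1]:
        end -= 1
    return "".join(ch for ch, _ in token[start:end])


def split_files_field(value):
    """Split a Files value into patterns (sketch; see assumptions above)."""
    patterns = []
    token = []          # list of (character, protected) pairs
    in_quotes = False
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # Assumption: the escaped character is kept, the backslash is not.
            token.append((value[i + 1], True))
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes   # quotes group text and are dropped
        elif ch == "," and not in_quotes:
            patterns.append(_finish(token))
            token = []
        else:
            token.append((ch, in_quotes))
        i += 1
    if in_quotes:
        raise ValueError("unterminated double quote in Files field")
    patterns.append(_finish(token))
    return [p for p in patterns if p]   # ignore empty entries


print(split_files_field('foo.c, bar.*, baz.[ch]'))
# ['foo.c', 'bar.*', 'baz.[ch]']
print(split_files_field('"Program Files/*", manual[english].txt'))
# ['Program Files/*', 'manual[english].txt']
```

A real parser would also have to decide whether the backslash is consumed at this stage or passed through to the glob matcher, which the draft text leaves open.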
#### Syntax
Patterns are handled as by the `find` utility's `-name` option. Patterns
containing a path separator ("/") are handled as by the `find` utility's
`-path` option.
[examples removed]
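As a rough illustration of the `-name` versus `-path` rule, the sketch below approximates it with Python's `fnmatch` module. The function name `pattern_matches` is hypothetical, and two things are assumptions: paths are taken relative to the top of the source tree with no leading `./`, and `fnmatch` stands in for `find`'s matcher. The two agree that `*` may match `/`, but `fnmatch` does not treat the backslash as an escape character.

```python
import fnmatch
import posixpath


def pattern_matches(pattern, path):
    """Approximate find's -name / -path choice for one pattern."""
    if "/" in pattern:
        # Like -path: match against the whole path; '*' may cross '/'.
        return fnmatch.fnmatchcase(path, pattern)
    # Like -name: match against the last path component only.
    return fnmatch.fnmatchcase(posixpath.basename(path), pattern)


print(pattern_matches("*.c", "src/lib/getopt.c"))      # True
print(pattern_matches("src/*.c", "src/lib/getopt.c"))  # True
print(pattern_matches("src/*.c", "getopt.c"))          # False
```

One consequence worth noting: under these glob semantics a pattern such as `manual[english].txt` is a bracket expression, so it matches `manuals.txt` rather than a file literally named `manual[english].txt`.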
It is quite common for a work to have files with copyright held by
different parties and received under different licenses. To accommodate
this, **multiple paragraphs are allowed with different `Files`
declarations**.
However, it makes for easier reading if the copyright file lists the
"main" license first: the one matching the "top level" of the work, with
others listed as exceptions. To allow this, the following precedence
rule applies for matching files: **If multiple `Files` declarations
match the same file, then only the last match counts.**
As a result, it is recommended for clarity that the paragraphs appear in
order from most general (e.g. `Files: *`) first, through to most
specific. In the following example, the file `getopt.c` matches both
`Files: *` and `Files: getopt.*`; only the last match counts, so
the file `getopt.c` has the license declaration `License: BSD`.
[example removed]
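The "last match counts" rule can be sketched in a few lines of Python. The names `license_for` and `_matches` are hypothetical, the paragraphs are modelled as `(patterns, license)` pairs in file order, and the `GPL-2+` license for `Files: *` is made up for illustration (only `License: BSD` for `getopt.*` comes from the text above). Matching again uses `fnmatch` as a stand-in for `find`.

```python
import fnmatch
import posixpath


def _matches(pattern, path):
    # Patterns with '/' match the whole path, others the basename only.
    target = path if "/" in pattern else posixpath.basename(path)
    return fnmatch.fnmatchcase(target, pattern)


def license_for(path, paragraphs):
    """Return the license of the LAST paragraph whose Files field matches."""
    result = None
    for patterns, license_name in paragraphs:
        if any(_matches(p, path) for p in patterns):
            result = license_name   # later paragraphs override earlier ones
    return result


paragraphs = [
    (["*"], "GPL-2+"),        # hypothetical "main" license, listed first
    (["getopt.*"], "BSD"),    # more specific exception, listed later
]

print(license_for("getopt.c", paragraphs))  # BSD
print(license_for("main.c", paragraphs))    # GPL-2+
```

Swapping the two paragraphs would make `Files: *` win for every file, which is why the general-to-specific ordering is recommended.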
Russ suggested using the same patterns as git does in the .gitignore
file, since those are familiar to many people.
http://www.kernel.org/pub/software/scm/git/docs/gitignore.html
Some issues I would like to raise:
* Is comma-separation appropriate? I'd prefer space-separation myself.
What do those who write parsers think?
* Are find -name/-path globs a better idea than .gitignore?
* .gitignore has one pattern per line, which I think is inappropriate
for DEP-5 debian/copyright files: we should allow "Files: *.c *.h",
I think.
* Are shell-style globs the right idea? Should we use Perl regular
expressions on the entire pathname instead?
* Is using multiple paragraphs for exceptions the right idea? Russ
suggests not, and I think I agree. .gitignore uses an exclamation mark
(!) as a prefix for logical not.
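To show what the `!` alternative could look like inside a single Files field, here is a speculative Python sketch. Nothing in the draft defines this: the function name `files_field_matches` is hypothetical, the patterns are assumed to be already split into a list, and the rule is borrowed from .gitignore, where the last matching pattern decides and a leading `!` negates it.

```python
import fnmatch
import posixpath


def files_field_matches(patterns, path):
    """Evaluate patterns in order; a leading '!' excludes instead of includes."""
    matched = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        if negated:
            pattern = pattern[1:]
        # Patterns with '/' match the whole path, others the basename only.
        target = path if "/" in pattern else posixpath.basename(path)
        if fnmatch.fnmatchcase(target, pattern):
            matched = not negated   # the last matching pattern decides
    return matched


# Hypothetical field "Files: * !getopt.*"
print(files_field_matches(["*", "!getopt.*"], "main.c"))    # True
print(files_field_matches(["*", "!getopt.*"], "getopt.c"))  # False
```

With this form, the general paragraph excludes its own exceptions explicitly, so the meaning of a paragraph no longer depends on which paragraphs follow it.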
Any other issues related to filename patterns in the Files field?