Re: DEP-5 meta: New co-driver; current issues
On Fri, Aug 13, 2010 at 12:45:30AM +1200, Lars Wirzenius wrote:
> The effort to get a machine-readable format for debian/copyright
> has been going on for some years now. I think it is time to get it
> done. To help with this, I am joining Steve Langasek as a driver
> for DEP-5.
Thank you for your interest in the DEP. I hope that this time we can gain
some momentum and finish it.
I would like to say in public that the work that you will start from was partly
done by me, and that I consider myself a driver as much as you and Steve
are, because I have in fact been driving this DEP for more than a year
now, and have contributed many changes to the DEP draft itself. I have asked the
DPL to resolve the disagreement between Steve and me, but he is in [VAC], so it
will take time. Let's be gentlemen and work together. The core of my argument with
Steve is the obligatory use of bzr in a complex workflow together with bzr-svn.
What I want is that on the final documents, the names on the front page
include the people who did the work, not only the people who approved it. I
think it is very important in a do-ocracy like Debian.
I am still concerned with the idea of dramatically increasing the traffic on
debian-project with the work on the DEP, so I will list pending issues in a
monolithic email for the moment.
It is necessary to let people add comments in debian/copyright. Some people
have asked for free-form comments and I think that it is a valid request.
Enclosing comments in DEP-5 fields creates extra work, since a space needs to be
added to each line, with a dot if the line was empty. Free-form comments also
reduce the complexity of the syntax, by providing a way to insert comments that
are out of the scope of the parsers. A `Comment` field can be a useful complement
when the goal is to provide extra information that is to be displayed with the
license. This can include statements like, for instance, “The authors request but
do not require that use of this software be cited in publications as…” Such
statements are often the result of the authors kindly relicensing their work to
remove non-DFSG-free clauses from their license, and in that example I think it
would be appropriate to keep them in debian/copyright. Examples of free-form
comments that do not need a field are extracts of the correspondence with
the authors when some points need to be confirmed, and the traditional “On
Debian systems, the complete text of the … License can be found in
/usr/share/common-licenses…”, which can be inferred by the parsers themselves.
The “paragraph” format that is popular in Debian control files does not allow
the use of free comments. Also, in addition to indentation, it requires empty
lines to be represented by a single dot. I can tell you from experience that it
is unfun and frustrating to go through long texts, for instance the Artistic
license version 2.0, and add the missing dots. Of course there are programmatic
ways to solve that, but adding requirements like this adds barriers to the
adoption of the format, and at the end of the day, the small barriers add up to
quite a tall one (as you can already read from the other comments on this
I propose to use a simpler format, that is trivial to parse:
It is proposed to implement this proposal in a format that has
similarities with Debian control files. The main differences are:
- Plain comments are allowed and are not required to start with sharp (#) signs.
- Within multi-line field bodies, empty lines do not need to be symbolised with a dot.
- A line with multiple spaces does not end the machine-readable section.
### Specification of the format
Fields are logical elements composed of a field name, followed by a colon that
can be flanked by spaces, followed by a field body, and terminated by a line
terminator.
- A field name is composed of printable characters, except colons.
- The field body is composed of any character. Leading spaces of the body are
ignored. To avoid problems with multi-line values, any line terminator must
be escaped by following it with a space. The line that contains that space is
called a continuation line.
- Lines that are not continuation lines and do not start a new field are plain
comments.
- Fields are grouped in paragraphs that are separated by empty lines. The
paragraphs are organised in a sequential order. Within a paragraph, the
fields are not ordered. If the same field appears more than once in the same
paragraph, their contents are added.
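
The rules above can be sketched as a short parser. This is only an illustration in Python under stated assumptions, not a reference implementation: the names `parse`, `FIELD_RE` and `SAMPLE` are hypothetical, the sample values are invented, field names are assumed to contain no whitespace (so that prose lines with a colon further along are treated as comments), and “contents are added” is read as joining the bodies with a newline.

```python
import re

# Assumption: a field name contains no whitespace and no colon. The colon
# may be flanked by spaces, and leading spaces of the body are ignored.
FIELD_RE = re.compile(r"^([^\s:]+)[ \t]*:[ \t]*(.*)$")


def parse(text):
    """Return the paragraphs in order, each as a dict of field name -> body."""
    paragraphs = []
    fields = {}     # fields of the paragraph currently being read
    current = None  # field that continuation lines extend, if any

    for line in text.splitlines():
        if line == "":
            # A completely empty line separates paragraphs.
            if fields:
                paragraphs.append(fields)
            fields, current = {}, None
        elif line.startswith(" "):
            # Continuation line. A line made only of spaces becomes an
            # empty line of the body, so no dot is needed.
            if current is not None:
                fields[current] += "\n" + line[1:].rstrip()
            # Otherwise it continues a plain comment and is skipped.
        else:
            match = FIELD_RE.match(line)
            if match is None:
                current = None  # plain comment, out of scope for the parser
                continue
            current, body = match.group(1), match.group(2).rstrip()
            if current in fields:
                # Repeated field in one paragraph: contents are added
                # (assumption: joined with a newline).
                fields[current] += "\n" + body
            else:
                fields[current] = body

    if fields:
        paragraphs.append(fields)
    return paragraphs


SAMPLE = (
    "Name : X Solitaire\n"
    "Source : ftp://example.com/games\n"
    "A free-form remark that parsers skip.\n"
    "\n"
    "License: GPL-2\n"
    " First line of the notice.\n"
    " \n"
    " Second part of the notice.\n"
)

for paragraph in parse(SAMPLE):
    print(paragraph)
```

With this reading, a line made only of spaces inside a field body is kept as an empty line of that body, while a completely empty line closes the paragraph. A comment that starts with a single word directly followed by a colon would still be taken for a field; the draft text does not say how to tell the two apart.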
Here is a small example in this format, with free-form comments:
Name : X Solitaire
Contact : John Doe <email@example.com>
Source : ftp://example.com/games
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
On Debian systems the full text of the GNU General Public License can
be found in the ‘/usr/share/common-licenses/GPL-2’ file.
I do not think that they need to be parseable, and would be tempted to propose
that we use the preferred short name upstream. For instance, the FSF uses
“GPLv3”, so we could use the same.
On the other hand, it was noted by Don yesterday and by Steve in December that
other projects, in particular Fedora, also use short names. I think that it is
important that we converge on a common set. I proposed in December to contact
Fedora, but did not get positive answers on debian-project. I volunteer again
to contact Fedora and the Linux Foundation as a DEP driver, to propose that
we all use a common set.
Lastly, there are cases like ‘BSD’ where a question needs to be answered
first: if a work is not copyrighted by the Regents of the University of
California, and forbids the use of other names for endorsement or promotion,
can we call its license the “BSD” license? My answer to this would be no, and it
should be clearly written in the DEP. That said, we could provide a formalised
way to indicate that a license is “similar to” the BSD or MIT licenses:
File globbing syntax
Here is what I think represents the broader consensus from previous discussion:
* **`Files`**: List of space-separated pathnames indicating files that have
the same licence. Question marks indicate any character and asterisks
indicate any string of characters. When this field is omitted in the first
paragraph containing a `License` field, its value will be assumed to be '*'.
If multiple `Files` declarations match the same file, then only the last match
This allows simple globbing and the field's contents to be pasted to xargs
regardless of whether it contains newlines.
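
The matching described above could look like the following. It is a minimal Python sketch under stated assumptions: the names `glob_to_regex` and `license_for` and the license labels are hypothetical, the last matching paragraph is assumed to be the one that applies, and a later paragraph without a `Files` field is assumed to match nothing.

```python
import re


def glob_to_regex(pattern):
    """Translate a pattern where '?' is any character and '*' any string."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # every other character is literal
    return re.compile("".join(parts), re.DOTALL)


def license_for(path, paragraphs):
    """Return the License of the last paragraph whose Files matches path.

    paragraphs is a list of dicts in file order, for instance the output
    of a parser for the format discussed in this message.
    """
    chosen = None
    seen_license = False
    for fields in paragraphs:
        if "License" not in fields:
            continue
        is_first = not seen_license
        seen_license = True
        # An omitted Files defaults to '*' only in the first paragraph that
        # has a License field; elsewhere it matches nothing (assumption).
        patterns = fields.get("Files", "*" if is_first else "")
        # split() handles spaces and newlines alike.
        if any(glob_to_regex(p).fullmatch(path) for p in patterns.split()):
            chosen = fields["License"]  # a later match overrides earlier ones
    return chosen


paragraphs = [
    {"License": "License-A"},
    {"Files": "debian/* doc/?.txt", "License": "License-B"},
    {"Files": "debian/rules", "License": "License-C"},
]
for path in ("src/main.c", "doc/a.txt", "debian/rules"):
    print(path, license_for(path, paragraphs))
```

The translation is written by hand rather than with Python's `fnmatch` module because `fnmatch` also gives a meaning to square brackets, which the text above does not mention. Note that `*` here also matches across directory separators.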
I proposed in the past to make it free form and optional:
All fields are optional.
* **`Copyright`**: One or more free-form copyright statement(s) that apply to
the files matched by the above pattern.
Some licenses do not require copyright statements to be cited verbatim.
For reasons that I explained a couple of times earlier, I think that extra
fields should not be required to be prefixed by ‘X-’.
To my knowledge, you were the first to suggest this. I like this a lot.
The work we do in Debian is at its best when it can be forwarded upstream. I have
committed an unbranded version of the DEP, but this was reverted by
Steve, who felt it had to be discussed.
J. Nieder proposed an Overview field. I think the idea is good.
* **`Overview`**: Synthetic summary of the licensing of the package as a whole.
As an alternative, we could specify that a Comment field in the first paragraph
contains the overview.
I am running out of time, but that is already a couple of things to discuss.
Have a nice day,
Tsurumi, Kanagawa, Japan