Re: copyright precision
On Wed, 10 Aug 2016 at 11:12:55 +0800, Paul Wise wrote:
> The only possible way to solve this in general terms is, accurate
> document the copyright/license of the source package using the
> machine-readable format and during builds, track the transformation of
> input files in the source package to output files in the binary
> package and then generate the copyright/license information for the
> binary package based on which input files from which source/binary
> packages ended up in the new binary package.
I'm sure this is a very interesting academic exercise, but pragmatically,
why do we want to require ourselves to go to all that effort? For that
matter, is everything we require *now* necessary or desirable?
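To make the scale of that effort concrete, here is a minimal sketch of the bookkeeping the quoted proposal implies. It assumes we already have a per-file license map for the source package and a record of which inputs fed each output; both tables, and every path in them, are hypothetical, and a real build system would not hand us the second one for free.

```python
def binary_licenses(source_licenses, build_inputs):
    """Map each output file to the sorted set of licenses of its inputs."""
    result = {}
    for output, inputs in build_inputs.items():
        # An output inherits every license that applies to any of its inputs.
        result[output] = sorted({source_licenses[path] for path in inputs})
    return result


# Hypothetical per-file licenses, as a machine-readable copyright file might give them.
SOURCE_LICENSES = {
    "src/main.c": "GPL-2+",
    "src/util.c": "Expat",
    "doc/manual.xml": "GFDL-1.3+",
}

# Hypothetical build trace: which source files ended up in which output file.
BUILD_INPUTS = {
    "usr/bin/example": ["src/main.c", "src/util.c"],
    "usr/share/doc/example/manual.html": ["doc/manual.xml"],
}

OUTPUT_LICENSES = binary_licenses(SOURCE_LICENSES, BUILD_INPUTS)
```

The function itself is trivial; the hard part is producing an accurate BUILD_INPUTS for every package, which is the work I am questioning.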
Broadly, we have two reasons (that I'm aware of) to do legal stuff:
because we want to (it meets some goal that we care about - self-imposed
policy), and because lawyers tell us we are at risk of being sued if we
don't (it meets our goal of being able to continue making Debian -
For software with a reasonably helpful upstream and a reasonably sane
build system, I've often found that jumping through the necessary
hoops to write debian/copyright takes about as long as the rest of the
packaging put together. This is demotivating: I didn't join this project
to copy copyright information around, I joined this project to make an
If I have to copy copyright information around to meet our project's
goals (including the goal of obeying copyright law, both because we
want to respect authors' rights and because we don't want to get sued)
then I'll put up with it - but I think we should be clear about why we
do this work, and only require maintainers to do exactly enough of it
to meet its actual goals.
In particular, if this thread comes to the conclusion that more needs to
be done than what maintainers currently do, then it should be something
actionable; and since it will likely create more work for a very large
number of people, it should be backed up by *why* that work is needed.
If the reason turns out to be an ftp-master saying "we have received
legal advice saying that you must do x, y and z, and we are not allowed to
explain further", then that would be unsatisfying but better than nothing;
and at least it would put a boundary on it.
Self-imposed policy of DFSG compliance:
One core value in Debian is that all of main is DFSG-compliant.
If we assume that maintainers (and ftp-masters) check the licenses of our
source packages as they are meant to do, and we build all of main's
binary packages from source code in main and other binaries from main,
then all of main that is distributable is trivially DFSG-compliant.
(Some of it might be non-distributable, for instance by being a derived
work of both OpenSSL and something GPL'd; but that's license-compliance,
and we frequently already detect it.)
Self-imposed policy of documenting copyright information:
says "Every package must be accompanied by a verbatim copy of its
copyright information and distribution license", which contains an
implicit assumption that each package *has* a single distribution license.
This is clearly not actually true in practice. The DEP-5 specification
addresses this by allowing the copyright file to specify multiple licenses
which must be complied with simultaneously. Optionally, it also lets the
maintainer specify the licenses of individual source files matched by
filename or glob.
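As an illustration of the per-file matching, here is a rough sketch of how a tool might resolve a source path to a license, assuming the DEP-5 rule that the last matching Files paragraph wins. The stanza list and the paths are made up for the example, and Python's fnmatch is only an approximation of the DEP-5 wildcard syntax (it treats square brackets specially, which DEP-5 does not).

```python
from fnmatch import fnmatchcase

# Hypothetical Files paragraphs, in the order they would appear in debian/copyright.
STANZAS = [
    ("*", "GPL-2+ or AFL-2.1"),
    ("tools/*", "GPL-2+"),
    ("debian/*", "Expat"),
]


def license_for(path, stanzas=STANZAS):
    """Return the license of the last stanza whose pattern matches path."""
    result = None
    for pattern, license_name in stanzas:
        # fnmatch's "*" also matches "/", as DEP-5 wildcards are meant to.
        if fnmatchcase(path, pattern):
            result = license_name  # later paragraphs override earlier ones
    return result
```

With these stanzas, license_for("tools/send.c") returns "GPL-2+", while any path not under tools/ or debian/ falls back to the catch-all first paragraph.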
The ftp-masters appear to have interpreted "copyright information" to
require a verbatim quotation of the license grant, except in cases
where there are several similar license grants with trivial differences.
I'm still not clear on why this is, and whether it's because we want to
or because we'll get sued if we don't.
Some of our Policy-compliant copyright files are clearly absurd;
adwaita-icon-theme's is 88K and lists more than 200 (potential)
copyright holders, mostly for l10n. I find it hard to believe that all
of that is actually necessary or achieving a desirable goal for us.
Meanwhile, linux's copyright file resorts to citing "Linus Torvalds
and many others" as copyright holders. If the kernel were held to the
same standard that is (anecdotally) applied to most other packages,
its copyright file would presumably be impractically huge (or perhaps
more likely, we would no longer have any volunteers willing to either
maintain a Linux kernel package or review it in NEW).
DEP-5 notably omits any syntax for describing the copyright or licenses of
the contents of the *binary* package, which suggests that its authors
(even those who consider it most valuable to specify the licenses of
individual source files) did not consider this to be a goal.
Are we aiming to go further than this by documenting, for instance,
which specific DFSG-compliant license applies to /usr/bin/dbus-daemon,
which specific DFSG-compliant license applies to
/usr/share/doc/dbus-1-doc/html/api/jquery.js, and who their copyright
holders are? If so, why?
I'm not convinced that anyone in Debian has both the necessary legal
expertise (definition of a derivative work in arbitrary jurisdictions)
and the necessary technical expertise (tracing what goes into a binary)
to make a reliable statement about who has a copyright interest in,
for example, /usr/bin/dbus-daemon. I would hope that Debian does not
aim to set policies that mean it will only accept contributions from
copyright lawyers who also happen to be software engineering experts.
At the moment, the best we can do is to provide an incomplete list of
people who have claimed that they *might* have a copyright interest in
that binary; hopefully that's more than enough to achieve our goals.
License compliance in general:
One argument for quoting the copyright holders and license information
is that it's for license compliance, particularly compliance with the
I'm not sure to what extent this actually holds water: we are willing to
say that we satisfy the GPL's requirement to provide copies of the GPL,
and the source code, by pointing to the nearby copies of base-files.deb
(for the GPL) and the source package (for the GPL and the source
code). From a devil's-advocate point of view: can't we apply the same
reasoning to the copyright information and the license grant?
It is perhaps interesting to observe that Fedora, which is backed by a
well-funded US corporation (i.e. an attractive target for lawsuits),
limits itself to saying this about (for example) dbus:
# The effective license of the majority of the package, including the shared
# library, is "GPL-2+ or AFL-2.1". Certain utilities are "GPL-2+" only.
License: (GPLv2+ or AFL) and GPLv2+
whereas the corresponding Debian package has a 412-line copyright file.
Similarly, ikiwiki has:
# ikiwiki is licensed under GPLv2+, the Python code in plugins/ under
# BSD (2-clause)
License: GPLv2+ and BSD
in Fedora, and 386 lines in Debian.
License compliance in Doxygen's jquery.js specifically:
In the case of Doxygen's "jquery.js", if you look at the file itself
(for instance dbus-1-doc has a copy), you'll notice that it contains
copyright and license information for the libraries that went into it
(which is specifically preserved by the minification process). We have
not followed Debian's self-imposed requirement to document copyright
information centrally, but we have obeyed its (permissive) license.