Re: Copyright file granularity
Excerpts from Steve Langasek's message of 2015-11-13 10:51:11 -0800:
> On Fri, Nov 13, 2015 at 04:10:14PM +0000, Wookey wrote:
> > I've been helping package a load of stuff recently for Robot OS and in
> > checking the copyright files I've come up against the question of exactly
> > how much segmentation there should be in copyright files. The answer
> > to that depends on what they are actually for.
> > Is it sufficient to specify what licence things are under, or do we
> > really want to split it up into every licence x copyright-holder, or
> > even every licence x copyright statement (i.e. date + holder)?
> > Clearly we need to know what licence things are under, and that seems
> > to me to be the main purpose of the file. One can imagine
> > circumstances when some argument develops and we might need to care
> > about exactly _who_ owns the copyright on each file, but under normal
> > circumstances that simply doesn't matter. We just care if it was BSD
> > or GPL or Apache or whatever, not who actually contributed it under
> > those terms: that's part of the point of free-software licensing. It's
> > easy enough to go look at exactly which file is copyright who if need
> > be.
> It is important to list all copyright holders; this is not something that
> it's "easy enough" to look up in the source, because many of these free
> software licenses require that you reproduce the copyright statement
> whenever you distribute binaries.
>
>   2. Redistributions in binary form must reproduce the above copyright
>      notice, this list of conditions and the following disclaimer in the
>      documentation and/or other materials provided with the distribution.
>
> To avoid accidental failures to comply with such license terms, Debian
> policy requires that *all* packages include the copyright information.
> > However there are numerous copyright holders and files contributed on
> > various dates so I spent several hours making this copyright file:
> > https://sources.debian.net/src/ompl/1.0.0%2Bds2-1/debian/copyright/
> > with each copyright owner split out into a separate stanza.
> > Is there any real benefit in doing this? It's moderately accurate, but
> > what is the practical benefit over lumping all the BSD-3-clause
> > copyright holders together into one list?
> Do we need to list all copyright holders? Yes. Do we need the copyright to
> be listed at the granularity of individual source files? No.
> And we should have better tools to generate debian/copyright files - this
> shouldn't be an intensive manual process. Unfortunately the only tool I'm
> aware of that does this is coupled to cdbs.
> And if we want the debian/copyright file to be readable afterwards, the
> tools are going to need to make some smart decisions about grouping of
> copyright stanzas, and not just list one for each (license, copyright
> holders, copyright dates) tuple.
> As an example of what I mean by "smart" groupings, I offer up the
> debian/copyright of edk2, attached. Unfortunately, I had to craft this
> monster by hand; and there are no tools to validate that it remains
> correct after updating to new upstream releases.
> Having a machine-readable copyright format is only the first step. If I
> ever get a round tuit, my goals are:
> - a stand-alone tool that can generate a debian/copyright (with "smart"
> stanza grouping) from the output of licensecheck
> - a standard format for hinting this tool in the debian directory when the
> answers licensecheck detects by inspecting the source are inaccurate
> - a stand-alone tool that can compare any two machine-readable copyright
> files and a given source tree and tell you whether they are equivalent
> The last is key, because it gives us automation around making sure
> debian/copyright is accurate and stays accurate.
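
To make that last item concrete, here is a rough sketch of what such a
comparison could look like. It is not an existing tool. It assumes each
copyright file has already been parsed into a list of (patterns, license,
holders) stanzas, that a later stanza overrides an earlier one for the same
file, and that "equivalent" means "the same files end up under the same
license, with the same overall set of holders per license". The names
summarize and equivalent are made up for illustration, and fnmatchcase only
approximates the wildcard rules of the machine-readable format.

```python
from fnmatch import fnmatchcase


def summarize(stanzas, files):
    """Map each license to (files covered, holders named for those files).

    stanzas: list of (patterns, license_name, holders) in file order.
    files:   relative paths found in the source tree.
    """
    summary = {}
    for path in files:
        match = None
        for patterns, license_name, holders in stanzas:
            if any(fnmatchcase(path, pattern) for pattern in patterns):
                # Assumption: the last matching stanza wins.
                match = (license_name, holders)
        license_name, holders = match if match else ("UNKNOWN", ())
        covered, names = summary.setdefault(license_name, (set(), set()))
        covered.add(path)
        names.update(holders)
    return summary


def equivalent(stanzas_a, stanzas_b, files):
    """True if both stanza lists describe the tree the same way per license."""
    return summarize(stanzas_a, files) == summarize(stanzas_b, files)


# Hypothetical tree and holders, purely for illustration.
files = ["src/base/a.cpp", "src/geometry/b.cpp", "doc/readme.txt"]
per_holder = [
    (["*"], "BSD-3-clause", ["Holder A"]),
    (["src/geometry/*"], "BSD-3-clause", ["Holder B"]),
    (["doc/*"], "GPL-2+", ["Holder C"]),
]
grouped = [
    (["*"], "BSD-3-clause", ["Holder A", "Holder B"]),
    (["doc/*"], "GPL-2+", ["Holder C"]),
]
print(equivalent(per_holder, grouped, files))
```

Under that definition the hand-split file and the grouped one compare as
equal, while a file that forgot the doc/* stanza would not.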
I originally used the tool in cdbs and then manually collapsed the
copyright years where that made sense. It took days, and frankly, I doubt
every new version is being checked manually in this way, and I don't think
anybody expects Debian to do that. But automation would at least give us
some hope of shipping something useful, and only needing to do manual
work with the exceptions.
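
For what it's worth, the mechanical part of what I did by hand is small.
Below is a minimal sketch, assuming some scanner has already produced
(license, holder, year) tuples. The function names and the sample holders
are hypothetical, and whether merging years into ranges is appropriate for
a given package is still a judgment call.

```python
from collections import defaultdict


def collapse_years(years):
    """Merge consecutive years into ranges, e.g. 2008, 2009, 2010 -> 2008-2010."""
    ranges = []
    for year in sorted(set(years)):
        if ranges and year == ranges[-1][1] + 1:
            ranges[-1][1] = year          # extend the current range
        else:
            ranges.append([year, year])   # start a new range
    return ", ".join(
        str(first) if first == last else f"{first}-{last}"
        for first, last in ranges
    )


def group_by_license(notices):
    """Group (license, holder, year) tuples into one holder list per license."""
    years = defaultdict(lambda: defaultdict(set))
    for license_name, holder, year in notices:
        years[license_name][holder].add(year)
    return {
        license_name: [
            f"{collapse_years(seen)} {holder}"
            for holder, seen in sorted(holders.items())
        ]
        for license_name, holders in years.items()
    }


# Hypothetical scanner output, purely for illustration.
notices = [
    ("BSD-3-clause", "Holder B", 2008),
    ("BSD-3-clause", "Holder A", 2010),
    ("BSD-3-clause", "Holder A", 2011),
    ("BSD-3-clause", "Holder A", 2013),
    ("GPL-2+", "Holder C", 2012),
]
for license_name, lines in group_by_license(notices).items():
    print(license_name)
    for line in lines:
        print("  Copyright:", line)
```

That yields one stanza per license with every holder still listed, which
is the "lumped" form Wookey asked about.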
That said, it would also be great if we could rethink needing to list
all copyright holders when the license doesn't require it.