I've been helping package a load of stuff recently for Robot OS, and in checking the copyright files I've come up against the question of exactly how much segmentation there should be in copyright files. The answer to that depends on what they are actually for. Is it sufficient to specify what licence things are under, or do we really want to split it up into every licence x copyright-holder, or even every licence x copyright statement (i.e. date + holder)?

Clearly we need to know what licence things are under, and that seems to me to be the main purpose of the file. One can imagine circumstances when some argument develops and we might need to care about exactly _who_ owns the copyright on each file, but under normal circumstances that simply doesn't matter. We just care if it was BSD or GPL or Apache or whatever, not who actually contributed it under those terms: that's part of the point of free-software licencing. It's easy enough to go and look at exactly who holds the copyright on which file if need be.

So, for an example of why this matters, look at ompl (https://tracker.debian.org/pkg/ompl). The package is largely BSD-3-Clause, with a couple of files that are Apache-2.0 and Expat. However, there are numerous copyright holders and files contributed on various dates, so I spent several hours making this copyright file:
https://sources.debian.net/src/ompl/1.0.0%2Bds2-1/debian/copyright/
with each copyright owner split out into a separate stanza.

Is there any real benefit in doing this? It's moderately accurate, but what is the practical benefit over lumping all the BSD-3-clause copyright holders together into one list? If the answer is 'it's more accurate', then shouldn't we be requiring a stanza for every different copyright statement (which in this package would split the Rice University and Willow Garage sections into another 10-15 stanzas with different dates/date-combinations)?
This plethora of stanzas would also make the file very hard to read, which is why I didn't go that far. But then I thought about it and started to wonder what exactly the 'splitting' criteria are. The criterion that makes most sense to me is 'by licence'.

I just uploaded rosdistro (https://tracker.debian.org/pkg/ros-rosdistro) and got a comment from the reviewing ftpmaster that combining the two different copyright holders for BSD-3-clause files into one stanza was not really right:
https://sources.debian.net/src/ros-rosdistro/0.4.2-1/debian/copyright/

  Files: *
  Copyright: 2012 Willow Garage, Inc.
             2013-2014 Open Source Robotics Foundation
  License: BSD-3-clause

This interpretation says that the copyright line is not a list of copyright owners having stuff under this licence in the package, but a statement of the copyright on all of those files. However, if you follow that logic, then shouldn't we be having separate stanzas for each statement:

  Files: foo
  Copyright: 2012 Willow Garage, Inc.
  License: BSD-3-clause

  Files: bar
  Copyright: 2008 Willow Garage, Inc.
  License: BSD-3-clause

  Files: bar
  Copyright: 2010-2012 Willow Garage, Inc.
  License: BSD-3-clause

and so on? Because if not, then we are concatenating statements into stanzas:

  Files: bar
  Copyright: 2008,2010-2012 Willow Garage, Inc.
  License: BSD-3-clause

And if that's OK, then why not concatenate owners too, to get:

  Files: *
  Copyright: 2012 Willow Garage, Inc.
             2013-2014 Open Source Robotics Foundation
  License: BSD-3-clause

Lots of little stanzas are more accurate, but give a much less clear overview to the casual inspector. Which comes back to the question of what exactly this file is for (people or computers).

So, after thinking about this for a while, I decided that it was not clear to me what best/acceptable practice is, and that it would be best to ask here.
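
To make the three candidate splitting criteria concrete, here is a rough Python sketch that groups the same per-file data into stanzas by licence, by licence x holder, and by licence x full statement. It is not any existing Debian tool. The file names, the 'Example Contributor' holder and every function name are made up for illustration, and the output only approximates the stanza layout shown above.

```python
# Hypothetical per-file data: (file, licence, holder, years).
# Holders and years echo the examples above; file names are invented.
FILES = [
    ("foo", "BSD-3-clause", "Willow Garage, Inc.", "2012"),
    ("bar", "BSD-3-clause", "Willow Garage, Inc.", "2008"),
    ("baz", "BSD-3-clause", "Willow Garage, Inc.", "2010-2012"),
    ("qux", "BSD-3-clause", "Open Source Robotics Foundation", "2013-2014"),
    ("quux", "Expat", "Example Contributor", "2014"),
]


# The three splitting criteria under discussion, as grouping keys.
def by_licence(licence, holder, years):
    return (licence,)


def by_holder(licence, holder, years):
    return (licence, holder)


def by_statement(licence, holder, years):
    return (licence, holder, years)


def group(files, key):
    """Collect files into stanzas; files with equal keys share a stanza."""
    stanzas = {}
    for name, licence, holder, years in files:
        stanza = stanzas.setdefault(
            key(licence, holder, years),
            {"licence": licence, "files": [], "copyright": []},
        )
        stanza["files"].append(name)
        statement = f"{years} {holder}"
        if statement not in stanza["copyright"]:
            stanza["copyright"].append(statement)
    return list(stanzas.values())


def render(stanzas):
    """Format stanzas roughly in the Files/Copyright/License layout."""
    blocks = []
    for s in stanzas:
        # Continuation lines are indented to line up under the first one.
        copyright_lines = "\n           ".join(s["copyright"])
        blocks.append(
            f"Files: {' '.join(s['files'])}\n"
            f"Copyright: {copyright_lines}\n"
            f"License: {s['licence']}"
        )
    return "\n\n".join(blocks)


if __name__ == "__main__":
    for label, key in [
        ("by licence", by_licence),
        ("by licence x holder", by_holder),
        ("by licence x statement", by_statement),
    ]:
        print(f"{label}: {len(group(FILES, key))} stanzas")
    print()
    print(render(group(FILES, by_licence)))
```

With this toy data the stanza count grows as the criterion gets finer, which is the readability-versus-accuracy trade-off I am asking about: the licence is the only field that is the same in all three groupings.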

I can't see much real usefulness in splitting beyond licence type, although where separate contributions are clear (e.g. debian/*), having a stanza for that is generally informative. But in a package like ompl, with loads of mixed-up files from various people and institutions over several years, the separation into copyright holders doesn't tell you much and is very laborious to produce.

Do we always want this done? What logic is there for that which doesn't also imply splitting by date too? I'm happy to do whatever we agree is right, but at the moment this feels like pointless makework, so I'd like to understand what the ftpmasters/policy actually require, and why, and what other people do about this.

I hope I have adequately explained the issue, and await guidance.

Wookey
-- 
Principal hats: Linaro, Debian, Wookware, ARM
http://wookware.org/