Re: Sponsorship requirements and copyright files
On Fri, Mar 20, 2009 at 09:09:53AM +0000, Roger Leigh <firstname.lastname@example.org> wrote:
> Mike Hommey wrote:
>> On Thu, Mar 19, 2009 at 11:02:48PM -0700, Daniel Moerner wrote:
>>> On Thu, Mar 19, 2009 at 10:19 PM, Mike O'Connor <email@example.com> wrote:
>>>> To me, it seems like since one has to go through all of the source files
>>>> anyway, creating a list of copyright holders while you are doing it is a
>>>> trivial task. I don't see why making this list takes any time at all
>>>> really. Unless you are not actually looking at the code you upload,
>>>> which would worry me for other reasons as well.
>>> I agree. The thing that I like about creating packages with the
>>> wiki.d.o specification is that it forces you to actually examine the
>>> copyrights of all the parts of a new package, instead of just using a
>>> lazy link to /usr/share/common-licenses/foo. This is especially
>>> important for packages that have many different hidden scripts or
>>> architecture-independent libraries that might have different licenses.
>>> With the kind of copyright file generated by dh_make, it seems like
>>> new maintainers often ignore the risk that a package contains files
>>> whose license makes it unredistributable.
>>> In shorter words: I think something should be done about the copyright
>>> file to encourage developers to actually perform an audit of the
>>> license status of files in their packages before they upload. The
>>> current copyright template doesn't really encourage this; I like the
>>> machine-parseable system because it makes it easy to organize such an
>> Try doing that on iceweasel or xulrunner. Hint: there are about 30000
>> files and a great many copyright holders.
>> It's already a PITA with webkit, which is about 3000 files and quite a
>> lot of copyright holders (the copyright file, which I'm pretty sure is
>> not accurate, is 809 lines and growing with each new release).
>> On top of listing copyright holders, I must say listing the individual
>> files for each license in the copyright file is also a major PITA.
> Given that copyrights are usually in a standard format, such as
> Copyright (\([cC]\)|©) Year[-Year] Name Email
> It shouldn't be too hard to write a tool to scan the whole source tree
> and spit out a completely generated summary of copyright holders. If
> this could be added to an existing tool, such as licensecheck, this
> would save everyone from reimplementing it in their package (I was
> considering doing this).
Licensecheck already does that, although you have to pass it an option to
enable it. It fails to catch anything that doesn't match such a pattern,
though, and I can tell you there is a lot that doesn't. I invite you to take
a look at a few .cpp files from xulrunner or iceweasel: you'll see that you
won't get much with your pattern, and that you can't reliably get these
holders with a pattern.
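
To make the limitation concrete, here is a rough Python sketch of the kind of
scanner being discussed. It is not how licensecheck works; the regular
expression is only my reading of the pattern quoted above, and the names
COPYRIGHT_RE, find_holders and scan_tree, as well as the sample headers, are
made up for illustration.

```python
import os
import re
from collections import defaultdict

# Assumed pattern, modelled on the quoted form:
#   Copyright ((c)|(C)|©) Year[-Year] Name [<Email>]
COPYRIGHT_RE = re.compile(
    r"Copyright\s+(?:\([cC]\)|©)\s+"
    r"(?P<years>\d{4}(?:\s*-\s*\d{4})?)\s+"
    r"(?P<holder>[^<\n]+?)"
    r"(?:\s*<(?P<email>[^>\s]+)>)?\s*$"
)


def find_holders(text):
    """Return (years, holder, email) for each line matching the pattern."""
    found = []
    for line in text.splitlines():
        m = COPYRIGHT_RE.search(line)
        if m:
            found.append(
                (m.group("years"), m.group("holder").strip(), m.group("email"))
            )
    return found


def scan_tree(root):
    """Map each matched holder name to the set of files naming it."""
    holders = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as fh:
                    text = fh.read()
            except OSError:
                continue  # unreadable file: skip it rather than abort
            for _years, holder, _email in find_holders(text):
                holders[holder].add(path)
    return holders


# Hypothetical headers: two in the "standard" form, one where the holder
# is named on a different line from the word "Copyright".
SAMPLE = """\
 * Copyright (C) 2008 Jane Doe <jane@example.org>
# Copyright © 2007-2009 Example Corp
 * Portions created by the Initial Developer are Copyright (C) 1998
 * the Initial Developer. All Rights Reserved.
"""

if __name__ == "__main__":
    for entry in find_holders(SAMPLE):
        print(entry)
```

The first two sample lines are picked up, but the third header is silently
skipped because the holder is on the following line and is only named
indirectly. That is the kind of case a line-based pattern cannot recover.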