Re: making debian/copyright machine-interpretable
* Sam Hocevar:
> On Sat, Aug 04, 2007, Florian Weimer wrote:
>> It's probably better to use a separate file. If there's a syntax
>> error, you can't be sure if the file is in the old format, or if it's a
>> genuine error.
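
[A minimal sketch of the ambiguity described above, assuming a simple
"Name: value" field syntax. The field names and the parser are
hypothetical illustrations, not a format proposed in this thread.]

```python
def parse_fields(text):
    """Parse 'Name: value' lines; raise ValueError on any other non-blank line."""
    fields = {}
    for number, line in enumerate(text.splitlines(), 1):
        if not line.strip():
            continue
        name, sep, value = line.partition(":")
        # A field name must be present and must not contain spaces.
        if not sep or not name.strip() or " " in name.strip():
            raise ValueError(f"line {number}: not a 'Name: value' field")
        fields[name.strip()] = value.strip()
    return fields


def interpret(text):
    """Classify a copyright file that may be free-form or machine-readable."""
    try:
        return ("machine-readable", parse_fields(text))
    except ValueError:
        # The parser cannot tell a legacy free-form file from a
        # machine-readable file containing a typo: both fail the same way.
        return ("old format or broken new format", None)
```

[A free-form legacy file and a new-format file with a missing colon both
end up in the same bucket, which is the argument for a separate file.]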
> But the information must be in debian/copyright.
Why? I don't think this has to be the case for the machine-readable
> Duplicating it is not an option.
Your proposal is all about duplicating data.
>> Copyright statements with year numbers need to be updated once per
>> year, complicating merging from upstream. Is this really worth the
>> effort? Copyright holder information is probably not very valuable
>> without unique identifiers per copyright holder anyway.
> This information is required for debian/copyright, too.
Policy doesn't say so, it seems.
> The proposal just puts it in a header.
It's impractical to list all copyright holders. For instance, the FSF
is sometimes sloppy with its paperwork and accepts patches without
copyright assignments. If there were a rule "all copyright holders
must be listed", we'd need access to the FSF's secret records to
compile the list.
And if I submit a patch, there's no easy way to tell whether it's
copyrighted by me or by Enyo A/S.
> Citing copyright years is not useful, but it's probably required by
Over here, law requires that you name authors, not copyright holders
(or refrain from naming them, if that's what they demand). We'd
better ignore that law. 8-/
>> In order to automatically detect licensing violations, the files in
>> the binary package would need annotations. Annotating the source
>> files is not sufficient.
> That's right, we don't know the licensing terms of binary files.
> But if we stop at the "it's not sufficient" argument, we'll never get
> anywhere, because it is impossible for a source package to determine the
> exact licensing terms of its binary packages.
Something involving ELF sections and magic headers (for scripts) could
do the job.
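
[A rough sketch of the "magic headers" half of that idea, under stated
assumptions: the "# License:" comment tag is hypothetical, and no such
convention is defined in this thread. Only the ELF magic number and the
"#!" script prefix are real. Reading an annotation out of an ELF section
would need an ELF parser and is not shown.]

```python
import re

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file

# Hypothetical magic comment for scripts, e.g. "# License: GPL-2+".
LICENSE_TAG = re.compile(rb"^#\s*License:\s*(\S+)", re.MULTILINE)


def classify(data):
    """Return 'elf', 'script', or 'other' based on the leading bytes."""
    if data.startswith(ELF_MAGIC):
        return "elf"
    if data.startswith(b"#!"):
        return "script"
    return "other"


def script_license(data, max_bytes=1024):
    """Return the license named in a script's header comment, or None."""
    if classify(data) != "script":
        return None
    match = LICENSE_TAG.search(data[:max_bytes])
    if match is None:
        return None
    return match.group(1).decode("ascii", "replace")
```

[A checker could call classify() on each file in a binary package and
flag every script or ELF object that carries no annotation.]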
My concern is that we actually need the annotations in the binary
packages, and once we have those, we've got the data in about five places:
- source file headers
- upstream README/license file
- debian/copyright, free-form text
- debian/copyright (or wherever else), machine-readable format
- metadata for generating the binary package data
This is a bit excessive.