
Re: debian/copyright verbosity



On Tue, 14 Apr 2009 17:22:06 +1000
Ben Finney <ben+debian@benfinney.id.au> wrote:

> (the discussion seems to have some new wrinkles, so including
> ‘debian-devel’ again)

OK, but probably best to drop -mentors at this stage.

For the benefit of -devel, the original question relates to these
example files:

 Files: foo.c
 Copyright: 2006 Mr. X
 License: GPL2+

 Files: bar.c
 Copyright: 2008 Mr. X
 License: GPL2+

 Files: baz.c
 Copyright: 2005 Mr. Y
 License: GPL2+

For now, let's assume that these are the *only* .c files in the
relevant directory.

AFAICT it is perfectly acceptable for debian/copyright to collapse
those to:

>  Files: *.c
>  Copyright: 2006, 2008 Mr. X
>  Copyright: 2005 Mr. Y
>  License: GPL2+

There is no collapsing of the years - each year is described separately.

The copyright is retained and each file is listed in debian/copyright
under the correct licence.
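
To make the collation concrete, here is a minimal sketch in Python of
how the three stanzas above reduce to one. The names collate and render
and the dictionary layout are illustrative assumptions, not part of any
Debian tool or of the proposed copyright format. The sketch lists the
files explicitly; when those are the only .c files in the directory,
that list is the same set that *.c names.

```python
# Per-file stanzas from the example above (hypothetical data layout).
STANZAS = [
    {"files": "foo.c", "year": 2006, "holder": "Mr. X", "license": "GPL2+"},
    {"files": "bar.c", "year": 2008, "holder": "Mr. X", "license": "GPL2+"},
    {"files": "baz.c", "year": 2005, "holder": "Mr. Y", "license": "GPL2+"},
]


def collate(stanzas):
    """Group stanzas by licence, keeping every holder and every year."""
    grouped = {}
    for stanza in stanzas:
        entry = grouped.setdefault(
            stanza["license"], {"files": [], "holders": {}}
        )
        entry["files"].append(stanza["files"])
        # Years are kept separately per holder; nothing is merged into a range.
        entry["holders"].setdefault(stanza["holder"], set()).add(stanza["year"])
    return grouped


def render(grouped):
    """Format the grouped data as debian/copyright-style stanzas."""
    lines = []
    for license_name, entry in grouped.items():
        lines.append("Files: " + ", ".join(entry["files"]))
        for holder, years in entry["holders"].items():
            year_list = ", ".join(str(year) for year in sorted(years))
            lines.append("Copyright: " + year_list + " " + holder)
        lines.append("License: " + license_name)
        lines.append("")
    return "\n".join(lines).rstrip("\n")


if __name__ == "__main__":
    print(render(collate(STANZAS)))
```

Every holder and every year from the input survives in the output; what
is dropped is only the mapping from an individual file to an individual
statement, which remains available in the source itself.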

> Neil Williams <codehelp@debian.org> writes:
> 
> > On Tue, 14 Apr 2009 09:45:19 +1000
> > Ben Finney <ben+debian@benfinney.id.au> wrote:
> > 
> > > Matthias Julius <lists@julius-net.net> writes:
> > > 
> > > > In the light of the recent discussion about debian/copyright on -devel
> > > > I am wondering how verbose it actually needs to be.  Given the
> > > > following files:
> > 
> > > > or even further to:
> > > > 
> > > >  Files: *.c
> > > >  Copyright: 2006, 2008 Mr. X
> > > >  Copyright: 2005 Mr. Y
> > > >  License: GPL2+
> > 
> > Is perfectly fine and very commonly used syntax across Debian main.
> >  
> > > Neither of these are true statements of the copyright; 
> > 
> > I cannot see how that is a sustainable position.
> 
> Well, to sustain that position, I need only demonstrate whether it's
> true or not. That position isn't about whether it's acceptable.

Hence the move away from -mentors where the question *was* what was
deemed acceptable for sponsorship.

> > It is perfectly acceptable to collate copyright statements for files
> > under the same licence. Every package does it to one degree or
> > another.
> 
> Non sequitur. Either it's a true claim about the copyright, or it isn't.
> If we want to say that it's acceptable for Debian copyright files to
> make untrue claims, that's a different matter.

I'm not saying that at all. It is not untrue that *a* .c file has the
Copyright stated. That is the meaning of the * wildcard here: each of
the statements that follow applies to at least some of the files that
match the pattern. If you want precision, specify ? instead of *, but
that only gets you precision to the level of files that have the same
number of characters in the filename.

You seem to be proposing an absurd exaggeration of wildcard semantics.

Does Files: *.c mean that everything below applies equally to all files
that match the pattern or does it mean that the statement includes a
summary of all files that match the pattern?

If I do ls *.c, the wildcard means that I want a summary of all files
that match the pattern. I do not intend ls to interrogate the files to
see if they are identical in content before testing the pattern.

* is "all matches, even if absent".

It does not confer anything in the reverse case: it does not mean that
the matching files have any relationship to each other beyond the
pattern. Matching * simply means that the file is one of the set of
files that match the pattern. It does not follow that every statement
about those files applies equally to all matches; the only thing they
are guaranteed to share is that they match the pattern.
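
The point about * and ? can be checked with shell-style matching. This
is a small sketch using Python's standard fnmatch module. The file list
is made up for illustration (main.c and README are not part of the
original example), and the helper name matching is an assumption.

```python
import fnmatch

# Hypothetical directory listing; only the names are inspected.
NAMES = ["foo.c", "bar.c", "baz.c", "main.c", "README"]


def matching(names, pattern):
    """Return the names that match a shell-style pattern, case-sensitively."""
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


if __name__ == "__main__":
    # '*' matches any run of characters, so every .c file is selected.
    print(matching(NAMES, "*.c"))
    # '?' matches exactly one character, so only three-letter stems match.
    print(matching(NAMES, "???.c"))
```

In both cases the pattern is tested against the filename alone. Nothing
about the contents of the files, or about what they have in common, is
consulted or implied by the match.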

> > > you have *altered* the copyright claim so that it now makes a false
> > > claim (e.g., you now state that ‘bar.c’ is “Copyright 2006 Mr. X”,
> > > which is contrary to what the original source claims).
> > 
> > It's not altering copyright at all, the copyright is as-is,
> 
> That's not what I said. I'm saying that the *claim* being made is
> different from what the original source claims; and, further, that the
> claim being made is incompatible with what the original source claims.

Not at all. debian/copyright is not about the claim; it is a
summary of what claims exist. The claims themselves cannot be divorced
from the source code - if anyone needs more precise data, that person
must go to the source code anyway. There's no point expanding
debian/copyright to contain the comment section from every .c file in
the source code.

> > debian/copyright contains the copyright statement and the fact that
> > such statements apply to files in the source.
> 
> To the extent that the original source's copyright statement is
> preserved, you're right.
> 
> But that's not what I see going on in the hypothetical collations
> presented here

(These are far from hypothetical, hundreds of packages already use such
collations.)

>: the claim of copyright holders, and of the years of
> copyright, is being broadened to a false claim that *all* the listed
> holders hold copyright in *all* the listed files through *all* the
> listed years. That's not what the original source claimed, and is almost
> certainly not true except by blind chance.

You're reading something different into the * wildcard.

*.c does not mean that everything applies equally to every file, it
means that for the files that match the pattern, the following
copyright statements may apply - i.e. a summary. I don't see anything
wrong with that. 

> > Copyright statements are different to licence statements, there is no
> > need to uniquely separate every single copyright statement, there is
> > particularly no need to repeat copyright statements to cover all
> > possible permutations.
> 
> You seem to be advocating a position that, in the interest of brevity,
> there should be a loss of the mapping between copyright statements in
> ‘debian/copyright’ and the scope of those statements made in the original
> source. From that assumption, a pertinent question would be why to
> bother listing those copyright statements in ‘debian/copyright’ at all.
> Is that your position?

If the licence doesn't require Copyright statements in the binary
distribution, then, yes, I'd like to see a situation where
debian/copyright details the licences and does not have to list endless
copyright holders.

Conversely, you seem to be proposing that debian/copyright not only
lists every single copyright holder for the entire source code
(something that most large packages simply cannot hope to achieve) but
that each package also independently lists the copyright for each
individual source code file where the copyright does not have an
identical match elsewhere in debian/copyright. That is an unimaginable
workload for large packages - and a workload that increases with every
new upstream release. This is an intolerable requirement. Any logic
about package maintenance would have to be so convoluted that many
of our most active DDs would simply retire on the spot. (If you doubt
that, read back on the previous thread about copyright and large
packages.)

If you want to kill Debian, go ahead. Those who have a more realistic
grip on the reality of maintainer workloads may decide to create
Debian2 instead - somewhere where the logical approach to copyright
statements allows collation. (i.e. somewhere that actually supports the
valuable work done by maintainers of KDE and GNOME, the kernel, glibc,
gcc and dozens of other packages with large upstream teams.)

> > > I don't think it's acceptable to make false copyright claims in the
> > > ‘debian/copyright’ file.
> > 
> > Neither do I, but collating copyright statements is NOT the same thing
> > as making a false copyright statement!
> 
> I think I am beginning to see why you say that, but I disagree, for the
> following reasons.
> 
> It seems to me that the ‘debian/copyright’ file, when it says:
> 
>     Files: *.c, *.h
>     Copyright: 2008 Sam Bar
>     Copyright: 2004, 2006 Max Foo
>     License: GPL-2+
> 
> is saying: “All files in this package matching ‘*.c’ or ‘*.h’ are
> copyright 2004 and 2006 by Max Foo and copyright 2008 by Sam Bar”. If
> that's not what the original source claims (in the original poster's
> example, the original source seemed to have exactly one holder, for
> exactly one year, for each file), then I don't see a justification for
> claiming that in ‘debian/copyright’.

The detail of who claims what must always be left to the source code.
Debian does not have the resources to duplicate this information at
that level of precision in all packages. That would be insane.

Collation happens in all arenas - especially upstream. Take a look at
the About box of your email client - I'll confidently state that it
does not list every single copyright holder for that package. The only
packages that can do that are tiny apps like the ones in GPE - even
those have four or five copyright holders and not all of those hold
copyright on all the files in these tiny packages.

Do you have any concept of the amount of work that your claims about
debian/copyright would actually entail?

> You seem to be advocating that ‘debian/copyright’ is not, itself, making
> any specific copyright claims beyond “some combination of these
> holders, these years, and these files, have some set of copyright
> associations;

Absolutely.

> but this file isn't going to tell you what those
> associations actually are”. If this interpretation is correct, I
> certainly can't see how it follows from the existing requirements for
> that file. I also can't see how such a claim is useful enough to bother
> writing in the file.

It probably isn't worth putting more than the "main" copyright holders
in debian/copyright; I've always thought that. Packages with very
active upstream communities can have thousands of copyright holders.
There is absolutely no point naming every single one of those in
debian/copyright - let alone directly attributing the files with the
names and causing enormous workloads for maintainers.
 
> I think that side of the argument does have some basis, though, and a GR
> on this issue seems like it might be more worthwhile than I previously
> considered.

Please, no GR. I'd rather the number of seconds rises to 6Q or
something equal to the total number of valid votes in the most popular
GR so far, than have yet another GR on something as trivial as the
number of lines in debian/copyright. That would be a horrible abuse of
the GR process IMNSHO.

Would you prefer that debian/copyright becomes 20-30x larger across
hundreds of packages, that nearly all maintainers of large packages
retire and that Debian release cycle lengths migrate to decades (or
more likely cease altogether as your distro would have no kernel, no
glibc, no compiler and no GUI)?

There has to be some logic and some consideration for maintainer
workload and that means that debian/copyright must be manageable, it
must be human-readable, it must be a file that does not need wholesale
revision every single time a new upstream release is made and it must,
therefore, support collation and brevity.

There needs to be a balance here - there is no good reason for
debian/copyright to list thousands of email addresses. There is even
less reason for that list to be precisely and accurately broken down to
email addresses per source code file.

-- 


Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
