Re: The copyright checking question

One of the things that attracts me to Debian the most is its consistency. We
all tend to follow a standardized policy and have tools to make sure the policy
is followed. This means we've collectively built a system that we all
(generally) agree on.

The benefit to this is that I'm still able to run almost every single system of
mine (32 of 34) without SysD. I may occasionally need to copy/paste/tweak an init
script, but that's it. (... he said, before looking at the last GR results)

These standards, gradual changes, etc. also mean it's /usually/ extremely easy
and safe to do upgrades between major Debian releases.

On Tue, 31 Dec 2019 23:39:32 -0600
John Goerzen <jgoerzen@complete.org> wrote:

> On Mon, Dec 30 2019, Steven Robbins wrote:
> > In another thread, Russ Allbery makes a salient observation [1]:
> >
> >     Requiring ftpmasters to [review debian/copyright before accepting an  
> >     upload]  is a choice that Debian has made.  Maybe it's the right choice,
> >     but other choices exist, and other entities make different choices.

Another major thing I like about Debian is its review process. I know it's
currently a pain point, but having the review step means that we can be
reasonably sure what software on our systems is legally free.

Sure, in practice, that hasn't been a problem, but let's imagine/pretend evil
corporations exist... Microsoft decides Gitea is proving too much competition
to GitHub. They take a look through the source and find a number of unlicensed
modules and offer the original creator(s) some money for them. They now have the
ability to stick their own license on it, call it their own, and sue anyone
using that module, which would include Gitea and Gogs.

I spent >50hr/wk for over half a year attempting to get Gitea packaged for
inclusion in Debian main. Working through that showed me just how often
incorrectly licensed or unlicensed work winds up in larger projects. I also
found that code copying is very common, making it nearly impossible to update
or track.

[Fortunately, a majority of project maintainers are more than happy to find out
their library was useful and quickly add an open source license. A few don't,
but the majority do.]

Here's an example of what sort of headache that can look like:
- https://github.com/go-gitea/gitea/pull/2241/files
- https://github.com/go-gitea/gitea/pull/2383/files

When I install something from Debian main, I get to have a reasonable level of
confidence that I can't wind up a victim of this sort of problem. As someone
who's been involved in copyright conflicts, this is something that is extremely
valuable to me.

> > Russ goes on to list several alternative strategies, concluding with
> >
> >     But I do think it's worth occasionally explicitly asking the question and
> >     then making an intentional choice, rather than assuming we're obligated to
> >     continue doing what we're doing.
> >
> > I agree.  This seems to me something worthy of a general resolution on, since 
> > these discussions pop up every 6-12 months and don't go anywhere.  I don't 
> > have the stamina to go through with proposing a resolution but I hope others 
> > will.  

I don't think this copyright/license/qa check is a "legally
responsible" /requirement/. In some ways, we're acting like a web hosting
company, but for packages. We have reasonable efforts (policies, auto-reject,
peer-review, etc.) to keep "bad" packages out and if we're made aware of
something that's problematic, we can remove it at the request of any cease &
desist.

This extra step probably doesn't protect the distribution, but it /does/
protect its users and it protects the people uploading the packages.

> I'm not sure about the GR, but these are exactly the sorts of questions
> and ideas that need to be raised and discussed.

I'd prefer to hold off on a GR.

> It is particularly odd to me that there is absolutely no review required
> to introduce code to Debian in general (particularly if a package is
> already in sid), but we have this one-time thing.  I agree with Russ;
> maybe it's needed, but it's high time we give it careful consideration
> again.

I think the cause-effect relationship might be backward here. From my
observation, the number one reason we don't do license checks with updates is
because we don't have the capacity for just what's handled in NEW, and part of
that is related to burn-out.

Sure, we could do away with this amazing bit of peer-review, but I strongly
believe that the cost is too great.

... We might be known as the bickering distro, but we bicker because we care
and because we let passion fuel our emotions.

The options I see:
1) Stop doing this review step
   ^ I reviewed an upload today that had extra DLL files and the entire .git/
   ^ Ask me about Gitea...
2) Get more DDs volunteering their time to be on the team
   ^ yeah... not gonna happen, especially not with the learning curve
3) Improve the tools so uploads can be reviewed more efficiently
   3a) Decreasing review burden and time-block commitment
   3b) Perhaps making it more exciting to be on the team?
4) Keep doing what we're doing
   ^ and keep getting what we're getting

Personally, I like #3. I've been trying to think of ways to improve the process
since the day I became a Trainee. Lately, I've found the motivation to start
turning ideas into code, partly fueled by Mo Zhou's discussion. I started
talking to him about my plans to see where our thoughts overlap.

For the moment, I think I'd prefer to avoid public discussion of the plan, just to
prevent lots of wasted discussion before getting it out the door. If you'd like
to discuss it because you're interested in contributing source, then I'd love
to chat. A good front-end web dev would be extremely helpful, as would someone
who understands dak very well.

As for the follow-up checks, well... I /might/ have a plan.
The automated tool I'm working on is going to provide a summary of certain
scans, including preliminary license findings. I picture a future where at least
a few of these checks can become part of a dak auto-reject rule, once they've
been fine-tuned a bit. This would introduce a way to verify that a package is
still very likely to be legally free.
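
As a rough illustration only (this is not the actual tool, whose design isn't
described in this message), a preliminary scan of that kind might look like the
Python sketch below. The rule sets, the function names scan_tree() and
should_flag(), and the flagging policy are all assumptions invented for the
example; nothing here reflects a real dak interface.

```python
import os
import re

# Hypothetical rule sets, chosen only for illustration.
SUSPECT_DIRS = {".git", ".svn", ".hg"}
SUSPECT_EXTENSIONS = {".dll", ".exe", ".so", ".jar"}
SOURCE_EXTENSIONS = {".go", ".py", ".c", ".h", ".js"}
LICENSE_HINT = re.compile(r"SPDX-License-Identifier|Copyright|License",
                          re.IGNORECASE)


def scan_tree(root):
    """Walk an unpacked source tree and collect preliminary findings."""
    findings = {"vcs_dirs": [], "binaries": [], "no_license_hint": []}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            if name in SUSPECT_DIRS:
                full = os.path.join(dirpath, name)
                findings["vcs_dirs"].append(os.path.relpath(full, root))
        # Do not descend into version-control metadata.
        dirnames[:] = [d for d in dirnames if d not in SUSPECT_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            ext = os.path.splitext(name)[1].lower()
            if ext in SUSPECT_EXTENSIONS:
                findings["binaries"].append(rel)
            elif ext in SOURCE_EXTENSIONS:
                # Only the top of the file is checked for a license hint.
                with open(path, "r", encoding="utf-8",
                          errors="replace") as handle:
                    head = handle.read(2048)
                if not LICENSE_HINT.search(head):
                    findings["no_license_hint"].append(rel)
    for key in findings:
        findings[key].sort()
    return findings


def should_flag(findings):
    """Assumed policy: stray VCS directories or prebuilt binaries get flagged."""
    return bool(findings["vcs_dirs"] or findings["binaries"])
```

A missing license hint is only reported, not flagged, since a header check like
this is far too crude to reject on; a reviewer would still need to look.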

I apologize for the lack of details, but I'm digging in and have some down time
from work. I'd like to make use of that time to just build. :)

Michael Lustfield
