
Re: request for Technical Committee ruling on Bug #109436



On Fri, Aug 24, 2001 at 06:49:01PM -0400, Raul Miller wrote:
> > So far:
> > 
> > Admin solution: tarballname,md5sum relationship is constant once
> > established.
> > 
> > Branden solution: tarballname,md5sum relationship varies based on
> > maintainers preference.

On Fri, Aug 24, 2001 at 07:27:36PM -0500, Branden Robinson wrote:
> Incorrect.
> 
> Branden solution: tarballname,md5sum varies if Debian Policy forces the
> maintainer to change it after it's been established.

Ok.

> > However, as I've already pointed out, the existing documentation could
> > easily be taken to imply such a requirement.
> 
> I don't believe this follows.

Are you willing to accept Wichert's statement about the implications
of your solution?

> > Right: you're saying, in essence, that the undocumented historic practices
> > of upstream represent an interface.
> > 
> > I'm asking: why should we consider these an interface rather than a bug?
> > 
> > Your response is that you like it that way, and that's the way it's been.
> 
> No, I'm saying that I had no way of knowing a priori that the old
> interface was actually a bug.  It was the interface that was exposed to
> me as a developer.

What does it mean if you have two different md5sums for the same
file name?  You've been relying on an undocumented practice for
your interpretation.

> You say it's a bug, I say it's an interface. Because the documentation
> was unclear, there may be no way to objectively decide the matter.

I think we can agree that it's not a documented interface.

> Appealing to the authority of the archive maintainers, past or
> present, doesn't do any good if they didn't communicate their
> unenforced requirements to the users of the archive.

I agree that more precise communication could have happened sooner.
Unfortunately, we're limited to the present for what we do now.

> Consider this analog:
> 
> master.h:
> int ftp_interface(int value);
> 
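> master.c:
> #include <stdio.h>
> #include "master.h"
> 
> int ftp_interface(int value) {
> #ifdef NEW_VERSION
>   /* Check enforced only once NEW_VERSION is defined. */
>   if (value < 0) {
>     fprintf(stderr, "Sorry man, that's crap.\n");
>     return 0;
>   }
> #endif
>   return 1;
> }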
> 
> At some point, the builders of this "master" library #define
> NEW_VERSION, but the library's handling of negative values was
> never documented anyplace.

What does it mean to have two official md5sums for the same file?

[Your pseudo code doesn't address this issue.]

> > But, I'm still willing to be convinced, if you're capable of doing so.
> > You can start by telling me what's wrong with "xfree86-sources", besides
> > "it's ugly, it's different, it's not what I want".
> 
> Debian User: "Why aren't all the other packages called <foo>-source?  I
> see this pine-src thing, is that the same thing?  Do I have to build
> XFree86 from scratch on my box just like pine?"

Not many users would confuse an orig.tar.gz with a deb, but let's
grant you this.

What do you think about: xfree86-clean?

> > The implementation was broken, because we had files that didn't match
> > their md5sums.  That bug has been fixed.  You've been relying on 
> > undocumented behavior.
> 
> How was I to know a priori that the behavior wasn't deliberate?  The
> only documentation I had were the Policy Manual, the Packaging Manual,
> and the Developers' Reference.
> 
> I've already explained how D.2.12 doesn't tell you that you can't upload a new
> Debian revision of a package with a new original source archive.  D.2.12
> doesn't address when .orig.tar.gz files should or should not be
> uploaded, just how to manage your .dsc and .changes files when you do or
> do not include .orig.tar.gz's.  C.3 doesn't address this either.

It doesn't, but it does imply that there's an issue here, since it talks
about the unique identity of a file.

> > The package maintainer determines what contents go in what file name,
> 
> No, actually Debian Policy decides a great deal of this automatically,
> as does C.3.

That's not a contradiction.

> In fact, it is *because* the Policy Manual is telling me what I can
> and cannot do that this change to the .orig.tar.gz was made in the
> first place.

Sure, and if you want some help seeing problems with our current policy,
consider http://cr.yp.to/compatibility.html ... of course, that point
of view assumes that the upstream authors are the package maintainers,
but hey, that's not such a bad idea, is it?

That said: if we do it your way, we break the release of all software
which has been built against 4.1.0.  If we do it the admin way we break
little or nothing.

> > but the package maintainer can't change their mind once the files
> > have been distributed (because at that point the package maintainer
> > no longer owns all instances of the file).
>
> I didn't change my mind. The Debian Policy Manual did it for me.

Take some responsibility for your own actions.

> > Once again: you're asking us to hide the problem.  If we do it your
> > way, we'll never be able to put up a page that says:
> > 
> > 	"Sorry, xfree86_4.1.0.orig.tar.gz contained code which
> > 	 we didn't have full rights to.  Please replace it with
> > 	 xfree86-sources_4.1.0.orig.tar.gz.  The code in question
> > 	 is not used in any debian binary packages."
> 
> Why do we need to?  We change the canonical version of
> xfree86_4.1.0.orig.tar.gz on the canonical site and tell the mirrors to
> mirror it.

And what about people with slow modems who only update occasionally?

Also, what about Wichert's point?

> If somebody tries to get "apt-get source xfree86" after this change
> happens, and they haven't apt-get updated yet, they might get told that
> the package is not available.

That's what happens if they try to do an apt-get source or apt-get
install after the package is purged but before the new one is in place.

After that: they wait for however many hours for the thing to download,
then get told that the download failed.
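
To make that failure concrete, here is a minimal sketch of the check
a download tool has to make against the index it already holds.  It is
not apt's actual code: md5_hex, verify_download and the stand-in
tarball contents are hypothetical names and data, and the index is
reduced to a plain name-to-md5sum mapping.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex md5 digest, the same form md5sum(1) prints."""
    return hashlib.md5(data).hexdigest()


def verify_download(index: dict, name: str, payload: bytes) -> bool:
    """True only if the downloaded bytes match the md5sum the index records."""
    return index.get(name) == md5_hex(payload)


# Stand-in contents (assumptions); the real tarballs are far larger.
OLD_TARBALL = b"original 4.1.0 sources"
NEW_TARBALL = b"4.1.0 sources with the problem file removed"
NAME = "xfree86_4.1.0.orig.tar.gz"

# A user who has not updated still holds the old name,md5sum pair.
stale_index = {NAME: md5_hex(OLD_TARBALL)}
# A user who has updated holds the new pair for the same name.
fresh_index = {NAME: md5_hex(NEW_TARBALL)}

# The mirror now serves the replaced file under the unchanged name.
stale_result = verify_download(stale_index, NAME, NEW_TARBALL)
fresh_result = verify_download(fresh_index, NAME, NEW_TARBALL)
```

The point of the sketch is that the check can only run once the whole
file has arrived, so the user with the stale index pays for the full
download before learning that it failed.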

> If our mirrors are operating correctly, they will not have Sources files
> that do not correspond to the contents of their archive mirror.

You're assuming that none of our mirrors run integrity checks along the
lines of what katie does.  That's not reasonable, because there are huge
race condition windows while stuff is getting transferred, and if it's not
installed in an orderly fashion we expose the users to more problems
than are necessary.

[Ok, we have mirrors which are running without such checks, but that
doesn't mean it's right.]

> > Valid situations like .dsc and .orig.tar.gz file are transferred
> > at different times.
> > 
> > Do I have to define "race condition" for you and lecture on the subject?
> 
> I get 404's from mirrors all the time, on files that comply with archive
> maintainers' requirements to a T.  It's aggravating.  I don't know if
> they feel this desynchronization is a problem; I'd be thrilled to see it
> fixed.  The problem is not made substantially worse by accommodating my
> request.

I disagree.

> Either mirrors are up to date or they are not.  

Or they're being updated, such that all integrity checks are satisfied
at each step of the way.
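
A rough sketch of what I mean by "each step of the way", under stated
assumptions: a mirror is modelled as a name-to-md5sum mapping, the index
users see is another such mapping, and "aaa" and "bbb" are placeholder
checksums.  The function names (consistent, staged_update) are mine and
do not describe katie or any real mirroring tool.

```python
def consistent(files: dict, index: dict) -> bool:
    """True if every name,md5sum pair in the index matches a file on the mirror."""
    return all(files.get(name) == md5 for name, md5 in index.items())


def staged_update(files: dict, old_index: dict, new_index: dict, new_files: dict) -> list:
    """Apply an update in three steps and report, after each step,
    whether the index users can see at that moment is still satisfied."""
    files = dict(files)
    checks = []

    # Step 1: copy the new files in; users still see the old index.
    files.update(new_files)
    checks.append(consistent(files, old_index))

    # Step 2: publish the new index.
    checks.append(consistent(files, new_index))

    # Step 3: remove files the new index no longer references.
    files = {name: md5 for name, md5 in files.items() if name in new_index}
    checks.append(consistent(files, new_index))
    return checks


OLD = {"xfree86_4.1.0.orig.tar.gz": "aaa"}

# New contents under a new name.
renamed = {"xfree86-sources_4.1.0.orig.tar.gz": "bbb"}
rename_checks = staged_update(OLD, OLD, renamed, renamed)

# New contents under the old name.
overwritten = {"xfree86_4.1.0.orig.tar.gz": "bbb"}
overwrite_checks = staged_update(OLD, OLD, overwritten, overwritten)
```

With a new file name every intermediate state passes.  With the name
reused, the state between the file copy and the index switch fails, and
no ordering of those two steps avoids that window.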

> I am not advocating that ftp-master be put into a half-assed state
> to accommodate my source package. Let's please not beg the question
> by asserting that a changed .orig.tar.gz along with updates to katie
> database is to be defined as a half-assed state. Otherwise I truly am
> wasting my time here.

I think that's exactly what I'm saying.

> > But you've said:
> >
> >    I don't think the burden is on me to explicate the difficulties
> >    that the archive maintainers have in accommodating my request,
> >    though I have attempted to summarize to the best of my knowledge.
> >
> > You're right. The burden's on me. But I don't have a good answer.
>
> Incorrect. The burden's on the archive maintainers. Are you an archive
> maintainer? (I don't actually know who is, beyond James Troup, Ryan
> Murray, and Michael Beattie.)

If we tell them to change what they're doing, we're responsible for the
implications of that decision.

> > > > Are you pretending that policy is the only relevant specification for
> > > > this context?
> > > 
> > > Can you quote any other specification that's on point?
> > 
> > You mean besides D.2.12?
> > 
> > I suppose documentation on what md5sum means is relevant.
> 
> No, it's not.  Please see above.  D.2.12 doesn't say a maintainer can't
> change the contents of an .orig.tar.gz without changing the filename.
> No piece of Debian documentation that I have been able to find,
> normative or descriptive, says this.

D.2.12 defines the Files: field, and specifies an md5sum for
every file name.

Policy also says that a package upload can reuse source from an
earlier upload.

There are at least three interpretations for this:

[1] Each released package has a unique source tarball, and if it's omitted
the source tarball from the previous release is copied into the package
directory.

	Here, file name doesn't matter, but we introduce lots of user
	problems because they can't put two different packages in the
	same directory.  Also, we have a lot of problems with mirroring
	because mirroring is going to generate fresh copies of each
	source tarball [no symbolic or hard links in this context].

[2] Each distinct source tarball is distinguished by its name.

	That's what the ftp admins have implemented.

[3] Later packages can overwrite an earlier package's source tarball.

	Here, we have md5sums which we can't rely on some of the time.

	This hits people who are downloading files manually, as well
	as automated software which requires proper md5sums for all
	dependent files before making a package available in the
	archive.

	That's what you seem to be advocating.
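
The difference between [2] and [3] can be sketched in a few lines.
This is an illustration under assumptions, not the archive software:
accept_upload, stale_references and NameConflict are hypothetical
names, "aaa" and "bbb" are placeholder md5sums, and the -3 revision
number is only an example.

```python
class NameConflict(Exception):
    """An upload tried to reuse a file name with different contents."""


def accept_upload(known: dict, dsc_files: dict, allow_overwrite: bool = False) -> dict:
    """Merge the Files: entries of a new upload into the known
    name-to-md5sum mapping.  Under [2] (allow_overwrite=False) a name
    keeps its md5sum forever; under [3] a later upload may replace it."""
    for name, md5 in dsc_files.items():
        if name in known and known[name] != md5 and not allow_overwrite:
            raise NameConflict(name)
    merged = dict(known)
    merged.update(dsc_files)
    return merged


def stale_references(released: dict, known: dict) -> list:
    """List (dsc, file) pairs whose recorded md5sum no longer matches
    what the archive now holds under that file name."""
    return [(dsc, name)
            for dsc, files in sorted(released.items())
            for name, md5 in sorted(files.items())
            if known.get(name) != md5]


TARBALL = "xfree86_4.1.0.orig.tar.gz"
released = {"xfree86_4.1.0-1.dsc": {TARBALL: "aaa"}}
known = {TARBALL: "aaa"}
new_upload = {TARBALL: "bbb"}  # same name, different contents

# Interpretation [3]: the overwrite is accepted, and the old .dsc goes stale.
after_overwrite = accept_upload(known, new_upload, allow_overwrite=True)
broken = stale_references(released, after_overwrite)

# Interpretation [2]: the same upload is refused outright.
try:
    accept_upload(known, new_upload)
    refused = False
except NameConflict:
    refused = True
```

Under [3] every copy of the earlier .dsc, wherever it has been
distributed, now names an md5sum that nothing in the archive satisfies.
Under [2] the conflict is caught at upload time instead.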

> > All instances of the information in the .dsc and .changes files from
> > 4.1.0-1 must be purged from all machines [in the entire world] which
> > store this information.
> 
> Pretty easy if we nuke 4.1.0-[12] from the archive. If the mirroring
> software is sound and there aren't network problems, it's a mirror
> site admin's responsibility to make sure that his mirror stays in sync
> with the reference site.

I don't think it's as easy as you say.

> For resources other than mirrors, I've explained why apt-get source
> doesn't pose a problem.

And I've explained why it does (modem users, conserving bandwidth,
wastes umpteen hours of download time -- remember that in much of the
world bandwidth is limited and expensive).

> > Again, this has to happen on all machines with debian sources before
> > you can safely propagate the new file.
>
> I don't see why, unless the mirrors refuse to accept the .orig.tar.gz
> on the reference site as canonical. As I said, I get 404's on .deb
> downloads with apt all the time.

Does my sketch of an idea, above, shed any light here?

> > > You said I was violating Policy.  I pointed out that I wasn't.
> > 
> > You did no such thing.  You talked at length without bringing up
> > the issue we're talking about.
> 
> Then I guess you'll have to remind me.  These mails have gotten huge.

Discussion of D.2.12 -- you spent many paragraphs not talking about
the case of what it meant in the context of two different packages,
two different source packages, same file name.  But context is what this
discussion is all about.

... skipping ...

> If the Technical Committee will grant me immunity from prosecution for
> this violation of Policy, I'll go away a happy man, and fix the problem
> in 4.2.0.

For that, we'd need a lot more detail on the DFSG violation.  

Can you provide a pointer for the bad patch license?  Also, wasn't patch
written by Larry Wall?  You might just try asking him for permission to
distribute it under some DFSG-compliant license.

> > > Furthermore, as you are exhorting me to refrain from using
> > > hypotheticals or citing precedent in my argument, and stick to
> > > technical grounds, I'd ask you to refrain from arguing abstractions
> > > when concrete discussion of the archive maintenance software will
> > > suffice to clarify the present situation.
> > 
> > You've misunderstood what I wrote.  If you think that's what I said,
> > please quote back to me the relevant paragraphs.
> 
> "It's a hypothetical one.  It's not one for this committee unless
> other developers have problems with it."

You had asked if a problem was a technical problem.  I chose not to
answer that question, but to instead identify what I thought were the
salient points of what you were asking about.

> "we'll probably make our decision on technical grounds, and so far
> you've not given any for your proposal.  Instead, you've mostly been
> focusing on issues of precedent."

You've mixed up the relevant paragraphs here -- even chopping up a
sentence in the middle.  The actual quote was:

   Also, consider that if we do take this on, we'll probably make our
   decision on technical grounds, and so far you've not given any for
   your proposal.

   Instead, you've mostly been focusing on issues of precedent.  Precedent,
   to us, mostly matters in the context of questions like "what breaks?",
   "what does this enable?", and "what do the specs say?"

Note that I wasn't saying anything is wrong with precedent -- I was
saying that you weren't presenting good reasons for your proposal.

> Maybe "exhortations" is too strong a word.  I interpret the above to
> mean "these types of argument aren't convincing me, try something else".

Ok.

> If I expect to persuade you, it's a waste of my time to apply approaches
> that you've already told me you'll reject.

If you sat down and thought through the issues of archive integrity
over a distributed set of web archives, such that you could show how to
incrementally maintain archive integrity, then I'd be very interested
in what you had to say.

But if what you say shows that you've not even considered the potential
problems, and then you say [paraphrasing] "it's not my responsibility,
this is the way I've always done it and I like it this way" ... well...
um...  I guess that class of approach doesn't seem to solve much
of anything.

> > > Where, indeed, did I even *mention* worst-case scenarios?  Why are you
> > > concerned about them?  Is it your preference that the archive
> > > maintainers be required to act manually *only* in worst-case scenarios?
> > 
> > Ideally: yes.
> 
> I conclude from the above that you feel it is unreasonable to expect the
> archive maintainers to do anything at all (aside from let the da-katie
> machine churn) when a license or copyright problem is found in an
> .orig.tar.gz.  If this is incorrect, under what specific circumstances
> *do* you expect them to act?  How quickly is a package maintainer
> supposed to react when informed of a license or copyright problem in a
> source package?  Is putting off the resolution until the next upstream
> release acceptable?

That's not an ideal situation, obviously.  Here, I expect them to flush
out the bad packages.  Depending on the violation, we might choose to wait
until a replacement is available (like we're doing with this dead version
of patch), or we might choose to break things left and right in a panic.

I'd like to at least read the copyright on this old patch before saying
much more on how slow it's reasonable to be here.  If you can provide a
pointer, I'll go look it up.  Otherwise, I'll need to find a few hundred
megs of free space and plow through the sources myself.

> > Personally: I'm not going to make a proposal before I've exhausted
> > the possibility that you're capable of providing us with some actual
> > meaningful information relevant to this topic.
> 
> I have provided you with a very great deal of information, and you seem
> to regard most of it as worthless.  So either I'm being evasive, you're
> not asking the right questions, or there is no information I can give
> you that you would regard as meaningful.

Or some similar class of misunderstanding -- agreed.

> > And, on the other hand, if the entire matter can be cleared up with a
> > couple of bug report filings, I'm not sure I see what we are deciding on.
> 
> I filed one against ftp.debian.org.  The maintainers didn't want to fix
> it.

Quite often, if an issue doesn't require a code change it still requires
a documentation change.  Sometimes it's helpful to point that out
explicitly.

Thanks,

-- 
Raul


