
Re: DEP5: non-DFSG repackaging documentation

On Tue, Sep 14, 2010 at 07:17:59PM -0700, Russ Allbery wrote:
Jonas Smedegaard <dr@jones.dk> writes:
On Tue, Sep 14, 2010 at 06:23:12PM -0700, Russ Allbery wrote:

It seems like overkill to me, but I guess I don't really care. But if the source is only URLs, then for some of my packages I either need to omit it or duplicate Homepage, since I don't use any tarball release from upstream and therefore have no URL to point to. I package a Git tag instead, for which there's no URL syntax.

Or I guess just include the URL of the upstream instructions on how to use Git.

In my opinion you should point as accurately as possible to the location of the upstream source, i.e. refer to the upstream Git URL if that is what you base your packaging work on, and fall back to the toplevel Homepage of the upstream code project only if no narrower URL for the sources exists (e.g. if each new source release is in a different subfolder). Source URLs need not be http URLs, I believe.

For weird cases where there isn't a simple location for the sources (such as with a dead upstream with no active web site), I think a textual explanation is way more useful than any sort of URL. If Source requires URLs, I would just omit Source entirely in that case. It's already optional, so that seems like a supported approach.

I am fine with the Source field permitting free-form text, as long as the scope of that field is limited to covering the question "where did upstream release the source that was the main basis for this package?"

But that field name also isn't an accurate representation of what's going on when the packaging is based on a Git tag. No manipulation is involved other than running git archive against a tag.

Point is _you_ ran that command, not upstream. So _you_ created the tarball that Debian redistributes, not upstream.

True, you did not edit any of the content of any individual files inside the tarball, but you did edit the _tarball_ content: You created all the timestamps in there, for example.

I think you're stretching here.  :)

Let me put it another way: if the source is repackaged, I cannot easily verify the redistributed data (e.g. using an MD5 or SHA-nnn checksum), even if it contains the same files: the binary chunk called a tarball is no longer pristine.

Your more extreme example of not using a tarball at all, but generating one from Git, has the same issue: the binary result of running that "git archive" command will never be the same. It is irrelevant that the command is simple to invoke.
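
To illustrate the checksum point, here is a minimal sketch that assumes nothing beyond Python's standard tarfile and hashlib modules. It packs the same file contents twice with different timestamps and compares the checksums. The file names, contents and timestamps are made up for the example, and it does not reproduce what "git archive" itself does.

```python
import hashlib
import io
import tarfile


def build_tarball(files, mtime):
    """Pack {name: bytes} into an uncompressed in-memory tarball.

    Every member gets the same mtime, so the timestamp is the only
    thing that can differ between two builds of the same files.
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def sha256_hex(blob):
    """Return the SHA-256 checksum of a byte string as hex."""
    return hashlib.sha256(blob).hexdigest()


# Hypothetical file contents and timestamps, for illustration only.
files = {"README": b"hello\n", "src/main.c": b"int main(void) { return 0; }\n"}
upstream = build_tarball(files, mtime=1284500000)
repackaged = build_tarball(files, mtime=1284510000)

# Identical file contents, different timestamps: the checksums differ.
print(sha256_hex(upstream) == sha256_hex(repackaged))  # prints False
```

The files inside are byte-for-byte identical, yet a checksum published for one tarball cannot be used to verify the other.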

I don't like Source-Manipulation at all as a field name. I'd much prefer we use something else. My personal preference is for just making Source free-form, since I don't see any utility in having it be machine-parsable. Failing that, something neutral like Source-Details or Source-Description would work better for me.

I suspect that I still didn't explain my concern clearly enough:

What bothers me is if a single non-machine-readable field covers both "where did upstream release the source that was the main basis for this package?" and "what manipulation was done to upstream source, if any".

The reason this bothers me is that we then cannot extract a list of source packages (voluntarily using DEP-5) for which the redistributed source is not upstream pristine source.
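
A rough sketch of the kind of extraction I have in mind, in Python. It assumes (this is not settled anywhere) that a package declares repackaging simply by carrying a separate field in the header paragraph of its copyright file. The field name Source-Manipulation is just the one under discussion, and the package names and file contents are invented for the example.

```python
def parse_header(text):
    """Parse the first paragraph of an RFC 822-style file into a dict."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            if fields:
                break  # a blank line ends the header paragraph
            continue
        if line[0] in " \t" and current:
            # continuation line: append to the previous field
            fields[current] += "\n" + line.strip()
        elif ":" in line:
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = value.strip()
    return fields


def repackaged_packages(copyright_files, field="Source-Manipulation"):
    """Return the sorted names of packages whose header carries `field`.

    Assumption: the mere presence of the field means "not pristine".
    """
    return sorted(
        name
        for name, text in copyright_files.items()
        if field in parse_header(text)
    )


# Hypothetical packages and copyright headers, for illustration only.
sample = {
    "foo": "Name: foo\nSource: http://example.org/foo/\n",
    "bar": (
        "Name: bar\n"
        "Source: git://example.org/bar.git\n"
        "Source-Manipulation: tarball generated with git archive\n"
        " from an upstream tag\n"
    ),
}

print(repackaged_packages(sample))  # prints ['bar']
```

With a single free-form Source field, no such query is possible without a human reading each file.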

I believe that the Policy requirement of documenting stripped data is there to document when we "distort" upstream source in our redistribution of it. It might be stretching it to nitpick about datestamps, but it has concrete effects on the ability to verify authenticity.

I find it relevant that we document when what we redistribute is not what upstream distributes.

Sure, I agree with that.

Ok.  But it need not be machine-readable, in your opinion?

In your case upstream distributes Git data which we do not (yet) support redistributing.

In the case that I'm thinking of, upstream also distributes tarballs; I just choose not to use them for a variety of reasons.

Now you get me curious ;-)

So it makes sense to me that you document that for our users. Optionally - it is not mandatory to document that.

It's mandatory to document the origin of the upstream software.

I'm not disagreeing with you about the documentation requirements. I just don't like Source-Manipulation as a field name and don't see much point in requiring the Source field be URLs.

The naming is not important to me. What is important to me is somehow being able to indicate, in a machine-readable way, that "yes, I admit that I am redistributing this source package differently than how I got it from upstream."

In the end, I suppose this is all bikeshed painting and I'll use the field name no matter what it's called, though.

I feel it is not, but if you judge this as nitpicking, I shall stop.

 - Jonas

 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private
