
Re: policy regarding redistributable binary files in upstream tarballs



On Fri, 2014-11-21 at 17:39 +0800, Paul Wise wrote:
> On Fri, Nov 21, 2014 at 5:25 PM, Matthias Urlichs wrote:
> 
> > These days, they might just push their repo to github and let its machinery
> > generate the tarballs, which TTBOMK aren't guaranteed to be 1:1 identical to
> > another tarball of the same commit that's downloaded a week later. Or a
> > year.
> 
> I tried downloading a tarball just now and got identical results. I
> guess they are just using git archive, which produces identical
> results for me too.
> 
> https://github.com/whohas/whohas/archive/0.29.tar.gz

It doesn't matter whether git supplies a tool that produces reproducible
tarballs.  If there were a target in debian/rules responsible for
generating the tarball, something like this would work:

pristine-source:
	rm -rf debian/pristine-source.tmp
	mkdir debian/pristine-source.tmp
	git clone http://... debian/pristine-source.tmp
	cd debian/pristine-source.tmp && \
	  git checkout $(get commitish from debian/changelog somehow)
	dpkg-pristine-source --format=git debian/pristine-source.tmp

The spec for dpkg-pristine-source is roughly:

    - Inputs: source directory(s) and their formats.
    - Outputs:
        > .orig.tar, and
        > hashes written to debian/pristine-source.hashes

Which is not what I said before, but this is WIP.
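
To make that spec a little more concrete, here is a rough sketch of
what such a tool could do.  dpkg-pristine-source does not exist, so the
function name, its argument order, and the choice of one sha256 line per
file are all assumptions of mine.  It relies on GNU xargs, tar and
sha256sum, and handles only a single source directory.

```shell
# Hypothetical sketch only; dpkg-pristine-source is not a real dpkg tool.
# Usage: pristine_source SRCDIR ORIG_TAR HASHFILE
pristine_source() {
    src=$1 tarball=$2 hashes=$3
    list=$(mktemp) || return 1
    # Fixed, locale-independent file order; VCS metadata is left out.
    (cd "$src" && find . -path ./.git -prune -o -type f -print |
        LC_ALL=C sort) >"$list"
    # One "hash  path" line per file, the format `sha256sum -c` reads back.
    (cd "$src" && xargs -r -d '\n' sha256sum <"$list") >"$hashes"
    # Pack exactly the listed files, in the listed order.
    tar -cf "$tarball" -C "$src" -T "$list"
    rm -f "$list"
}
```

Anyone holding the hash file could then check an unpacked tree with
"sha256sum -c" run from the top of that tree.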

Now that I think about it, dpkg-pristine-source is possibly overkill.
Any repeatable process would do.  Even:

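  # List every non-directory outside .git, in a locale-independent order.
  find debian/pristine-source.tmp \
    -path debian/pristine-source.tmp/.git -prune -o ! -type d -print | \
    LC_ALL=C sort >debian/x
  # Name the tarball <package>_<upstream version> from the first changelog line.
  cpio -o -H ustar <debian/x | \
    gzip -9n >../$(sed 's/^\([^ ]*\) (\([^)]*\)-[^-)]*).*/\1_\2/;q' debian/changelog).orig.tar.gz
  rm -f debian/pristine-source_sha256.hash
  # Hash the concatenated file contents, not the tarball itself.
  xargs -d '\n' < debian/x cat | sha256sum | \
    sed s/-/URL/ >debian/pristine-source_sha256.hash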

All of the above was written after a couple of glasses of wine and has
never been tested.  Regardless, I hope it demonstrates the point: it is
possible to compute an immutable hash if upstream provides a
reproducible way to retrieve the same sources.  As far as I know every
SCM post CVS does.

That was a statement of the obvious I guess.  But it shows that what I
am proposing is not pie-in-the-sky.  It's achievable, and not even that
hard.
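
For what it's worth, checking a recorded hash against a fresh checkout
could look roughly like the following.  This is a sketch under my own
assumptions: the function names are made up, and unlike the pipeline
earlier in this mail it mixes each file name into the hash so that a
rename changes the result.  File names containing newlines are not
handled, and symlinks are skipped.

```shell
# Hypothetical helper: print one SHA-256 covering the names and contents
# of all regular files under directory $1, ignoring .git.
content_hash() {
    (
        cd "$1" || exit 1
        find . -path ./.git -prune -o -type f -print |
            LC_ALL=C sort |
            while IFS= read -r f; do
                printf '%s\n' "$f"   # the file name goes into the hash
                cat "$f"             # followed by the file contents
            done |
            sha256sum | cut -d' ' -f1
    )
}

# Succeeds only if the tree at $1 still hashes to the value stored in file $2.
verify_pristine() {
    [ "$(content_hash "$1")" = "$(cat "$2")" ]
}
```

Recording would be "content_hash dir >file" at packaging time, and
"verify_pristine dir file" whenever someone re-fetches the source.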
