
Enhancing 3.0 (git) source package format



Hi,

It seems the current format of the 3.0 (git) style source packages
could be improved somewhat.

Currently, dpkg-source seems to create a tarball of the .git tree
under the working directory (with some exclusions).

This leads to the following issues:

• Git keeps unreferenced objects around in the local repository for a
while in case you want to undo an accidental deletion of e.g. a commit
or a branch. These deleted objects end up in the generated source
tarball (I checked).

• Git is good at creating an efficiently packed archive of objects. A
tarball simply stores the objects in whatever order they appear under
.git/objects and gzips the result, so you lose the advantage of having
e.g. various revisions of the same file next to each other when
compressing. You also lose the delta encoding Git applies between file
revisions in a pack.

• Extra files, such as COMMIT_EDITMSG, hooks and config are added to
the source package. It appears this may be intentional, since they
aren’t excluded by dpkg-source -b, but should they *really* go into
the source package? Whenever your real Git repository is cloned, they
aren’t retained anyway.

In addition, dpkg-source explicitly has to make the hooks
non-executable for security reasons. In my *humble* opinion, we might
as well not include them in the first place.
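
A rough way to check the first point yourself is sketched below. It
assumes a throwaway repository; the file name, commit messages and
identity are made up for illustration. It discards a commit and then
shows that the objects are still stored under .git, which is what a
plain tarball of the directory would pick up.

```shell
#!/bin/sh
# Sketch: a discarded commit stays in the object database for a while.
set -e

repo=$(mktemp -d)            # throwaway repository (hypothetical path)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.org"

echo one > file
git add file
git commit -q -m "first"

echo two > file
git commit -q -a -m "second, about to be discarded"
discarded=$(git rev-parse HEAD)

# Drop the second commit from the branch; only the reflog remembers it.
git reset -q --hard HEAD~1

# Count objects that no ref points to (reflogs ignored).
unreachable=$(git fsck --unreachable --no-reflogs 2>/dev/null |
  grep -c '^unreachable' || true)

# The discarded commit is nevertheless still present in .git/objects.
still_present=no
git cat-file -e "$discarded" && still_present=yes

echo "unreachable objects: $unreachable"
echo "discarded commit still stored: $still_present"
```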

My proposal:

When building a source package, have dpkg-source run something like

git for-each-ref --format='%(refname)' | \
  grep -Ev '^refs/(remotes/|stash$)' | \
  xargs -d'\n' git bundle create ../PKG_VER.git-bundle HEAD

and reference PKG_VER.git-bundle directly from the dsc file.

The ‘git bundle’ command takes a list of refs (basically branches and
tags) and creates a file that contains a list of the refs and an
efficiently packed archive of the objects the refs (recursively) point
to.

No unreferenced objects or anything extraneous is added to the pack.
It contains exactly the data you’d receive when cloning the original
repository.
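
As a hedged illustration of that claim, the sketch below runs the
pipeline from above in a throwaway repository and then inspects the
result. The bundle name example.git-bundle, the tag v1 and the
directory names are placeholders, not anything dpkg-source uses. It
relies on GNU xargs for the -d option, as the original pipeline does.

```shell
#!/bin/sh
# Sketch: a bundle carries the listed refs but not discarded objects.
set -e

work=$(mktemp -d)            # scratch directory (hypothetical)
cd "$work"
git init -q src
cd src
git config user.name "Example"
git config user.email "example@example.org"

echo one > file
git add file
git commit -q -m "first"
git tag v1

echo two > file
git commit -q -a -m "discarded"
discarded=$(git rev-parse HEAD)
git reset -q --hard HEAD~1   # the commit is now unreferenced

# The pipeline from the proposal, with a placeholder bundle name.
git for-each-ref --format='%(refname)' |
  grep -Ev '^refs/(remotes/|stash$)' |
  xargs -d'\n' git bundle create ../example.git-bundle HEAD

# Check the bundle and list the refs it advertises.
git bundle verify ../example.git-bundle
heads=$(git bundle list-heads ../example.git-bundle)
echo "$heads"

# Unpack by cloning, then look for the discarded commit in the clone.
git clone -q ../example.git-bundle ../unpacked
if git -C ../unpacked cat-file -e "$discarded" 2>/dev/null; then
  leaked=yes
else
  leaked=no
fi
echo "discarded commit present in clone: $leaked"
```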

When unpacking the archive, have dpkg-source run

git clone PKG_VER.git-bundle PKG_VER

If a maintainer receives changes made by a contributor in this source
package format (say, by downloading an NMU from the archive), she can
simply ‘git fetch’ from the git-bundle file itself to her existing
repository and use exactly the same workflow she’d use if merging from
the contributor’s public repository over the network.
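
That workflow might look roughly like the sketch below. The two
repositories, the identities and the file nmu.git-bundle are invented
stand-ins for the maintainer's repository and the bundle shipped in an
NMU; a real bundle would normally carry more refs than HEAD.

```shell
#!/bin/sh
# Sketch: merging a contributor's work straight from a bundle file.
set -e

work=$(mktemp -d)            # scratch directory (hypothetical)
cd "$work"

# The maintainer's existing repository.
git init -q maintainer
cd maintainer
git config user.name "Maintainer"
git config user.email "maintainer@example.org"
echo one > file
git add file
git commit -q -m "initial packaging"
cd ..

# The contributor clones it, commits a fix and ships a bundle.
git clone -q maintainer contributor
cd contributor
git config user.name "Contributor"
git config user.email "contributor@example.org"
echo fix >> file
git commit -q -a -m "NMU: fix"
nmu_commit=$(git rev-parse HEAD)
git bundle create ../nmu.git-bundle HEAD
cd ..

# The maintainer fetches from the bundle as if it were a remote.
cd maintainer
git fetch -q ../nmu.git-bundle HEAD
fetched=$(git rev-parse FETCH_HEAD)
git merge -q --ff-only FETCH_HEAD

echo "fetched $fetched, now at $(git rev-parse HEAD)"
```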

One sample doesn’t make a statistic, but here’s a quick comparison
using something I’m packaging at the moment:

132K    miniupnpd_1.4-0~local.git-bundle
196K    miniupnpd_1.4-0~local.git.tar.bz2
192K    miniupnpd_1.4-0~local.git.tar.gz
192K    miniupnpd_1.4-0~local.git.tar.lzma

Regards,
-- 
Jοhan Kiviniemi  http://johan.kiviniemi.name/

