[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [PATCH] proposed v3 source format using .git.tar.gz

On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
> I've been working on making dpkg-source support a new source package
> format based upon git. The idea is that a source package has only a
> .dsc and a .git.tar.gz, which is just a git repo.

So, I can't stand git's user interface. I generally try to avoid making
a huge issue of this since it seems to be massively political on places
like Planet at the moment, there seems to be a certain amount of
confusion of people's personal opinions with that of their employers
going on, and in any case I normally find that revision control
flamewars have negative utility. (I don't think it's terribly relevant
to this discussion why I prefer not to use git, and I don't want to
sidetrack the thread with that; I just wanted to present an existence
case of somebody who doesn't want to switch to .git.tar.gz and yet
doesn't want to stay with .orig.tar.gz and .diff.gz forever.)

Still, this work looks pretty cool, and I'd like to be able to make use
of it despite avoiding git whenever I can. I noticed that you'd
helpfully structured your changes such that it would be possible to plug
in a different revision control system, so I wrote a module to support
bzr. The patch is attached to this e-mail, and I'd appreciate comments;
if this work is merged into dpkg I'd be very happy if my addition were
merged too. There are probably some improvements to be made, but it was
really utterly trivial; I was impressed that I didn't have to touch
anything else beyond plugging in a new module. Ironically, of course, I
did use git to create it. :-)

While working on this I was thinking about general issues with the
format. It seems to me that it's suboptimal not to ship a working tree.
I know you sort of address this in the wiki FAQ, and I realise that
there are space advantages to only shipping the VCS data. However, I'd
like to try to persuade you otherwise if I can. My concerns are:

  * Users will need to have the VCS installed in order to inspect the

    It's true that this is no worse than dbs or dpatch or whatever, and
    in fact it's better because dpkg-source will take care of the
    unpacking step automatically. Still, I do think it is a downside; we
    do still ship /usr/share/doc/debian/source-unpack.txt, and people do
    unpack Debian source packages on other systems from time to time and
    inspect them (I certainly do the same in the other direction with
    source RPMs, and curse their complexity). Plus, if the VCS fails to
    reconstitute a working tree for some unforeseen reason (maybe you
    have a broken installation of it, or maybe there was some version
    skew, or something else), then you're rather screwed. Tarballs are
    nice and simple and, assuming they were transferred accurately,
    hardly ever break in ways that make it impossible for you to extract
    the files.

  * Buildds will need to have the VCS installed in their base system.

    Possibly a minor concern since sbuild does the unpack in the base
    rather than in the chroot, but it's there nevertheless. Every
    derivative distribution that runs its own buildds will need to take
    care of this too.

  * Some source packages want to ship non-VCS-managed files.

    It's very common for source packages to include autogenerated
    objects like configure, Makefile.in, etc. Whether to check these
    into a VCS is a somewhat religious matter (as acknowledged by the
    gettext info documentation, for instance), and personally I lean
    towards checking them in (with a few exceptions) just because it
    makes it easier to see when they change and keep an eye out for
    oddities, but I know that a lot of developers prefer to keep these
    outside their VCS. Shipping a working tree would make it easier to
    handle cases like this.

There are two obvious modifications to Joey's proposal that would allow
shipping a working tree. The first is just to include the working tree
in the .$VCS.tar.gz object. This has the advantage of being trivial to
implement on top of the current code: the git module would need to do a
'git checkout' after copying the .git, and the bzr module just wouldn't
call 'bzr remove-tree'.

The second possibility seems to me to be more flexible, though, and
probably not all that hard to implement: build both a .tar.gz
(containing the working tree) and a .$VCS.tar.gz, and teach 'dpkg-source
-x' to unpack the tree given at least one of these. This would allow
various interesting possibilities such as:

  * Buildds could just fetch the .tar.gz; they have no need of the VCS
    data. Users who just want to inspect the current version of the
    source and not change it might want to do this too, using (say)
    'apt-get source --no-vcs package'.

  * Developers on slow connections could say 'apt-get source --vcs-only
    package' to fetch just the .$VCS.tar.gz, with the documented caveat
    that it would be just like checking the source out of a VCS in that
    you might have to recreate some autogenerated files.

  * Space-constrained mirrors could conceivably exclude the VCS data if
    they had to, though we probably wouldn't encourage this.

  * Derivative distributions who are slow to upgrade their dpkg-source
    could still interoperate to some degree.

  * Tools like mc, vim's tar plugin, or
    http://www.mirrorservice.org/sites/ftp.debian.org/debian/ could
    still be used straightforwardly and without modifications to look
    inside source packages on mirrors.

These seem to me to be non-trivial advantages that outweigh the space
costs of shipping around the working tree. I'd be willing to have a go
at implementing this once I've had a bit more sleep.

Does any of this make sense?


Colin Watson                                       [cjwatson@debian.org]
>From 7c83b8a1f088db80512182930bb9d6ed8dda5179 Mon Sep 17 00:00:00 2001
From: Colin Watson <cjwatson@debian.org>
Date: Sat, 6 Oct 2007 17:49:04 +0100
Subject: [PATCH] initial support for .bzr.tar.gz based on Joey's git work

 scripts/Dpkg/Source/VCS/bzr.pm |  132 ++++++++++++++++++++++++++++++++++++++++
 1 files changed, 132 insertions(+), 0 deletions(-)
 create mode 100644 scripts/Dpkg/Source/VCS/bzr.pm

diff --git a/scripts/Dpkg/Source/VCS/bzr.pm b/scripts/Dpkg/Source/VCS/bzr.pm
new file mode 100644
index 0000000..3589d4a
--- /dev/null
+++ b/scripts/Dpkg/Source/VCS/bzr.pm
@@ -0,0 +1,132 @@
+# bzr support for dpkg-source
+# Copyright © 2007 Colin Watson <cjwatson@debian.org>.
+# Based on Dpkg::Source::VCS::git, which is:
+# Copyright © 2007 Joey Hess <joeyh@debian.org>.
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# GNU General Public License for more details.
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+package Dpkg::Source::VCS::bzr;
+use strict;
+use warnings;
+use Cwd;
+use File::Find;
+use Dpkg;
+use Dpkg::Gettext;
+push (@INC, $dpkglibdir);
+require 'controllib.pl';
+sub import {
+	foreach my $dir (split(/:/, $ENV{PATH})) {
+		if (-x "$dir/bzr") {
+			return 1;
+		}
+	}
+	main::error(sprintf(_g("This source package can only be manipulated using bzr, which is not in the PATH.")));
+sub sanity_check {
+	my $srcdir=shift;
+	if (! -d "$srcdir/.bzr") {
+		main::error(sprintf(_g("source directory is not the top directory of a bzr repository (%s/.bzr not present), but Format bzr was specified"), $srcdir));
+	}
+	# Symlinks from .bzr to outside could cause unpack failures, or
+	# point to files they shouldn't, so check for and don't allow.
+	if (-l "$srcdir/.bzr") {
+		main::error(sprintf(_g("%s is a symlink"), "$srcdir/.bzr"));
+	}
+	my $abs_srcdir=Cwd::abs_path($srcdir);
+	find(sub {
+		if (-l $_) {
+			if (Cwd::abs_path(readlink($_)) !~ /^\Q$abs_srcdir\E(\/|$)/) {
+				main::error(sprintf(_g("%s is a symlink to outside %s"), $File::Find::name, $srcdir));
+			}
+		}
+	}, "$srcdir/.bzr");
+	return 1;
+# Called before a tarball is created, to prepare the tar directory.
+sub prep_tar {
+	my $srcdir=shift;
+	my $tardir=shift;
+	sanity_check($srcdir);
+	my $old_cwd=getcwd();
+	chdir($srcdir) ||
+		main::syserr(sprintf(_g("unable to chdir to `%s'"), $srcdir));
+	# Check for uncommitted files.
+	# To support dpkg-source -i, remove any ignored files from the
+	# output of bzr status.
+	open(BZR_STATUS, '-|', "bzr", "status") ||
+		main::subprocerr("bzr status");
+	my @files;
+	while (<BZR_STATUS>) {
+		chomp;
+		next unless s/^ +//;
+		if (! length $main::diff_ignore_regexp ||
+		    ! m/$main::diff_ignore_regexp/o) {
+			push @files, $_;
+		}
+	}
+	close(BZR_STATUS) || main::syserr("bzr status exited nonzero");
+	if (@files) {
+		main::error(sprintf(_g("uncommitted, not-ignored changes in working directory: %s"),
+		            join(" ", @files)));
+	}
+	chdir($old_cwd) ||
+		main::syserr(sprintf(_g("unable to chdir to `%s'"), $old_cwd));
+	system("bzr", "branch", $srcdir, $tardir);
+	$? && main::subprocerr("bzr branch $srcdir $tardir");
+	# Remove the working tree.
+	system("bzr", "remove-tree", $tardir);
+	# Some branch metadata files are unhelpful.
+	unlink("$tardir/.bzr/branch/branch-name",
+	       "$tardir/.bzr/branch/parent");
+	return 1;
+# Called after a tarball is unpacked, to check out the working copy.
+sub post_unpack_tar {
+	my $srcdir=shift;
+	sanity_check($srcdir);
+	my $old_cwd=getcwd();
+	chdir($srcdir) ||
+		main::syserr(sprintf(_g("unable to chdir to `%s'"), $srcdir));
+	# Reconstitute the working tree.
+	system("bzr", "checkout");
+	chdir($old_cwd) ||
+		main::syserr(sprintf(_g("unable to chdir to `%s'"), $old_cwd));
+	return 1;

Reply to: