
Re: RFC: 3.0 (git), now with bundles



The attached updated patch adds the ability to limit the history depth
that is included in the git bundle (via --git-depth), as well as to
fully control which tags/branches/objects are included in or excluded
from the bundle (via --git-ref).
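
For anyone who wants to see how the --git-ref values end up on the git
command line, here is a rough sketch that mirrors the argument handling
in do_build() of the attached patch. The bundle_args function name is
made up for illustration; dpkg-source itself does this in Perl.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): print, one per line, the
# arguments that would follow "git bundle create <file>" for a given set
# of --git-ref values.
bundle_args() {
    if [ "$#" -eq 0 ]; then
        # No --git-ref given: bundle every branch and tag.
        printf '%s\n' --all
    else
        # Each --git-ref value is passed through unchanged.
        printf '%s\n' "$@"
    fi
    # HEAD is always included; "--" ends the revision list so that a ref
    # sharing its name with a path (e.g. "debian") is not ambiguous.
    printf '%s\n' HEAD --
}

# The two --git-ref values from the example below.
bundle_args debian mytag
```

Called with no arguments it falls back to --all, which matches the
default of bundling all branches and tags.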

A new .gitshallow file is added to the source package to record info
git needs for shallow clones.
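
Going by the patch, the .gitshallow file is simply a copy of the
"shallow" file from the temporary bare clone, and the example output
further down shows it holding one full commit ID per line. A quick
check of that shape could look like the sketch below. The
check_gitshallow name is hypothetical, and the 40-hex-digit pattern
assumes a SHA-1 repository.

```shell
#!/bin/sh
# Hypothetical sanity check (not part of the patch): succeed only if the
# file is readable and every line is a full 40-character hex commit ID.
check_gitshallow() {
    # A missing or unreadable file is treated as a failure.
    [ -r "$1" ] || return 1
    # grep -v selects lines that are NOT a commit ID, and -q makes grep
    # exit 0 when such a line exists, so the result is inverted with "!".
    ! grep -qvE '^[0-9a-f]{40}$' "$1"
}
```

Running "check_gitshallow ttyrec_1.0.8-4.gitshallow" next to the .dsc
would then return success for a file like the one in the example.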


joey@gnu:~/src>cat ttyrec/debian/source/format 
3.0 (git)
joey@gnu:~/src>cat ttyrec/debian/source/options 
git-ref=debian
git-ref=mytag
joey@gnu:~/src>dpkg-source --git-depth=3 -b ttyrec
dpkg-source: info: using options from ttyrec/debian/source/options:
--git-ref=debian --git-ref=mytag
dpkg-source: info: using source format `3.0 (git)'
dpkg-source: info: creating shallow clone with depth 3
dpkg-source: info: bundling: debian mytag
Counting objects: 38, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (26/26), done.
Writing objects: 100% (38/38), 49.49 KiB, done.
Total 38 (delta 5), reused 35 (delta 4)
dpkg-source: info: building ttyrec in ttyrec_1.0.8-4.dsc
joey@gnu:~/src>cat ttyrec_1.0.8-4.gitshallow
2387fc0d9c95c8de1b9afc60897fb3573761a766
79b1f1aff6d21b6028a3948b223d200925e56fa1
c15672934bfa53a33d4c329c641d8d3943194715
d8366ac884b62a90e62e506016c5b1796b7de3d2
joey@gnu:~/src>dpkg-source -x ttyrec_1.0.8-4.dsc 
dpkg-source: warning: extracting unsigned source package (ttyrec_1.0.8-4.dsc)
dpkg-source: info: extracting ttyrec in ttyrec-1.0.8
dpkg-source: info: cloning ttyrec_1.0.8-4.git
dpkg-source: info: setting up shallow clone
joey@gnu:~/src>cd ttyrec-1.0.8 
joey@gnu:~/src/ttyrec-1.0.8>git log --pretty=oneline        
joey@gnu:~/src/ttyrec-1.0.8>git log mytag --pretty=oneline
c54af7a7a1797e41fabfa9625db9c67617d89bd3 update
7f73eb6ce9522d99ee108dd89794198c7f095d60 update
135bef2b5363925a2b1a444970b3f109e5bee5d8 add
c15672934bfa53a33d4c329c641d8d3943194715 add
joey@gnu:~/src/ttyrec-1.0.8>git log --pretty=oneline
807f48d0db54f314d4397340467d8de7976940de update
c54af7a7a1797e41fabfa9625db9c67617d89bd3 update
7f73eb6ce9522d99ee108dd89794198c7f095d60 update
135bef2b5363925a2b1a444970b3f109e5bee5d8 add
c15672934bfa53a33d4c329c641d8d3943194715 add

Note that in this example, 4 revs of mytag were stored, because git
clone --depth always stores one more rev than requested. 5 revs of the
debian branch were stored, because it shares history with mytag. The
debian/source/options file caused all other tags and branches to be
omitted, so their history is not included.

-- 
see shy jo
From 1d1da3324366280b1cfd79bd0508377dccc3bbfb Mon Sep 17 00:00:00 2001
From: Joey Hess <joey@kitenet.net>
Date: Tue, 1 Jun 2010 16:01:35 -0400
Subject: [PATCH] modify 3.0 (git) to use git bundle

Much better than the old approach of a tarball of the .git repository,
the git bundle format is simple to understand and work with, and
doesn't need to be sanitized for security. Much code went away.

Supports limiting history depth by creating a shallow clone.
---
 man/dpkg-source.1                     |   44 +++++-
 scripts/Dpkg/Source/Package/V3/git.pm |  277 +++++++++++----------------------
 2 files changed, 131 insertions(+), 190 deletions(-)

diff --git a/man/dpkg-source.1 b/man/dpkg-source.1
index 5e4d30b..4a2c01d 100644
--- a/man/dpkg-source.1
+++ b/man/dpkg-source.1
@@ -531,13 +531,49 @@ in the current directory. At least one file must be given.
 The generated .dsc file will contain this value in its \fIFormat\fP field
 and not "3.0 (custom)".
 .
-.SS Format: 3.0 (git) and 3.0 (bzr)
-Those formats are experimental. They generate a single tarball
-containing the corresponding VCS repository.
+.SS Format: 3.0 (git)
+This format is experimental. It uses a bundle of a git repository to hold
+the source of a package.
 .PP
 .B Extracting
 .PP
-The tarball is unpacked and then the VCS is used to checkout the current
+The bundle is cloned to a new git repository.
+.PP
+Note that by default the new repository will have the same branch checked
+out that was checked out in the original source. (Typically "master", but
+it could be anything.) Any other branches will be available under
+`remotes/origin/`.
+.PP
+.B Building
+.PP
+Before going any further, some checks are done to ensure that we
+don't have any non-ignored uncommitted changes.
+.PP
+\fBgit-bundle\fP(1) is used to generate a bundle of the git repository.
+By default, all branches and tags in the repository are included in the
+bundle.
+.PP
+.B Build options
+.TP
+.BI \-\-git\-ref= ref
+Allows specifying a git ref to include in the git bundle. Its use disables
+the default behavior of including all branches and tags. May be specified
+multiple times. The \fIref\fP can be the name of a branch or tag to
+include. It may also be any parameter that can be passed to
+\fBgit-rev-list\fP(1). For example, to include only the master branch,
+use \-\-git\-ref=master. To include all tags and branches, except for the
+private branch, use \-\-git\-ref=\-\-all \-\-git\-ref=^private.
+.TP
+.BI \-\-git\-depth= number
+Creates a shallow clone with a history truncated to the specified number of
+revisions.
+.SS Format: 3.0 (bzr)
+This format is experimental. It generates a single tarball
+containing the bzr repository.
+.PP
+.B Extracting
+.PP
+The tarball is unpacked and then bzr is used to checkout the current
 branch.
 .PP
 .B Building
diff --git a/scripts/Dpkg/Source/Package/V3/git.pm b/scripts/Dpkg/Source/Package/V3/git.pm
index c8e3819..5fccc09 100644
--- a/scripts/Dpkg/Source/Package/V3/git.pm
+++ b/scripts/Dpkg/Source/Package/V3/git.pm
@@ -1,7 +1,7 @@
 #
 # git support for dpkg-source
 #
-# Copyright © 2007 Joey Hess <joeyh@debian.org>.
+# Copyright © 2007,2010 Joey Hess <joeyh@debian.org>.
 # Copyright © 2008 Frank Lichtenheld <djpig@debian.org>
 #
 # This program is free software; you can redistribute it and/or modify
@@ -22,20 +22,17 @@ package Dpkg::Source::Package::V3::git;
 use strict;
 use warnings;
 
-our $VERSION = "0.01";
+our $VERSION = "1.00";
 
 use base 'Dpkg::Source::Package';
 
-use Cwd;
+use Cwd qw(abs_path getcwd);
 use File::Basename;
-use File::Find;
 use File::Temp qw(tempdir);
 
 use Dpkg;
 use Dpkg::Gettext;
-use Dpkg::Compression;
 use Dpkg::ErrorHandling;
-use Dpkg::Source::Archive;
 use Dpkg::Exit;
 use Dpkg::Source::Functions qw(erasedir);
 
@@ -70,40 +67,21 @@ sub sanity_check {
               $srcdir);
     }
 
-    # Symlinks from .git to outside could cause unpack failures, or
-    # point to files they shouldn't, so check for and don't allow.
-    if (-l "$srcdir/.git") {
-        error(_g("%s is a symlink"), "$srcdir/.git");
-    }
-    my $abs_srcdir = Cwd::abs_path($srcdir);
-    find(sub {
-        if (-l $_) {
-            if (Cwd::abs_path(readlink($_)) !~ /^\Q$abs_srcdir\E(\/|$)/) {
-                error(_g("%s is a symlink to outside %s"),
-                      $File::Find::name, $srcdir);
-            }
-        }
-    }, "$srcdir/.git");
-
     return 1;
 }
 
-# Returns a hash of arrays of git config values.
-sub read_git_config {
-    my $file = shift;
-
-    my %ret;
-    open(GIT_CONFIG, '-|', "git", "config", "--file", $file, "--null", "-l") ||
-        subprocerr("git config");
-    local $/ = "\0";
-    while (<GIT_CONFIG>) {
-        chomp;
-        my ($key, $value) = split(/\n/, $_, 2);
-        push @{$ret{$key}}, $value;
+sub parse_cmdline_option {
+    my ($self, $opt) = @_;
+    return 1 if $self->SUPER::parse_cmdline_option($opt);
+    if ($opt =~ /^--git-ref=(.*)$/) {
+        push @{$self->{'options'}{'git-ref'}}, $1;
+        return 1;
     }
-    close(GIT_CONFIG) || syserr(_g("git config exited nonzero"));
-
-    return \%ret;
+    if ($opt =~ /^--git-depth=(\d+)$/) {
+        $self->{'options'}{'git-depth'} = $1;
+        return 1;
+    }
+    return 0;
 }
 
 sub can_build {
@@ -113,24 +91,11 @@ sub can_build {
 
 sub do_build {
     my ($self, $dir) = @_;
-    my @argv = @{$self->{'options'}{'ARGV'}};
-#TODO: warn here?
-#    my @tar_ignore = map { "--exclude=$_" } @{$self->{'options'}{'tar_ignore'}};
     my $diff_ignore_regexp = $self->{'options'}{'diff_ignore_regexp'};
 
     $dir =~ s{/+$}{}; # Strip trailing /
     my ($dirname, $updir) = fileparse($dir);
-
-    if (scalar(@argv)) {
-        usageerr(_g("-b takes only one parameter with format `%s'"),
-                 $self->{'fields'}{'Format'});
-    }
-
-    my $sourcepackage = $self->{'fields'}{'Source'};
     my $basenamerev = $self->get_basename(1);
-    my $basename = $self->get_basename();
-    my $basedirname = $basename;
-    $basedirname =~ s/_/-/;
 
     sanity_check($dir);
 
@@ -170,166 +135,106 @@ sub do_build {
 	      join(" ", @files));
     }
 
-    # git clone isn't used to copy the repo because the it might be an
-    # unclonable shallow copy.
-    chdir($old_cwd) ||
-	syserr(_g("unable to chdir to `%s'"), $old_cwd);
-
-    my $tmp = tempdir("$dirname.git.XXXXXX", DIR => $updir);
-    push @Dpkg::Exit::handlers, sub { erasedir($tmp) };
-    my $tardir = "$tmp/$dirname";
-    mkdir($tardir) ||
-	syserr(_g("cannot create directory %s"), $tardir);
-
-    system("cp", "-a", "$dir/.git", $tardir);
-    $? && subprocerr("cp -a $dir/.git $tardir");
-    chdir($tardir) ||
-	syserr(_g("unable to chdir to `%s'"), $tardir);
-
-    # TODO support for creating a shallow clone for those cases where
-    # uploading the whole repo history is not desired
-
-    # Clean up the new repo to save space.
-    # First, delete the whole reflog, which is not needed in a
-    # distributed source package.
-    system("rm", "-rf", ".git/logs");
-    $? && subprocerr("rm -rf .git/logs");
-    system("git", "gc", "--prune");
-    $? && subprocerr("git gc --prune");
-
-    # .git/gitweb is created and used by git instaweb and should not be
-    # transferwed by source package.
-    system("rm", "-rf", ".git/gitweb");
-    $? && subprocerr("rm -rf .git/gitweb");
-
-    # As an optimisation, remove the index. It will be recreated by git
-    # reset during unpack. It's probably small, but you never know, this
-    # might save a lot of space. (Also, the index file may not be
-    # portable.)
-    unlink(".git/index"); # error intentionally ignored
-
+    # If a depth was specified, need to create a shallow clone and
+    # bundle that.
+    my $tmp;
+    my $shallowfile;
+    if ($self->{'options'}{'git-depth'}) {
+    	chdir($old_cwd) ||
+		syserr(_g("unable to chdir to `%s'"), $old_cwd);
+	$tmp = tempdir("$dirname.git.XXXXXX", DIR => $updir);
+	push @Dpkg::Exit::handlers, sub { erasedir($tmp) };
+	my $clone_dir="$tmp/repo.git";
+	# file:// is needed to avoid local cloning, which does not
+	# create a shallow clone.
+    	info(_g("creating shallow clone with depth %s"),
+		$self->{'options'}{'git-depth'});
+        system("git", "clone", "--depth=".$self->{'options'}{'git-depth'},
+		"--quiet", "--bare", "file://".abs_path($dir), $clone_dir);
+   	$? && subprocerr("git clone");
+    	chdir($clone_dir) ||
+		syserr(_g("unable to chdir to `%s'"), $clone_dir);
+        $shallowfile = "$basenamerev.gitshallow";
+        system("cp", "-f", "shallow", $old_cwd."/".$shallowfile);
+   	$? && subprocerr("cp shallow");
+    }
+
+    # Create the git bundle.
+    my $bundlefile = "$basenamerev.git";
+    my @bundle_arg=$self->{'options'}{'git-ref'} ?
+    	(@{$self->{'options'}{'git-ref'}}) : "--all";
+    info(_g("bundling: %s"), join(" ", @bundle_arg));
+    system("git", "bundle", "create", $old_cwd."/".$bundlefile,
+	    @bundle_arg,
+	    "HEAD", # ensure HEAD is included no matter what
+	    "--", # avoids ambiguity error when referring to eg, a debian branch
+    );
+    $? && subprocerr("git bundle");
+    
     chdir($old_cwd) ||
 	syserr(_g("unable to chdir to `%s'"), $old_cwd);
 
-    # Create the tar file
-    my $debianfile = "$basenamerev.git.tar." . $self->{'options'}{'comp_ext'};
-    info(_g("building %s in %s"),
-	 $sourcepackage, $debianfile);
-    my $tar = Dpkg::Source::Archive->new(filename => $debianfile,
-					 compression => $self->{'options'}{'compression'},
-					 compression_level => $self->{'options'}{'comp_level'});
-    $tar->create('chdir' => $tmp);
-    $tar->add_directory($dirname);
-    $tar->finish();
-
-    erasedir($tmp);
-    pop @Dpkg::Exit::handlers;
+    if (defined $tmp) {
+	erasedir($tmp);
+	pop @Dpkg::Exit::handlers;
+    }
 
-    $self->add_file($debianfile);
+    $self->add_file($bundlefile);
+    if (defined $shallowfile) {
+    	$self->add_file($shallowfile);
+    }
 }
 
-# Called after a tarball is unpacked, to check out the working copy.
 sub do_extract {
     my ($self, $newdirectory) = @_;
     my $fields = $self->{'fields'};
 
     my $dscdir = $self->{'basedir'};
-
-    my $basename = $self->get_basename();
     my $basenamerev = $self->get_basename(1);
 
     my @files = $self->get_files();
-    if (@files > 1) {
-	error(_g("format v3.0 uses only one source file"));
-    }
-    my $tarfile = $files[0];
-    if ($tarfile !~ /^\Q$basenamerev\E\.git\.tar\.$compression_re_file_ext$/) {
-	error(_g("expected %s, got %s"),
-	      "$basenamerev.git.tar.$compression_re_file_ext", $tarfile);
-    }
-
-    erasedir($newdirectory);
-
-    # Extract main tarball
-    info(_g("unpacking %s"), $tarfile);
-    my $tar = Dpkg::Source::Archive->new(filename => "$dscdir$tarfile");
-    $tar->extract($newdirectory);
-
-    sanity_check($newdirectory);
-
-    my $old_cwd = getcwd();
-    chdir($newdirectory) ||
-	syserr(_g("unable to chdir to `%s'"), $newdirectory);
-
-    # Disable git hooks, as unpacking a source package should not
-    # involve running code.
-    foreach my $hook (glob("./.git/hooks/*")) {
-	if (-x $hook) {
-	    warning(_g("executable bit set on %s; clearing"), $hook);
-	    chmod(0666 &~ umask(), $hook) ||
-		syserr(_g("unable to change permission of `%s'"), $hook);
+    my ($bundle, $shallow);
+    foreach my $file (@files) {
+	if ($file=~/^\Q$basenamerev\E\.git$/) {
+	    if (! defined $bundle) {
+		$bundle=$file;
+	    }
+	    else {
+		error(_g("format v3.0 (git) uses only one .git file"));
+	    }
 	}
-    }
-
-    # This is a paranoia measure, since the index is not normally
-    # provided by possibly-untrusted third parties, remove it if
-    # present (git will recreate it as needed).
-    if (-e ".git/index" || -l ".git/index") {
-	unlink(".git/index") ||
-	    syserr(_g("unable to remove `%s'"), ".git/index");
-    }
-
-    # Comment out potentially probamatic or annoying stuff in
-    # .git/config.
-    my $safe_fields = qr/^(
-		core\.autocrlf			|
-		branch\..*			|
-		remote\..*			|
-		core\.repositoryformatversion	|
-		core\.filemode			|
-		core\.logallrefupdates		|
-		core\.bare
-		)$/x;
-    my %config = %{read_git_config(".git/config")};
-    foreach my $field (keys %config) {
-	if ($field =~ /$safe_fields/) {
-	    delete $config{$field};
+	elsif ($file=~/^\Q$basenamerev\E\.gitshallow$/) {
+	    if (! defined $shallow) {
+		$shallow=$file;
+	    }
+	    else {
+		error(_g("format v3.0 (git) uses only one .gitshallow file"));
+	    }
 	}
 	else {
-	    system("git", "config", "--file", ".git/config",
-		   "--unset-all", $field);
-	    $? && subprocerr("git config --file .git/config --unset-all $field");
+	    error(_g("format v3.0 (git) unknown file: %s"), $file);
 	}
     }
-    if (%config) {
-	warning(_g("modifying .git/config to comment out some settings"));
-	open(GIT_CONFIG, ">>", ".git/config") ||
-	    syserr(_g("unable to append to %s"), ".git/config");
-	print GIT_CONFIG "\n# "._g("The following setting(s) were disabled by dpkg-source").":\n";
-	foreach my $field (sort keys %config) {
-	    foreach my $value (@{$config{$field}}) {
-		if (defined($value)) {
-		    print GIT_CONFIG "# $field=$value\n";
-		} else {
-		    print GIT_CONFIG "# $field\n";
-		}
-	    }
-	}
-	close GIT_CONFIG;
+    if (! defined $bundle) {
+	error(_g("format v3.0 (git) expected %s"), "$basenamerev.git");
     }
 
-    # .git/gitweb is created and used by git instaweb and should not be
-    # transferwed by source package.
-    system("rm", "-rf", ".git/gitweb");
-    $? && subprocerr("rm -rf .git/gitweb");
+    erasedir($newdirectory);
 
-    # git checkout is used to repopulate the WC with files
-    # and recreate the index.
-    system("git", "checkout", "-f");
-    $? && subprocerr("git checkout -f");
+    # Extract git bundle.
+    info(_g("cloning %s"), $bundle);
+    system("git", "clone", "--quiet", $dscdir.$bundle, $newdirectory);
+    $? && subprocerr("git clone");
 
-    chdir($old_cwd) ||
-	syserr(_g("unable to chdir to `%s'"), $old_cwd);
+    if (defined $shallow) {
+	# Move shallow info file into place, so git does not
+	# try to follow parents of shallow refs.
+        info(_g("setting up shallow clone"));
+	system("cp", "-f", $dscdir.$shallow, "$newdirectory/.git/shallow");
+	$? && subprocerr("cp");
+    }
+
+    sanity_check($newdirectory);
 }
 
 1;
-- 
1.7.1
