
d-i git repo: sample conversion



We discussed switching to git at DebConf, and came up with a compromise
plan that seemed reasonable to all in attendance[1]: Split each d-i
package into its own git repository[2], but leave the manual[3] and
packages/po[4] in svn. Investigate using submodules[6] and/or mr
to tie the git repositories together.

Since then, I've been working on a sample conversion to git, using the
svn-all-fast-export tool that was used to split up the KDE and Gnome
projects' svn repositories. I finally have a good quality conversion.
Max has been quality checking my attempts, and has verified that in
this conversion:

* Every git tag contains an identical tree to the one that was
  originally tagged in svn.
* The current version of each package in git is identical to the current
  version in svn.
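
A check of this kind can be sketched roughly as follows. This is not the
script Max used; it only illustrates comparing two exported trees. The
function name and the idea of exporting each tag to a scratch directory
first are assumptions made for the example.

```shell
# Hypothetical helper: succeed only if two directory trees have identical
# file contents. Assumes svn_tree is an export of a tag from svn and
# git_tree is an export of the matching git tag, with no .svn or .git
# directories inside either one.
trees_identical() {
    svn_tree=$1
    git_tree=$2
    # diff -r walks both trees recursively; -q only reports whether files
    # differ, and the exit status is nonzero on any difference.
    diff -r -q "$svn_tree" "$git_tree" >/dev/null
}
```

For example, `trees_identical /tmp/svn-export /tmp/git-export && echo same`,
where both paths are placeholders. This compares file contents only, not
permissions.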

So the purpose of this mail is to give it some wider exposure and
explain a few of the choices and tradeoffs made in creating it. Note
that since this is a sample, it might be deleted/changed at any time;
don't try to commit to it or use it for real work.

To browse or clone individual repositories, see
<http://hydra.kitenet.net/~d-i/git/>

To check out a complete tree, which should be identical to the current
tree except for using git for most things:
	svn checkout svn://svn.debian.org/d-i/people/joeyh/gitdemo
	cd gitdemo
	mr -p checkout

To test how slow updating every repository would be, compared to svn up,
running 8 updates in parallel[5]:
  	mr -j8 -p update

----

In creating this I had to deal with lots of package renames, etc.,
in the svn history. Mostly these can be handled well; for example
base-config-prep was later renamed to sarge-support, and so the
sarge-support.git repository contains the old base-config-prep
history as well.

In a few cases, I instead used branches to store ancestors of current stuff.
For example, localechooser has branches for countrychooser and
languagechooser[8], which led directly to it; flash-kernel has branches for
glantank and nslu2-firmware-installer, which were both parents of it;
libdebian-installer has a libd-i branch for the old version before it was
renamed[7]; and debian-installer has an old-docs branch that holds
various documentation that was scattered around the svn tree historically.

Some old stuff that didn't really lead to anything current was still
separated out into its own git repositories, in the attic subdirectory,
<http://hydra.kitenet.net/~d-i/git/attic/>

----

My todo and errata list is as follows:

Convert svn:ignore props into .gitignore files.
(Will probably need to be done manually after conversion.)
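
As a rough sketch of what that manual step might look like for one
directory: svn:ignore patterns apply only to the directory the property is
set on, while a bare .gitignore pattern also matches in subdirectories, so
the sketch anchors each pattern with a leading slash. The function name is
hypothetical, and this is only one possible approach.

```shell
# Hypothetical helper: read the value of one directory's svn:ignore
# property on stdin and print equivalent .gitignore lines on stdout.
svnignore_to_gitignore() {
    # Drop blank lines, then prefix each pattern with "/" so it applies
    # only to the directory holding the .gitignore, as svn:ignore does.
    sed -e '/^[[:space:]]*$/d' -e 's|^|/|'
}

# Possible use inside an svn working copy (the path is a placeholder):
#   svn propget svn:ignore some/dir | svnignore_to_gitignore > some/dir/.gitignore
```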

Remove $Id$ keywords from all files that have them; git doesn't
support that. (Do after conversion)
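
A minimal sketch of that cleanup for a single file, assuming the keywords
appear either unexpanded ($Id$) or in svn's expanded form ($Id: ... $).
The function name is hypothetical, and the result would need review before
being committed.

```shell
# Hypothetical helper: delete $Id$ keywords, expanded or not, from one file.
strip_id_keyword() {
    file=$1
    tmp=$(mktemp) || return 1
    # Matches both "$Id$" and "$Id: name rev date author $".
    # Writing back with cat keeps the original file's permissions.
    sed -e 's/\$Id\(:[^$]*\)*\$//g' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

This removes only the keyword itself, so a comment line that held nothing
else is left behind as an empty comment.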

Currently I skip part of r59949, the creation of a debian/install file.
I had to skip it for reasons beyond my comprehension, but the file
was deleted 2 revs later, so no horrible loss of data.

/branches/d-i/kfreebsd is not included. I could add kfreebsd
branches to all affected repos, or they could finish merging to trunk
and then it wouldn't be needed.

Other whole-tree tags and branches are not included. These include sarge,
etch, and lenny tags and branches, and old tags like rc1, rc2, rc3.

The /people directory is mostly not included. There are some exceptions:
things like partman-auto-crypto, oldsys-preseed, rootskel-gtk, and
apt-setup started life in /people, and that history was included
in their git repos. And I pulled busybox out of /people. But branches in
/people are not included, and current thinking is that people will
be responsible for making equivalent branches in git for anything
in /people they care about.

I am not currently filtering out the da.po miscommits. Leaving them in
increases the total git repo size by 5 MB. If they are removed, there
are two ways to do it, which affect tags made during the problem
period in different ways:

1. Keep da.po miscommits in the tags. This bloats the git repo size
   about the same amount, and means that those tags will not be
   connected to the rest of the repo history.
2. Drop the bad da.po changes from everything, including tags. That
   would mean that 17 tags don't match what was originally committed
   to svn, but would otherwise work great.

-- 
see shy jo


[1] Even to those present who personally preferred other systems like bzr.
    bzr-git is pretty good and will be getting better.
[2] Rather than one big repository, because per-package repositories are
    much more usual in git and because this allows tags to be handled more
    sanely without needing to worry about tag namespaces. This is certainly
    a tradeoff though.
[3] Because the manual's translation system depends on svn revision numbers,
    and converting it to git would be really hard. Maybe later?
[4] To avoid needing to switch all the translators to git.
[5] This test is not very accurate because it's not connecting to
    alioth, and because it's using anonymous git://, not ssh. Still,
    here it takes about 18 seconds, compared to about 9 seconds for a full
    no-op svn up in the d-i tree. 
[6] I have serious doubts about submodules, but FWIW debian-perl is also
    investigating submodules in a similar situation.
[7] This branch was needed due to a complex limitation of
    svn-all-fast-export.
[8] But that means that I did not store the tags for countrychooser and
    languagechooser since they would conflict with other tags in the same
    git repo. I'm open to putting them in the attic instead if that
    seems better.
