
Re: The annual git/svn discussion (was: Re: Minutes of the pkg-perl BoF at DebConf 10)



On 8 August 2010 14:13, Jeremiah Foster <jeremiah@jeremiahfoster.com> wrote:
> On Aug 7, 2010, at 23:35, Tim Retout wrote:
>> I assumed we would want one git repository per package.  Now, 1700 git
>> repositories turns out to be quite difficult to make perform as
>> quickly as a single svn trunk checkout.
>
> I'm not convinced of this. In fact, I'm pretty sure that any update will be a lot faster than svn, because git just transmits the delta off the bat while svn goes through the entire repo looking into every .svn file.

There are overheads in dealing with many separate repositories which
make good performance difficult to achieve - I have no doubt it is
possible with some work, but it is not straightforward.  For instance,
as I mentioned, one of the main overheads, which might not have been
obvious, was that of creating a fresh ssh connection for every
repository to be updated.

It turns out 'mr' does support parallel updates, which is good, so I
misremembered that; it was ssh connection sharing that I don't think
it supported.
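For what it's worth, ssh connection sharing can be enabled in OpenSSH itself rather than in 'mr', via connection multiplexing - a sketch, assuming the repositories live on a host like git.debian.org (the host name here is illustrative):

```
# ~/.ssh/config - reuse one ssh connection for all repositories on the host
Host git.debian.org
    ControlMaster auto
    ControlPath ~/.ssh/control-%r@%h:%p
    # ControlPersist keeps the master connection open after the first
    # command exits; it needs a recent OpenSSH (5.6 or later)
    ControlPersist 10m
```

With that in place, something like 'mr -j 5 update' (if your mr supports the -j jobs option) would run updates in parallel while all the git fetches share a single ssh connection.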

> The model of a single git repo with git submodules per package ought to make this significantly faster.

I doubt that actual git submodules can be made to work intuitively for
what we want - they are always tied to a specific commit. (I suppose
we might manage to script away the updating of the submodule status in
the main git repo?) But ansgar had a nice idea for our case, involving
the data that PET makes available about the git heads: with a bit of
scripting, we might only need to run 'git pull' on the subset of
repositories that have actually changed, which is ultimately what we
need to make this work.
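The subset idea could be sketched roughly like this - assuming (hypothetically; PET's actual export format may differ) that we can get a sorted "repo sha" list of current remote heads from PET, and that we keep a matching list of the heads we last pulled:

```shell
#!/bin/sh
# Sketch: print only the repositories whose remote head differs from the
# locally recorded head, so 'git pull' need run on just that subset.
# Both input files contain sorted "<repo> <sha>" lines; the PET-derived
# format is an assumption, not a documented interface.

repos_to_update () {
    remote_heads=$1   # "<repo> <sha>" per line, from PET (assumed)
    local_heads=$2    # "<repo> <sha>" per line, recorded locally
    # join on the repo name, then print repos where the two shas differ
    join "$remote_heads" "$local_heads" | awk '$2 != $3 { print $1 }'
}
```

One could then loop over the output, running 'git pull' in each listed repository and updating the local heads file afterwards.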

>> Caveats: I did this testing one year ago (so not with git 1.6), and I
>> found out afterwards that I was using really very slow disks (SSDs
>> rather than HDDs).  The disk speed could really affect this.
>
> I'm confused. If SSD you mean solid state drive, that should be significantly faster.

SSDs have a much faster average seek time, but the actual maximum
read/write transfer rate is generally slower than that of HDDs -
especially the larger SSD in my eeepc 1000, which is what I was trying
to use. I later found that adding a USB 2.0 HDD made everything much
faster.

-- 
Tim Retout <diocles@debian.org>

