Distributed Debian Distribution Development
around three years ago i wrote an article recommending that debian
move, step by step, towards distributed peer-to-peer infrastructure,
reducing the reliance on centralised servers and potentially
allowing sponsorship funds and resources to be retargeted at other
areas that would improve the debian distribution.
mostly that was words, not actual code, and, before getting all upset
at how little progress has been made (cameron dale's fantastic apt-p2p
work being literally the only exception), i decided a few days ago to
do something about the situation, so that i wouldn't come across as
a complete sponging whining knob.
my contribution is to prove that a combination of git and bittorrent
is actually possible:
(anyone wishing to help / contribute, do contact me: i'll happily give
you access to the repository if you can spot the irony of this.)
it's turned out to be much, much simpler than i thought, by the simple
expedient of turning git commits into a "virtual filesystem" where the
git pack objects are stored as files, named by their commit reference.
hooking into the bittornado client's "file close" operation is enough
to fire off a test of whether the pack file is valid (check the "PACK"
signature at the beginning, and the SHA-1 checksum at the end), and,
if it is, to run a "git unpack-objects" operation.
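to make that "file close" hook concrete, here's a rough sketch in plain
shell (the real hook lives inside bittornado's python, and the function
name here is my own invention, not part of bittornado or git): check the
four-byte "PACK" signature at the front, check that the trailing 20 bytes
match the SHA-1 of everything before them, and only then hand the file
over to git.

```shell
# sketch only: validate a pack file received over bittorrent, then
# unpack it into the current git repository.  "unpack_if_valid" is a
# hypothetical helper name.
unpack_if_valid() {
    pack="$1"
    size=$(wc -c < "$pack")
    # a pack smaller than header (12 bytes) + trailer (20 bytes) can't be valid
    [ "$size" -ge 32 ] || return 1
    # 1. the signature at the beginning: the 4 bytes "PACK"
    [ "$(head -c 4 "$pack")" = "PACK" ] || return 1
    # 2. the SHA-1 at the end: the last 20 bytes must equal the sha1
    #    of everything that precedes them
    want=$(tail -c 20 "$pack" | od -An -v -tx1 | tr -d ' \n')
    got=$(head -c $((size - 20)) "$pack" | sha1sum | cut -d' ' -f1)
    [ "$want" = "$got" ] || return 1
    # 3. valid: let git explode the pack into loose objects
    git unpack-objects -q < "$pack"
}
```

note that this only proves the file is an intact pack; the web-of-trust
checking discussed further down would sit on top of this.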
what are the implications, and why is combining git with bittorrent a
big hairy deal?
* alioth, a single server, could run git torrent trackers, and the
load on the server would be greatly reduced. and it would no longer
be a single point of failure.
* joey hess's excellent "ikiwiki", which can be configured to use git,
could be used as the basis for a distributed wiki.
* again: ikiwiki or something similar could well be used as the basis
for a distributed bugtracker.
* mailing list archives could be stored in a git repository, and,
instead of each message being mailed to tens of thousands of users, a
slow migration towards simply... sharing the email messages via
bittorrent could be made. this would *significantly* reduce the load
on the mail servers, which i understand to be... a leeetle bit
stressed.
* with a little bit of work, the debian source archives could be
distributed as "unpacked" tarballs. and, combined with
git-buildpackage, suddenly the possibilities open up...
each and every one of these ideas - and there are going to be plenty
more that other people smarter than me can think of - would require
nothing more taxing than writing commands and tools which perform
"git push" and "git pull" operations.
the only "stumbling block" that would make this completely impractical
is the fact that, for robustness, the git transfer operations need to
be GPG/PGP digitally signed via a web of trust... and fuck me
sideways with a rubber spoon: what does debian _already_ have? one
of the largest operational GPG/PGP webs of trust of any free software
group on the planet, with a userbase which is _already_ used to the
concept of digitally signing "stuff".
think about this: if .deb archives were stored in a git repository,
creating a new stable release could become a matter of signing a git
tag with the debian keyring, doing "git push" and going "ahhhh, that
was hard work. beer, anyone?" :)
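that release step really is just a tag plus a push. a rough sketch
(the tag and remote names are made up for illustration; a real release
would use "git tag -s" to gpg-sign against the debian keyring - the
sketch uses a plain annotated tag so it runs without a gpg key):

```shell
# sketch: cutting a "stable release" as nothing more than tag + push.
# "cut_release" is a hypothetical helper; for the real thing, replace
# -a with -s so the tag is gpg-signed with a key in the debian keyring.
cut_release() {
    tag="$1"       # e.g. "stable-7.0" (invented name)
    remote="$2"    # e.g. "mirror" (invented name)
    git tag -a -m "new stable release" "$tag"
    git push "$remote" "$tag"
}
```

everything downstream of this - torrents, mirrors - would be triggered
off the arrival of that one signed tag.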
the announcement of the release would also be automatically committed
(and GPG signed!) to the relevant mailing list git repository at the
same time. both the announcement and the tag signing would
*automatically* result in the creation of updated torrents, which in
turn would result in people creating mirrors of the stable archive by
the simple expedient of their cron-job'd "git pull"s triggering at
intervals. would the debian servers and mirrors be
stressed out by this, at all? i don't think so!
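the cron-job'd mirror really is that small. a minimal sketch, with
invented paths and a hypothetical helper name:

```shell
# sketch: a mirror is nothing more than a clone plus a periodic pull.
# drop the sync into cron, e.g. (invented path):
#   0 * * * * cd /srv/mirror/debian-stable && git pull -q
mirror_sync() {
    src="$1"     # upstream repository (url or local path)
    dir="$2"     # local mirror directory
    if [ -d "$dir/.git" ]; then
        git -C "$dir" pull -q          # subsequent runs: just pull
    else
        git clone -q "$src" "$dir"     # first run: clone
    fi
}
```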
i've made the "first move" - i've written the proof-of-concept which
proves that git _can_ be successfully combined with the bittorrent
protocol: it's up to you, the debian developers, to consider whether
the concept has merit and whether it's worthwhile pursuing.
i leave you with this: the idea of the "Freedom Box" caused quite a
stir at debconf2010, but i honestly doubt that, without any experience
of getting *yourselves* off of the client-server paradigm, there will
be any success in getting *the average person* off of the
client-server paradigm. so i believe that the people who convert the
debian development process and infrastructure to peer-to-peer
distributed infrastructure *first*, creating the aggregation and
migration infrastructure for *themselves*, will be in a much stronger
position to help convert other types of internet services to a
distributed architecture. and they will likely have written a good
proportion of the necessary infrastructure along the way.
gandhi said "be the change you want to see in the world". this
translates, roughly, to "eat your own dogfood".
so - something to think about.