
RFC: implementation of package pools

For the last month and a half, I've been working on re-implementing
dinstall and switching to package pools.  Here are the details.
Comments would be appreciated, _but_ I'm really not looking for
"Wouldn't it be nice if" or "this <small detail> is broken" type
comments.  One of the points of re-implementing dinstall is so that I
can work on it, fix bugs etc., but that's for the future; my concern
right now is with nothing other than getting it up and running in
place of the current system, as soon as possible.


How this will affect people


  For now, they shouldn't notice many changes that really matter.  You
  keep uploading to Incoming, you can still run 'dinstall -n
  yourpackage*.changes' to check if it will be installed, etc.


  Shouldn't notice any changes for the most part.  People using older
  dselect methods and/or dpkg -R might have issues... err, is this
  really a problem?


  Obviously some pretty big changes :->.  People wanting to mirror
  single architectures will obviously have to do some excluding.  This
  is trivial on just about any mirror program/script I've ever seen,
  so I don't see this as a problem.  If some enterprising person
  wants to make a nice easy HOWTO document, that would be lovely.  We
  lose the ability to mirror a single distribution but it's
  questionable whether this was trivially possible in the first place.
  I don't see a way round this that still implements true package
  pools and I think it's an acceptable trade off for the benefits that
  package pools give us.
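To give an idea of the kind of excluding a single-architecture mirror
would do, here is a minimal sketch in Python.  It is not part of
da-katie; the function name `wanted_for_arch' is made up for
illustration, and it assumes the usual <name>_<version>_<arch>.deb
naming of binary packages and that the mirror also wants source files
and "all" packages.

```python
def wanted_for_arch(filename, arch):
    """Return True if a mirror carrying only `arch` should keep this file.

    Hypothetical helper, not part of da-katie.  Assumes binary packages
    are named <name>_<version>_<arch>.deb.
    """
    if not filename.endswith(".deb"):
        # Assumption: anything that is not a .deb (e.g. .dsc, .tar.gz,
        # .diff.gz) is source and is kept by every mirror.
        return True
    stem = filename[:-len(".deb")]
    file_arch = stem.rsplit("_", 1)[-1]
    # Keep packages for the wanted architecture plus arch-independent ones.
    return file_arch in (arch, "all")
```

The same test is easy to express as exclude patterns in whatever
mirror program is in use; the point is only that the architecture can
be read off the file name, so no knowledge of the pool layout is
needed.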

The programs

The system is called `da-katie' and is currently in
ftp-master.debian.org:~troup/katie/.  Don't mind the silly names,
those will change in the binary package to something more sensible
like "da-install" (for dinstall), "da-clean" etc.  It's written in
python[1] and there are several parts to it, but the only one
interesting to random developers is 'katie' (aka "da-install"), which
is a command line and output (don't ask) compatible re-implementation
of dinstall.

How the pool is structured

The package pool will be in 


The pool is split by <component>/<first letter of source name>/<source name>/.
All source and binaries for all distributions go into the same
pool directory.  e.g.:


An exception to this rule is "lib" packages, which are split into
<component>/<first 4 letters of source name>/<source name>.  e.g.:


Obviously Packages+Sources files will still go to the dists/ tree.
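The splitting rule above can be sketched in a few lines of Python.
This is an illustration only, not the code in da-katie: the function
name `pool_subdir' and the package names in the example are made up,
and it assumes a "lib" package is one whose source name starts with
"lib".

```python
def pool_subdir(component, source):
    """Return the pool-relative directory for a source package.

    Hypothetical helper, not taken from da-katie.
    """
    if source.startswith("lib"):
        # Assumption: "lib" packages are recognised by source name prefix.
        prefix = source[:4]
    else:
        prefix = source[:1]
    return "%s/%s/%s/" % (component, prefix, source)


print(pool_subdir("main", "hello"))   # main/h/hello/
print(pool_subdir("main", "libfoo"))  # main/libf/libfoo/
```

Using four letters for "lib" packages stops everything starting with
"lib" from piling up in a single "l" directory.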

The Database

SQL database using postgresql.  See Jason's


for details.  This is an *initial minimal schema*.  From my point of
view, its raison d'etre is to get da-katie + package pools + testing
up and running.  I realise it probably doesn't contain everything that
other people might want in it.  I'd like _after da-katie is
implemented_ to get everyone with genuine interest (e.g. BTS, WWW
folks) involved in extending the schema to do what they need.


Implementation is relatively (haha) simple.  There will be no
migration or anything needed.  'da-install' will simply replace
'dinstall' and packages will start appearing in the pool instead of
the legacy dists/ tree.  For woody, an as-of-yet-unwritten tool will
migrate n Mb of data a day into the pool from the legacy dists/ tree.
Potato will stay as it is (obviously).
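
Since the migration tool doesn't exist yet, the following is only a
guess at how "n Mb a day" might be enforced, as a minimal Python
sketch.  The name `pick_daily_batch' and the (path, size) input format
are assumptions made for illustration; nothing here reflects an actual
design.

```python
def pick_daily_batch(files, budget_bytes):
    """Pick files, in order, whose combined size fits within budget_bytes.

    `files` is a list of (path, size_in_bytes) pairs.  Hypothetical
    helper; the real migration tool is unwritten.
    """
    batch = []
    used = 0
    for path, size in files:
        if used + size > budget_bytes:
            # Doesn't fit in what is left of today's budget; leave it
            # for a later run.
            continue
        batch.append(path)
        used += size
    return batch
```

Note that in this sketch a single file larger than the whole daily
budget would never be picked, so a real tool would need to handle that
case separately.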

The plan is to initially implement 'da-katie' on ftp-master, and do
non-US.debian.org later.  There will only be one database on
ftp-master containing the data for both ftp-master and non-US.


For various reasons, it would be much easier for this to wait till
post 2.2r1, so at the moment, I'm looking at a timescale of roughly
two weeks before implementing.  The code is basically complete (see
the TODO for details of show stopper bugs) and working.  I'm doing
lots of testing and apart from the inevitable minor problems[2]
everything looks good.  In any event I don't anticipate serious
problems and even horrible data-poisoning style breakage should be
recoverable as the plan is to do DB dumps before and after the daily cron


I'm only really responsible for the code in da-katie, the DB Design
work was done by Jason and AJ, based in part on the earlier work of
Drake Diedrich.

-- 

[1] Don't anyone even _bother_ to comment on this.  If you don't like
it go away and do the work of a real reimplementation in another
language and we'll talk.

[2] dinstall, despite its small size, is non-trivial and has to
handle an annoying number of unpredictable corner cases when it comes
to broken packages.