Re: Bootstrapping: list of 81 self-cycles in Debian Sid

To: Johannes Schauer <j.schauer@email.de>
Cc: debian-devel@lists.debian.org
Subject: Re: Bootstrapping: list of 81 self-cycles in Debian Sid
From: Joerg Jaspert <joerg@debian.org>
Date: Wed, 06 Mar 2013 22:27:00 +0100
Message-id: <87hakop6cb.fsf@gkar.ganneff.de>
Mail-followup-to: Johannes Schauer <j.schauer@email.de>, debian-devel@lists.debian.org
In-reply-to: <20130306175647.26281.72388@hoothoot> (Johannes Schauer's message of "Wed, 06 Mar 2013 18:56:47 +0100")
References: <20130305112246.25881.60404@hoothoot> <20130305124112.GN6378@type.bordeaux.inria.fr> <20130305134151.25881.28816@hoothoot> <20130305134720.GD29994@grep.be> <20130305140743.25881.12276@hoothoot> <87mwuhosrz.fsf@gkar.ganneff.de> <20130306175647.26281.72388@hoothoot>

On 13142 March 1977, Johannes Schauer wrote:

>> That obviously depends a bit on what is actually needed to run (and then on
>> talking to DSA, but they don't bite so much :) ).
> If it is found useful, then I have to figure out who to contact about
> this.

For it to live on debian.org separately, that's DSA
(debian-admin@lists.debian.org), until it goes under the umbrella of an
existing team; then that team will do that for you.

>> I guess you need a mirror (or at least packages/sources files) as input,
>> though you might want to check if you can use an existing database within
>> Debian to just use the already existing data.
> Yes, the input to the code is just a pair of Packages.bz2 and Sources.bz2
> files.

And as they are generated completely out of our archive's PostgreSQL
database, that one could be used too; it is probably not hard to change.
I wonder if one could "offload" a bit of the work to SQL too, to help.
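
For reference, whichever source is used, the data has the shape of the
Packages/Sources files: blank-line-separated stanzas of "Field: value"
lines, with continuation lines starting with whitespace. A minimal
sketch of reading such a file follows. The function names are made up
for illustration, and this is not the parser the actual tool uses.

```python
import bz2


def parse_stanzas(text):
    """Yield one dict per stanza of a Packages/Sources-style file."""
    stanza, last_key = {}, None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current stanza.
            if stanza:
                yield stanza
            stanza, last_key = {}, None
        elif line[0] in " \t" and last_key is not None:
            # Continuation line: append to the previous field.
            stanza[last_key] += "\n" + line.strip()
        else:
            key, _, value = line.partition(":")
            last_key = key.strip()
            stanza[last_key] = value.strip()
    if stanza:
        yield stanza


def read_index(path):
    """Read a bz2-compressed index file (hypothetical helper)."""
    with bz2.open(path, "rt", encoding="utf-8") as fh:
        return list(parse_stanzas(fh.read()))
```

Rows pulled from a database could be turned into the same dicts, so the
rest of the code would not need to care where the data came from.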

> For the output you see in the link above, they total to a size of 500MB. If
> they can be retrieved directly from somewhere on the same machine, then it would
> naturally save lots of space.

500MB isn't really much space. And as that is mostly for the
Packages/Sources, it's much less for the output you generate...  That is,
ideally this generates just "index" files, which are then consumed by
something like the PTS.

> RAM:
> The highest amount I observed was 260 MB of used resident memory. This value is
> that high because the build dependency graph and the strong dependency graph of
> the whole distribution has to be kept in memory at the same time.

Not much.

> CPU:
> The whole script producing the output above took 7 hours to run on a 2.5GHz
> Core i5 for all suites and all architectures (38 combinations). This is because
> generating strong dependencies for all packages in the archive takes
> 8-10 minutes with current archive sizes.  I don't think this value can
> considerably be lowered. On the other hand, the cron job doesn't have to be run
> every day but maybe once a week or once a month?

That's the only interesting part, but if one does it in the background,
once a week, properly in parallel, it shouldn't be too bad.
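
To illustrate "properly in parallel": the suite/architecture
combinations are independent of each other, so they can be handed to a
pool of worker processes. A rough sketch, assuming the per-combination
work can be wrapped in a single function; analyse() here is only a
placeholder and the suite and architecture names are examples.

```python
import concurrent.futures as cf
import itertools


def analyse(combo):
    """Placeholder for the real per-(suite, architecture) analysis."""
    suite, arch = combo
    return f"{suite}/{arch}: done"


def run_all(combos, worker=analyse, max_workers=4,
            executor_cls=cf.ProcessPoolExecutor):
    """Run `worker` on every combination and collect results by combo."""
    results = {}
    with executor_cls(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, c): c for c in combos}
        for fut in cf.as_completed(futures):
            results[futures[fut]] = fut.result()
    return results


if __name__ == "__main__":
    suites = ["stable", "testing", "unstable"]  # illustrative only
    arches = ["amd64", "i386"]                  # illustrative only
    combos = list(itertools.product(suites, arches))
    for combo, out in sorted(run_all(combos).items()):
        print(out)
```

Processes rather than threads, since the work is CPU-bound. With around
260 MB per run, the number of workers is limited by cores more than by
RAM.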

Also, would "incremental" runs work? Say, the database tells you which
packages changed recently due to uploads. Only recheck the parts
affected by it.
Yes, requires state storage.
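
Roughly what I have in mind, as a sketch only: keep a {package: version}
map from the last run, diff it against the current one, and follow
reverse dependencies from the changed packages. All names below are
invented for illustration. Whether the closure over reverse dependencies
is enough to decide what has to be rechecked for the strong dependency
graph is an assumption you would have to verify.

```python
import json
from collections import deque


def changed_packages(old, new):
    """Names added, removed, or with a different version."""
    return {n for n in old.keys() | new.keys() if old.get(n) != new.get(n)}


def affected(changed, reverse_deps):
    """Transitive closure of `changed` over {package: set(dependents)}."""
    seen, queue = set(changed), deque(changed)
    while queue:
        for dep in reverse_deps.get(queue.popleft(), ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def load_state(path):
    """Load the version map saved by the previous run."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}  # first run: every package counts as changed


def save_state(path, versions):
    """Store the current version map for the next run."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(versions, fh, sort_keys=True)
```

The stored state is small, just names and versions, so the extra disk
usage would be negligible next to the Packages/Sources files.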

And it sounds like something that could be done using the archive's
tools / integrated into them.
If you are interested in integrating it there properly, we are in
#debian-ftp on irc.debian.org and also on debian-dak@lists.debian.org.

-- 
bye, Joerg
Dad, you've done a lot of great things, but you're a very old man, and
old people are useless.
