
Re: Bootstrapping: list of 81 self-cycles in Debian Sid



On 13145 March 1977, Johannes Schauer wrote:

>> And it sounds like something that could be done using the archive's tools /
>> integrated into them. If you are interested in integrating it there properly,
>> we are in #debian-ftp on irc.debian.org and also debian-dak@lists.debian.org
> So, I now joined #debian-ftp and subscribed to debian-dak@l.d.o. What would the
> best way be to integrate the information about self cycles?

Discuss with us. :)

>> And as they are generated completely out of our archive's postgres
>> database, that one could be used too, probably not hard to change. I
>> wonder if one could "offload" a bit of the work to sql too to help.
> The code uses the Cudf representation of binary and source packages in the
> Packages.bz2 and Sources.bz2. I don't think a database can lead to any speedup
> except if it is a Cudf database which caches installation sets, closures etc...

Right. Cudf is not something I currently know. :)
And no, our database is postgres.

>> 500MB isn't really much space. And as they are mostly for the
>> Packages/Sources, it's much less for the output you generate...  That is,
>> ideally this generates just "index" files, which are then consumed by
>> something like the PTS.
> "index" files? What are they?

Whatever you choose them to be. 822, json, something that other systems
like the PTS can easily consume.
Not sure how useful "real" html would be; it can always be generated
from those files. But the best thing for such information is to provide
it to something central.
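
A minimal sketch of what such an "index" file could look like, assuming
the JSON output is a list of flat records. The field names (Source,
Version, Cycle-Via) and the helper names are made up for illustration
and are not taken from the existing code:

```python
import json


def format_value(value):
    """Join list values with ", "; turn everything else into a string."""
    if isinstance(value, (list, tuple)):
        return ", ".join(str(item) for item in value)
    return str(value)


def to_822(records):
    """Render flat dicts as 822-style stanzas separated by blank lines."""
    stanzas = []
    for record in records:
        lines = [f"{key}: {format_value(value)}" for key, value in record.items()]
        stanzas.append("\n".join(lines))
    return "\n\n".join(stanzas) + "\n"


# Hypothetical records, standing in for whatever the resolver emits.
records = json.loads(
    '[{"Source": "foo", "Version": "1.0-1", "Cycle-Via": ["libfoo-dev", "foo-doc"]},'
    ' {"Source": "bar", "Version": "2.3-4", "Cycle-Via": ["libbar-dev"]}]'
)
index_text = to_822(records)
```

Writing index_text to a file gives one stanza per source package, which
something like the PTS could read without knowing anything about the
resolver behind it.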

> The Ocaml code which does all the dependency resolution currently outputs JSON
> files which are then turned into html by a python script.

How hard would it be to have the whole thing in python?

> The quick and dirty implementation can be seen here:
> https://gitorious.org/debian-bootstrap/bootstrap/trees/master/webselfcycles

>> Also, would "incremental" runs work? Say, the database tells you which
>> packages changed recently due to uploads. Only recheck the parts affected by
>> it.  Yes, requires state storage.
> In theory I guess yes, it would be possible to make it work incrementally. But
> in practice implementing that would probably be another GSoC as dose3 can't do
> anything incremental by now. So I don't think the amount of required work would
> justify the result.

Ok.
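
For reference, the state storage part on its own is small. A rough
sketch, assuming the current source versions are available as a plain
name -> version mapping; the function name and the JSON state file are
hypothetical, and this only finds what changed, it does not make the
dependency resolution itself incremental:

```python
import json
from pathlib import Path


def changed_sources(current, state_file):
    """Compare current versions with the stored state, then update the state.

    Returns (changed, removed): names that are new or have a different
    version, and names that disappeared since the last run.
    """
    path = Path(state_file)
    previous = json.loads(path.read_text()) if path.exists() else {}
    changed = sorted(
        name for name, version in current.items() if previous.get(name) != version
    )
    removed = sorted(set(previous) - set(current))
    # Remember this run so the next one only sees newer uploads.
    path.write_text(json.dumps(current, sort_keys=True))
    return changed, removed
```

On the first run everything counts as changed; later runs return only
the packages touched by uploads in between.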


Right. So. We have to see how to best handle this. If the code *can*,
without too much work, be in python, then I think it should be fully
integrated into dak.[1] Probably as its own dak sub-command that we can
call from cron.

If that is not feasible, then we have to work so that we are just
"customers" of your code. Cloning it and running it, but hell no can we
touch Ocaml. :) And when we change stuff that would need changes in the
code too, we would keep you in the loop. (That would be database changes
affecting the information you need to get from there, though we
currently don't plan big reworks here...)

Now, once you have cloned dak.git, you might want to take a look into
setup/, which contains schema information (combined with the dak/dakdb
updates >> that schema version). Actually, the readme should let you get
your own test setup running pretty easily.
For the SQL we use to get the packages/sources information, see
dak/generate-packages-sources2.py.

[1] https://ftp-master.debian.org/git/dak.git/

-- 
bye, Joerg
You might be able to tell that to me, but not to someone who knows what they're talking about.

