
Re: Dependency-based running of collection scripts



Sorry it took me so long to reply to this.  I see that all sorts of things
went into my to-reply list right before the conference I was running last
year heated up, and then I never got back to them.

Raphael Geissert <atomo64+debian@gmail.com> writes:
> Russ Allbery wrote:
>> Raphael Geissert writes:

>>> The idea is:
>>> 1.- we are running nothing, kick off the first set of scripts (in the
>>> future this should be the unpack scripts, and since all the checks and
>>> collection scripts require at least unpack level one it blocks)
>>> 2.- so now we can start a bunch of collection scripts, let's do that.
>>> 3.- one of those collection scripts is done, it made available: 1 check,
>>> 0 collection scripts. Let's work on that check.
>>> 4.- We are done with that check, we come back and look at the list of
>>> running jobs and we see that one is done, it made available: 1 check
>>> script, 2 collection scripts; let's first kick off those collection
>>> scripts and get back to the check script when we are sure *there's
>>> something else running in parallel*.
>>> 5.- Repeat 3 and 4 over and over again.

>> Right, that's what I just said...?  I think.  I'm not sure that I
>> followed your explanation, but it sounds to me like the same thing that
>> I described.

> I tried to stress the point of starting now-selectable() collection
> scripts prior to running any check script,

Ah, yes, that's a good point.  We should run the collection scripts first
since we can background them.
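To make sure we mean the same thing, here is roughly how I picture that loop.  This is just an illustrative sketch (in Python rather than our Perl, with toy job names standing in for the real collection and check scripts and the real dependency map):

```python
import os, time

# Toy dependency map: job -> set of jobs that must finish first.
# "c:" = collection script (can run in a background child),
# "k:" = check script (runs serially in the main process).
deps = {
    "c:unpack":  set(),
    "c:objdump": {"c:unpack"},
    "c:strings": {"c:unpack"},
    "k:binary":  {"c:objdump", "c:strings"},
}

done, running = set(), {}          # running: pid -> job name

def runnable():
    return [j for j in deps
            if j not in done and j not in running.values()
            and deps[j] <= done]

while len(done) < len(deps):
    # 1. Kick off every runnable collection script first, since they
    #    can be backgrounded, before doing any serial work.
    for job in [j for j in runnable() if j.startswith("c:")]:
        pid = os.fork()
        if pid == 0:               # child: stand-in for the real script
            time.sleep(0.01)
            os._exit(0)
        running[pid] = job
    # 2. Run one check script in the main process, if any is ready.
    checks = [j for j in runnable() if j.startswith("k:")]
    if checks:
        done.add(checks[0])        # stand-in for running the check
        continue
    # 3. Nothing else to do: block in wait() until a child exits.
    if running:
        pid, _ = os.wait()
        done.add(running.pop(pid))
```

The key property is step 3: we only block in wait() when there is genuinely nothing else we could be doing.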

>>> Yes, but what I mean is: I don't see the point in doing, basically,
>>> the same thing DepMap is already doing (and is already coded).

>> The point for me is that it's considerably shorter and much less code,
>> which will make it easier to understand and maintain in the long run.

> All the code is documented and tested, which provides a working example
> of the intended use. Maintenance of the code should really be minimal.

> The select()/satisfy() approach adds the possibility of keeping partial
> states, so that if for any reason one does not proceed in the linear way
> proposed by your approach, it is possible to stop and continue later, or
> to work in parallel.

> I'm afraid merging those two states doesn't convince me.

What's making me nervous is that I think the solution is pretty complex
for what's needed for the dependency management, and I tend towards the
"minimum possible required code" approach that was popularized by the XP
methodology.

But that being said, I really don't want to block you from continuing over
that, particularly since I'm not replying very quickly.  And you're of
course right: you have already tested code that you know works.  So I
withdraw my objection here, although I still have a preference for doing
the simplest solution that will work.
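For the archives, here is my reading of the two-state idea, as a minimal sketch.  This is not DepMap's actual code; the names selectable/select/satisfy are borrowed from your description, and the rest is invented for illustration:

```python
# A node is first *selected* (started) and later *satisfied* (finished),
# so the resolver keeps a partial state and can be stopped and resumed
# while jobs are still in flight.

class DepResolver:
    def __init__(self, deps):
        self.deps = {n: set(d) for n, d in deps.items()}
        self.selected, self.satisfied = set(), set()

    def selectable(self):
        """Nodes whose prerequisites are all satisfied and that
        have not been started yet."""
        return [n for n in self.deps
                if n not in self.selected
                and self.deps[n] <= self.satisfied]

    def select(self, node):
        if node in self.selected:
            return 0              # already started: report, don't die
        self.selected.add(node)
        return 1

    def satisfy(self, node):
        self.satisfied.add(node)

r = DepResolver({"unpack": [], "objdump": ["unpack"],
                 "binary": ["objdump"]})
assert r.selectable() == ["unpack"]
r.select("unpack"); r.satisfy("unpack")
assert r.selectable() == ["objdump"]
```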

>> Sure it does....  We only call wait if we can't proceed on the basis of
>> what we know is currently finished.  If we have anything else we can
>> do, we do that.  Otherwise, we call wait, which will only block if none
>> of our children have exited (which means that we *can't* do anything
>> else since we're still blocked).

> I see what you mean. Maybe I got distracted by the fact that wait only
> applies to the forked processes, but doesn't apply to check scripts when
> I first read it. The latter would be handled somewhere else.

Well, currently we're running check scripts in the main process, so if
we're running the check script, by necessity we're not waiting.  We have
to serialize the check scripts with the current architecture.  So we'd run
all the check scripts we can, then go back and see if anything finished.
If nothing finished, then we wait until a child process exits, at which
point we go do whatever's been freed up.

One really doesn't want to poll children, since it's a busy-wait loop.
One wants to have the kernel suspend the parent process until a child
process finishes, which is what wait will do for you.

>>> Sure, IPC::Run might not be the best, but it works and its
>>> implementation is irrelevant to the dependencies-driven processing
>>> change.

>> We seem to be talking past each other....  IPC::Run doesn't provide the
>> interface that we need to do this efficiently, which is why you have a
>> bunch of code in your patch for polling child processes rather than
>> just calling wait, which does the same thing in much less code.

> No. What I was saying is that that part is specific to the
> Lintian::Command code, and its implementation should not interfere with
> the rest of the code.  That is, no matter whether we use IPC::Run or
> fork, the interface should be the same and the rest of the code should
> therefore need no change.

Abstracting this work in Lintian::Command sounds good, but it's going to
be tricky to fit that into the existing interface.  Lintian::Command
currently views the world one command at a time.  To provide an API for
this, we need something that manages multiple running processes and can
provide an additional layer that you can query for the list of ones that
finished or to wait for one to finish.
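Something along these lines, perhaps.  This is a rough sketch, not a proposed interface; JobManager, spawn, finished, and wait_any are invented names, and the real thing would live behind Lintian::Command:

```python
import os

class JobManager:
    """Manages several background commands at once and lets the
    caller ask which finished, or block until one does."""

    def __init__(self):
        self.jobs = {}                 # pid -> opaque job tag

    def spawn(self, tag, func):
        pid = os.fork()
        if pid == 0:
            func()                     # child: run the command
            os._exit(0)
        self.jobs[pid] = tag

    def finished(self):
        """Non-blocking: reap and return tags of any exited children."""
        out = []
        while self.jobs:
            pid, _ = os.waitpid(-1, os.WNOHANG)
            if pid == 0:               # children remain, none exited
                break
            out.append(self.jobs.pop(pid))
        return out

    def wait_any(self):
        """Block until one child exits; return its tag."""
        pid, _ = os.wait()
        return self.jobs.pop(pid)

m = JobManager()
m.spawn("a", lambda: None)
m.spawn("b", lambda: None)
tags = {m.wait_any(), m.wait_any()}
```

Note how the pid-to-tag map makes the lookup after wait() trivial, which is the point I make below about wanting the PID from the underlying interface.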

The only point I was making is that, underneath that implementation,
running these commands should probably use something other than IPC::Run,
since IPC::Run doesn't give you the PID and therefore means that one can't
use wait in its most natural form.  One instead has to poll after each
child exits to see which specific child it was, whereas an interface that
gave you the PID, such as a normal fork, would let you immediately look up
which child exited.

>> Yes, but I think removing that and selectable reduces it to basically
>> the two functions in my previous message, unless I'm missing something.

> Like I mentioned above, it leaves the two state functions (select,
> satisfy); since you merged them together, it looks like you consider
> them overkill. Is that right?

I do think that a separate select and satisfy is overkill in that I think
we can build this system without it, but if it's clearer to you to do it
with select and satisfy, I think that's fine too.

> Summarising:
> * Co-dependencies will be replaced by another layer between the frontend
> and the dependencies resolver (by the way, you didn't comment on my
> points; may I assume you are ok with them?)

Yup!  Particularly at this point.  :)

> * Lintian::Command needs to be refactored so that it is easier to start
> multiple jobs where pipes are not needed.

Right.

> * export() is to be removed, and its cloning functionality replaced by
> on-the-fly map regeneration
> * an initialise() method will ensure the resolver's state, eliminating
> cloning

That sounds right.

> Other changes I think should be made:
> * Remove the dependency on oneself in add()
> * Add a test that creates the map just like the frontend would and looks
> for missing and circular[2] dependencies (this would in turn require
> some refactoring to avoid code duplication between the frontend and the
> test, and would deprecate the needs-info test[1].)
> * select() should return 0 instead of dying if the given node is already
> selected

These sound good as well.

> [2] A deep circular-dependency finder could probably be implemented,
> basically trying to resolve the map until the point where %map is empty
> and %nodes is either empty (representing a fully resolvable map) or not
> (representing an unresolvable situation).

That sounds like a good idea.
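If I understand your [2] correctly, the algorithm would look something like this (my sketch, with an invented name, of the resolve-until-empty idea you describe):

```python
def find_unresolvable(deps):
    """Repeatedly resolve every node whose dependencies are already
    met; if the map stops shrinking while nodes remain, those nodes
    form (or depend on) a cycle or a missing dependency."""
    remaining = {n: set(d) for n, d in deps.items()}
    resolved = set()
    while remaining:
        ready = [n for n, d in remaining.items() if d <= resolved]
        if not ready:
            return set(remaining)   # unresolvable situation
        for n in ready:
            resolved.add(n)
            del remaining[n]
    return set()                    # fully resolvable map

assert find_unresolvable({"a": [], "b": ["a"]}) == set()
assert find_unresolvable({"a": ["b"], "b": ["a"]}) == {"a", "b"}
```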

> Comments?

Sounds great -- please go ahead, and I'm happy to review code.  It might
be easier to do the Lintian::Command refactoring separately and get that
in before doing the frontend restructuring, since we can use the
management of multiple processes to clean up other places, such as in some
of the collection scripts that currently background several things.

Incidentally, while you're working on this, if you feel inspired, Checker
should become Lintian::Checker or something similar, and as much of this
frontend code as possible should move into a library.  My very long-term
goal is to get the frontend/lintian script down to just command-line
parsing, library setup, and gluing things together, with all the core work
in libraries.  I also want to start installing Lintian's Perl libraries
into the regular Perl search path.  That way, one could embed Lintian in
any other Perl application as a module, which opens some interesting use
cases.

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>

