
Re: Dependency-based running of collection scripts



Raphael Geissert <atomo64+debian@gmail.com> writes:
> Russ Allbery wrote:

>> Checks are done in the main Lintian process.  Collection scripts are
>> done in the background.  If we've unblocked a check, the main Lintian
>> process can just go do all unblocked checks right away.  When it
>> comes back, it can see if any collection scripts have finished and
>> therefore any more checks are unblocked.

> The idea is:
> 1.- we are running nothing, kick off the first set of scripts (in the future
> this should be the unpack scripts, and since all the checks and collection
> scripts require at least unpack level one it blocks)
> 2.- so now we can start a bunch of collection scripts, let's do that.
> 3.- one of those collection scripts is done, it made available: 1 check, 0
> collection scripts. Let's work on that check.
> 4.- We are done with that check, we come back and look at the list of
> running jobs and we see that one is done, it made available: 1 check
> script, 2 collection scripts; let's first kick off those collection scripts
> and get back to the check script when we are sure *there's something else
> running in parallel*.
> 5.- Repeat 3 and 4 over and over again.

Right, that's what I just said...?  I think.  I'm not sure that I
followed your explanation, but it sounds to me like the same thing that
I described.

>> But that's exactly what the next_nodes function in my example gives
>> you.

> Yes, but what I mean is: I don't see the point in doing, basically,
> the same thing DepMap is already doing (and is already coded.)

The point for me is that it's considerably less code, which will make it
easier to understand and maintain in the long run.
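
To give a sense of the size involved, here is a rough sketch of what a
next_nodes-style function could look like.  It is written in Python
rather than Perl purely to keep it compact, it is not the code from the
earlier message, and the node names and the shape of the dependency map
are made up for illustration.

```python
def next_nodes(deps, done, started):
    """Return the nodes that can be started now.

    deps    -- dict mapping each node to the list of nodes it needs
               (hypothetical layout, assumed for this sketch)
    done    -- set of nodes that have finished
    started -- set of nodes already started (finished or still running)
    """
    return [
        node
        for node, needs in deps.items()
        if node not in started and all(n in done for n in needs)
    ]


# Hypothetical collection scripts and checks.
deps = {
    "unpack": [],
    "coll-a": ["unpack"],
    "coll-b": ["unpack"],
    "check-x": ["coll-a"],
}
print(next_nodes(deps, done=set(), started=set()))            # ['unpack']
print(next_nodes(deps, done={"unpack"}, started={"unpack"}))  # ['coll-a', 'coll-b']
```

The caller is the only thing that tracks state (what has started and
what has finished); the function itself just filters the map.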

>>>>     # start a bunch of children storing PIDs in %children
>>>>     while (%children) {
>>>>         my $child = wait;

>>> The problem I see here is that we are again 'wait'ing, which in
>>> other words means: blocking.

>> It will only block if no children are finished, which is the one case
>> where we *do* have to block because we can't do anything else.

> But that doesn't play nice with the idea of running checks in the
> meanwhile on the main process.

Sure it does....  We only call wait if we can't proceed on the basis of
what we know is currently finished.  If we have anything else we can do,
we do that.  Otherwise, we call wait, which will only block if none of
our children have exited (which means that we *can't* do anything else
since we're still blocked).
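
A minimal sketch of that loop, again in Python rather than Perl and
only as an illustration: run_all, the deps layout, and the node names
are all assumptions, not Lintian code.  Background jobs stand in for
collection scripts and run in forked children; everything else stands
in for checks and runs in the main process.  os.wait is only reached
when nothing is unblocked.

```python
import os


def run_all(deps, background, run):
    """Run every node in deps, respecting dependencies.

    deps       -- dict: node -> list of nodes it needs (assumed layout)
    background -- set of nodes to run in child processes
    run        -- callable invoked with the node name to do the work
    Returns the order in which nodes were seen to finish.
    """
    done, started, children, order = set(), set(), {}, []
    while len(done) < len(deps):
        ready = [n for n in deps
                 if n not in started and all(d in done for d in deps[n])]
        started.update(ready)

        # Start background jobs first so they run while we do checks.
        for node in ready:
            if node in background:
                pid = os.fork()
                if pid == 0:  # child process
                    status = 0
                    try:
                        run(node)
                    except Exception:
                        status = 1
                    os._exit(status)
                children[pid] = node

        # Do every unblocked foreground job in the main process.
        progressed = False
        for node in ready:
            if node not in background:
                run(node)
                done.add(node)
                order.append(node)
                progressed = True
        if progressed:
            continue  # finishing a check may have unblocked something

        # Nothing left to do right now.  wait returns immediately if a
        # child has already exited and blocks only if none has.
        if not children:
            raise RuntimeError("nothing runnable: cycle or unknown dependency")
        pid, status = os.wait()
        node = children.pop(pid)
        if status != 0:
            raise RuntimeError("background job failed: " + node)
        done.add(node)
        order.append(node)
    return order


# Hypothetical jobs that do no real work.
deps = {
    "unpack": [],
    "coll-a": ["unpack"],
    "coll-b": ["unpack"],
    "check-x": ["coll-a"],
    "check-y": ["coll-a", "coll-b"],
}
order = run_all(deps, background={"unpack", "coll-a", "coll-b"},
                run=lambda node: None)
print(order)  # unpack is always first; the rest depends on timing
```

There is no polling anywhere: the kernel keeps track of which children
have exited, and wait hands them back one at a time.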

> Sure, IPC::Run might not be the best, but it works and its
> implementation is irrelevant to the dependencies-driven processing
> change.

We seem to be talking past each other....  IPC::Run doesn't provide the
interface that we need to do this efficiently, which is why you have a
bunch of code in your patch for polling child processes rather than just
calling wait, which does the same thing in much less code.

> By the way, could you please elaborate on the following a bit more?

>> But I'm concerned it may also be overkill and pretty complicated for
>> the problem we're solving.  I'm not sure that this is the right
>> approach, or at least I think it could be simplified a lot.

> If I remove the co-dependencies stuff, would that, in your opinion,
> make it less complicated?

Yes, but I think removing that and selectable reduces it to basically
the two functions in my previous message, unless I'm missing something.

> Only make sure that, no matter in which order nodes are added, we
> always get a consistent state. This eliminates the need for first
> gathering all the information about collections and checks and later
> carefully entering it (which is not only a waste of time, but
> complicated to achieve, since you would actually need a dependency
> resolver to enter the data in the right order).

I'm pretty sure the code in my previous message also does that.
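
As an illustration of why insertion order need not matter (a Python
sketch with made-up names, not the code from the earlier message): if
each node simply records the names it needs and nothing is resolved
until query time, nodes can be added before the things they depend on.

```python
class DepGraph:
    """Hypothetical dependency map that resolves nothing at insert time."""

    def __init__(self):
        self.needs = {}

    def add(self, name, needs=()):
        # Only names are stored, so a node may refer to nodes that have
        # not been added yet.
        self.needs[name] = frozenset(needs)

    def unblocked(self, done):
        """Nodes not yet done whose needs are all in the done set."""
        return sorted(name for name, needs in self.needs.items()
                      if name not in done and needs <= done)


# Same nodes, opposite insertion order.
forward, backward = DepGraph(), DepGraph()
entries = [("unpack", []), ("coll-a", ["unpack"]), ("check-x", ["coll-a"])]
for name, needs in entries:
    forward.add(name, needs)
for name, needs in reversed(entries):
    backward.add(name, needs)

print(forward.unblocked({"unpack"}))   # ['coll-a']
print(backward.unblocked({"unpack"}))  # ['coll-a']
```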

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>

