Re: RFC: further parallelisation (dependency-based collection and check scripts)

To: debian-lint-maint@lists.debian.org
Subject: Re: RFC: further parallelisation (dependency-based collection and check scripts)
From: Raphael Geissert <geissert@debian.org>
Date: Sun, 27 Dec 2009 02:36:59 -0600
Message-id: <hh76be$pp8$1@ger.gmane.org>
References: <h9er7b$nse$1@ger.gmane.org> <878wcoonc6.fsf@windlord.stanford.edu>

Russ Allbery wrote:
> Raphael Geissert writes:
>> Other things to consider:
>> * It is at the moment impossible to run the checks as soon as they could
>> be, because the overrides are usually not loaded by then. The current
>> workaround is to wait until the override file has been collected and
>> loaded, but it would be better if the Tags module knew when it is ready
>> to start printing the results and use a cache in the meanwhile[2].
> 
> Oh, yes, that's a good idea.  I don't think that Lintian::Tags should poll
> the collect area directly, but I think it makes a lot of sense to cache
> all the tags until the frontend (or, later, the relevant module) calls
> file_overrides to load the overrides for that file, or file_end to say
> that we're done with that file without processing any overrides.

I certainly need to get more familiar with the new Lintian::Tags* methods,
but that sounds good.
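
The caching behaviour described in the quoted paragraph can be sketched
roughly as follows. This is only an illustration in Python (Lintian itself
is Perl), not the Lintian::Tags implementation: the class name, the `tag`
method, and the `printed` list are hypothetical, and only the names
file_overrides and file_end come from the discussion above.

```python
from collections import defaultdict


class TagBuffer:
    """Hypothetical buffer: hold tags per file until overrides are known."""

    def __init__(self):
        self._pending = defaultdict(list)  # file -> tags queued, not yet printed
        self._overrides = {}               # file -> set of overridden tag names
        self.printed = []                  # stands in for real output

    def tag(self, file, name):
        # Checks may run before the overrides are loaded, so queue first.
        self._pending[file].append(name)
        # If the overrides for this file are already known, print right away.
        if file in self._overrides:
            self._flush(file)

    def file_overrides(self, file, overridden):
        # The overrides are now loaded: release everything queued so far.
        self._overrides[file] = set(overridden)
        self._flush(file)

    def file_end(self, file):
        # Done with the file; if no overrides were loaded, none apply.
        self._overrides.setdefault(file, set())
        self._flush(file)

    def _flush(self, file):
        for name in self._pending.pop(file, []):
            if name not in self._overrides[file]:
                self.printed.append((file, name))
```

In this sketch the buffer never looks at the collect area itself; it only
reacts to the two calls made by the frontend, which is the separation
suggested in the quoted text.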

> 
> (The logic of Lintian::Tags will get simpler when we promote changes files
> to a first-class checkable object with its own check scripts.)

I haven't put more thought into that, but I believe you :)

> 
>> * I'm not entirely happy with the need to prefix the collection and
>> checks before they are added to the dependency tree, but that's needed
>> because of the name conflicts between them.
> 
> We could rename them so that there are no naming conflicts, although I
> don't mind tagging them explicitly.

That'd be great as some scripts could use a more suitable name.

> (Hm, I wonder if it would be 
> worthwhile to have DepMap or PDepMap handle that internally -- take a type
> as well as a name and internally munge that into a unique identifier.)

PDepMap would be the one to handle that.
My only (soft) objection is that the idea behind PDepMap is to let the
application layer attach anything it wants as a property of a node, mostly
to avoid extra data structures that hold information about a given node.

It should also be possible to make PDepMap take an optional reference to a
function that operates on the node properties and returns the node name
to use instead (in the spirit of 'sort').  The only problem I see with
this approach is that the application layer still needs to know how to
refer to a given node.
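
A rough sketch of that optional key-function idea, written in Python for
brevity rather than Perl. Everything here is hypothetical: the class
`PropertyDepMap`, its methods, the `typed_key` helper, and the "type" and
"name" property keys are made up for illustration and are not the PDepMap
API. Returning the computed identifier from `add` is one possible way to
let the application layer know how to refer to the node.

```python
class PropertyDepMap:
    """Toy dependency map whose node ids are derived from node properties."""

    def __init__(self, key=None):
        # key: optional callable(properties) -> node id, in the spirit of
        # the comparison function passed to 'sort'. Default: the bare name.
        self._key = key or (lambda props: props["name"])
        self._nodes = {}  # node id -> properties
        self._deps = {}   # node id -> set of node ids it depends on

    def add(self, props, depends_on=()):
        node_id = self._key(props)
        if node_id in self._nodes:
            raise ValueError(f"duplicate node: {node_id}")
        self._nodes[node_id] = props
        self._deps[node_id] = set(depends_on)
        return node_id  # the caller keeps this to refer to the node later

    def selectable(self, done=()):
        # Nodes that are not done yet and whose dependencies are all done.
        done = set(done)
        return sorted(n for n, deps in self._deps.items()
                      if n not in done and deps <= done)


def typed_key(props):
    # Munge type and name into a unique identifier, e.g. "coll:scripts".
    return f"{props['type']}:{props['name']}"
```

With `typed_key`, a collection and a check that share a name (the name
"scripts" below is only an example) no longer collide:

```python
dm = PropertyDepMap(key=typed_key)
coll = dm.add({"type": "coll", "name": "scripts"})
check = dm.add({"type": "check", "name": "scripts"}, depends_on=[coll])
```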

> 
>> * Probably reconsider the name of Lintian::DepMap; after all, it creates
>> dependency trees (the original name was based on the idea of
>> supporting more complex kinds of relationships which could make a graph
>> look more like a map than a tree).
> 
> I'm good either way.

Do you have any suggestion for a new name?

> 
>> [2] This cache could even be used to store the results of the whole
>> package check so that the tags could be printed in order of severity, as
>> some people have suggested (at least on IRC).
> 
> Hm, yeah, we could do that, although it would make Lintian appear slower
> since it would have to hold all tags until the processing of the file
> completes.

It could be added as an option.
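
As an illustration of how such an option might behave, here is a small
Python sketch (not Lintian code). The function `emit`, the `by_severity`
flag, and the ranking of the one-letter codes are assumptions made for the
example.

```python
# Assumed ranking for the example: errors first, then warnings, then info.
SEVERITY_ORDER = {"E": 0, "W": 1, "I": 2}


def emit(tags, by_severity=False):
    """Return (code, tag) pairs in the order they would be printed."""
    if by_severity:
        # sorted() is stable, so tags of equal severity keep the order in
        # which the checks produced them.
        return sorted(tags, key=lambda t: SEVERITY_ORDER[t[0]])
    # Default: keep the order of emission.
    return list(tags)
```

The sorted mode needs the complete list before anything can be printed,
which is the apparent slowdown mentioned in the quoted text; the default
mode has no such requirement.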

> 
>> Feedback is very much welcome.
> 
> The code looks basically good to me after a once-over.  I think it's
> certainly good enough to commit and we can sort out other things later,
> although it would be nice to fix the Lintian::Command bits first.

Definitely. Lintian::Command is the biggest blocker here.

> 
> A few minor stylistic notes: Lintian currently fairly uniformly uses
> underscores_between_words rather than studlyCaps for methods, and I'd like
> to stick with that (since it's also the perlstyle recommendation).

Okay.

> And 
> Lintian::PDepMap, since it's a subclass, should probably use the subclass
> namespace (Lintian::DepMap::Properties or something like that, of course
> with changes if you rename DepMap to something else).
> 

Indeed.

Thanks for taking the time to review all these changes.

Cheers,
-- 
Raphael Geissert - Debian Developer
www.debian.org - get.debian.net
