Re: RFC: further parallelisation (dependency-based collection and check scripts)

To: debian-lint-maint@lists.debian.org
Subject: Re: RFC: further parallelisation (dependency-based collection and check scripts)
From: Russ Allbery <rra@debian.org>
Date: Sun, 27 Dec 2009 12:44:55 -0800
Message-id: <878wcono3c.fsf@windlord.stanford.edu>
In-reply-to: <hh76be$pp8$1@ger.gmane.org> (Raphael Geissert's message of "Sun, 27 Dec 2009 02:36:59 -0600")
References: <h9er7b$nse$1@ger.gmane.org> <878wcoonc6.fsf@windlord.stanford.edu> <hh76be$pp8$1@ger.gmane.org>

Raphael Geissert <geissert@debian.org> writes:
> Russ Allbery wrote:

>> Oh, yes, that's a good idea.  I don't think that Lintian::Tags should
>> poll the collect area directly, but I think it makes a lot of sense to
>> cache all the tags until the frontend (or, later, the relevant module)
>> calls file_overrides to load the overrides for that file, or file_end
>> to say that we're done with that file without processing any overrides.

> I certainly need to get more familiar with the new Lintian::Tags*
> methods, but sounds good.

When you get a chance to look it over, I'd be curious about your thoughts
on the best module architecture.  I'm struggling to figure out what makes
the most sense.  I think combining the tag, pkg_start, and pkg_end methods
in with Lintian::Output would make sense, along with the configuration for
which tags are displayed.  I'm not sure how to handle overrides.  I wonder
if we should have a Lintian::Tag::Overrides class that parses override
files and answers questions about them, such as whether a tag is
overridden, but I'm not sure where the code looking for unused overrides
should live or how it should work.
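
To make the override idea a bit more concrete, here is a rough sketch in
Python rather than Perl.  It is not Lintian code: the class name, the
method names, and the one-tag-per-line file format are all assumptions
made for illustration (real override files carry more than a tag name).
It shows one possible answer to the unused-override question, which is to
have the same object that answers "is this tag overridden?" count its own
hits.

```python
class Overrides:
    """Hypothetical stand-in for the proposed Lintian::Tag::Overrides."""

    def __init__(self, lines):
        # Map each overridden tag name to the number of times it matched.
        self.hits = {}
        for line in lines:
            line = line.strip()
            # Skip blank lines and comments.
            if not line or line.startswith("#"):
                continue
            # Assumed simplified format: the tag name is the first word.
            self.hits[line.split()[0]] = 0

    def is_overridden(self, tag):
        """Return True if the tag is overridden, recording the match."""
        if tag in self.hits:
            self.hits[tag] += 1
            return True
        return False

    def unused(self):
        """Return the overrides that never matched an issued tag."""
        return sorted(tag for tag, count in self.hits.items() if count == 0)


overrides = Overrides(["# comment", "foo-tag", "bar-tag extra"])
overrides.is_overridden("foo-tag")
print(overrides.unused())
```

With this shape, the unused-override check would live next to the
parsing, and whoever calls pkg_end would only need to ask for unused().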

>> (The logic of Lintian::Tags will get simpler when we promote changes files
>> to a first-class checkable object with its own check scripts.)

> I haven't put more thought into that, but I believe you :)

There's special code in pkg_start and pkg_end right now to not treat
changes files like regular package files, and that code can go away once
changes files are first-class objects.

>> We could rename them so that there are no naming conflicts, although I
>> don't mind tagging them explicitly.

> That'd be great as some scripts could use a more suitable name.

Certainly no problems here.  They're not visible to anything outside of
Lintian.

>> (Hm, I wonder if it would be worthwhile to have DepMap or PDepMap
>> handle that internally -- take a type as well as a name and internally
>> munge that into a unique identifier.)

> PDepMap would be the one to handle that.  My only (soft) objection is
> that the idea behind PDepMap is to let the application layer add
> whatever it wants as a property of a node, mostly to avoid extra data
> structures that hold information about a given node.

> It should also be possible to make PDepMap take an optional reference
> to a function that operates on the node properties and returns a node
> name that should be used instead (in the 'sort' spirit).  The only
> problem I see with this approach is that the application layer still
> needs to know how to refer to a given node.

Yeah, the ideal would be for the application to refer to a node in all
situations as a type/name pair, and have any other representation be
strictly internal, but that may be more work than it's worth.
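
As a rough illustration of the type/name idea, sketched in Python rather
than Perl: the class below is hypothetical and is not the DepMap or
PDepMap API.  Callers always pass a (type, name) pair, and the key that
actually identifies a node stays private to the map, so a collection
script and a check script can share a name without conflict.

```python
class TypedDepMap:
    """Hypothetical dependency map addressed by (type, name) pairs."""

    def __init__(self):
        # Internal key -> set of internal keys that node depends on.
        self._parents = {}

    @staticmethod
    def _key(kind, name):
        # The internal representation; callers never see or build this.
        return (kind, name)

    def add(self, kind, name, depends_on=()):
        """Add a node, plus any dependencies given as (type, name) pairs."""
        key = self._key(kind, name)
        self._parents.setdefault(key, set())
        for dep_kind, dep_name in depends_on:
            dep_key = self._key(dep_kind, dep_name)
            self._parents.setdefault(dep_key, set())
            self._parents[key].add(dep_key)

    def parents(self, kind, name):
        """Return the direct dependencies of a node as sorted pairs."""
        return sorted(self._parents[self._key(kind, name)])


depmap = TypedDepMap()
# "collect" and "check" are illustrative type names, "files" an
# illustrative script name used for both.
depmap.add("check", "files", depends_on=[("collect", "files")])
print(depmap.parents("check", "files"))
```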

>>> * Probably reconsider the name of Lintian::DepMap; after all, it
>>> creates dependency trees (the original name was based on the idea of
>>> supporting more complex kinds of relationships which could make a
>>> graph look more like a map than a tree).

>> I'm good either way.

> Do you have any suggestion for a new name?

How about Lintian::Order, since what it's doing is creating and
manipulating a partial order?
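
For anyone who wants the "partial order" framing spelled out, here is a
small Python sketch using the standard library's graphlib (Python 3.9 or
later).  It is not how Lintian::DepMap is implemented, and the script
names are made up.  It groups nodes into batches where everything in a
batch has all of its dependencies satisfied, which is the property that
makes running collection and check scripts in parallel possible.

```python
from graphlib import TopologicalSorter


def batches(deps):
    """Group nodes into batches whose members can run concurrently.

    deps maps each node to the set of nodes it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    result = []
    while sorter.is_active():
        # Everything returned here has all of its dependencies done.
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


# Hypothetical scripts: two collections and two checks.
deps = {
    "check-a": {"coll-x", "coll-y"},
    "check-b": {"coll-x"},
    "coll-y": {"coll-x"},
}
print(batches(deps))
# prints [['coll-x'], ['check-b', 'coll-y'], ['check-a']]
```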

>> Hm, yeah, we could do that, although it would make Lintian appear
>> slower since it would have to hold all tags until the processing of the
>> file completes.

> It could be added as an option.

Good point, and it would be fairly easy to implement as something passed
into the Lintian::Tags layer.
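
A minimal sketch of what that option could look like, in Python rather
than Perl.  The method names tag, file_overrides, and file_end come from
the discussion above; the constructor arguments and everything else are
assumptions, and the sketch deliberately ignores how overrides would be
applied when buffering is off.

```python
class Tags:
    """Hypothetical tag collector with optional per-file buffering."""

    def __init__(self, emit, buffer=False):
        self.emit = emit        # callable that outputs one tag name
        self.buffer = buffer    # hold tags until the file is finished?
        self.pending = []

    def tag(self, name):
        """Issue a tag, either immediately or into the buffer."""
        if self.buffer:
            self.pending.append(name)
        else:
            self.emit(name)

    def file_overrides(self, overridden):
        """Overrides are now known: flush, dropping overridden tags."""
        self._flush(set(overridden))

    def file_end(self):
        """Done with the file and no overrides apply: flush everything."""
        self._flush(set())

    def _flush(self, overridden):
        for name in self.pending:
            if name not in overridden:
                self.emit(name)
        self.pending = []


shown = []
tags = Tags(shown.append, buffer=True)
tags.tag("tag-one")
tags.tag("tag-two")
tags.file_overrides({"tag-one"})
print(shown)
```

With buffer=False the tags appear as soon as they are issued, so output
looks faster; with buffer=True nothing is shown until file_overrides or
file_end is called.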

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>
