
Re: Any volunteers for lintian co-maintenance?



Niels,

On Friday, May 10, 2024 3:18:29 AM MST Niels Thykier wrote:
> Soren Stoutner:
> > I would like to respectfully disagree with some of the opinions expressed
> > in this email.
> Hi Soren
> 
> Not sure if we disagree all that much to be honest. :)

Yes, I think we do agree.

From a performance perspective, I see two big problems.

1.  Lintian runs after a potentially long build process.

2.  Lintian takes a long time to run itself.

You have done a really good job of describing point 1 above, as well as 
proposing ways to address it.  I endorse everything you have said.

For point 2, it seems the easiest way to make a significant difference would be 
for lintian to run multi-threaded.

My current development CPU has 8 physical cores hyper-threaded, which present 
to the OS as 16 logical cores.  Most of the build process is multi-threaded 
and uses all the cores to their maximum potential simultaneously.  But lintian 
is single-threaded, so it only uses one core and the other 15 sit idle.  There 
might be some lintian tests that depend on the output of other lintian tests, 
but I would imagine that most of them could be run in parallel with the 
results combined at the end.
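
To illustrate the shape of that idea, here is a minimal sketch in Python rather 
than Perl.  The check functions and `run_checks` are hypothetical names made up 
for this example and do not correspond to anything in lintian.  It uses a 
thread pool only to show the fan-out and combine structure; CPU-bound checks in 
Python would need a process pool to actually occupy several cores.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for independent checks. Real lintian checks are
# Perl modules and are not structured like this.
def check_changelog(package):
    return [f"{package}: example-changelog-tag"]


def check_copyright(package):
    return [f"{package}: example-copyright-tag"]


def check_binaries(package):
    return []  # this check found nothing to report


def run_checks(package, checks, workers=16):
    """Run every check concurrently, then merge the findings."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_check = pool.map(lambda check: check(package), checks)
        # Combine only once every check has delivered its result.
        return sorted(tag for tags in per_check for tag in tags)


if __name__ == "__main__":
    checks = [check_changelog, check_copyright, check_binaries]
    for tag in run_checks("demo", checks):
        print(tag)
```

Checks that depend on the output of other checks would have to be ordered 
before being handed to the pool, which this sketch does not attempt.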

I don’t know enough Perl to know how easy it would be to run lintian in a 
multi-threaded manner, but if this was not a difficult change it would speed up 
lintian runs dramatically.  In the case of qtwebengine-opensource-src on my 
hardware, assuming that all cores could be efficiently utilized and there are no 
other bottlenecks in RAM or disk access, it would drop lintian’s runtime from 
about 30 minutes to about 2 minutes.
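
As a rough check on that estimate, the arithmetic can be sketched as follows.  
The 30 minute and 16 core figures are the ones above.  The 90% parallel 
fraction in the second calculation is purely an assumption, included to show 
how quickly a serial portion erodes the ideal figure (Amdahl's law); it is not 
a measurement of lintian.

```python
def ideal_runtime(serial_minutes, cores):
    """Runtime if the work divides perfectly across all cores."""
    return serial_minutes / cores


def amdahl_runtime(serial_minutes, cores, parallel_fraction):
    """Runtime when only part of the work can be spread over the cores."""
    serial_part = serial_minutes * (1 - parallel_fraction)
    parallel_part = serial_minutes * parallel_fraction / cores
    return serial_part + parallel_part


if __name__ == "__main__":
    # Perfect scaling: 30 / 16 = 1.875 minutes, i.e. "about 2 minutes".
    print(ideal_runtime(30, 16))
    # Assumed 90% parallel fraction: roughly 4.7 minutes.
    print(amdahl_runtime(30, 16, 0.9))
```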

> > First, I should say that I am painfully aware of how long it takes to run
> > lintian on large packages.  When working on qtwebengine-opensource-src it
> > takes my system (Ryzen 7 5700G) about 2 hours to build the package and
> > about half an hour to run lintian against it. I would be completely in
> > favor of any efforts that could be made in the direction of making lintian
> > more efficient, either within lintian itself or in other packages that
> > replicate some or all of lintian’s functionality in more efficient ways.
> > 
> > However, I personally find lintian to be one of the most helpful tools in
> > Debian packaging. When going through the application process I found
> > lintian to be a very useful tool in helping me learn how to produce
> > packages that conform to Debian’s standards.  The integration of lintian
> > into mentors.debian.net was very helpful to me when I first started
> > submitting packages to Debian, and it is still helpful to me when
> > reviewing other people’s packages that have been submitted to
> > mentors.debian.net.
> 
> I agree that lintian has useful features as stated in my original email.
> Though not with a very strong emphasis, so I can see how you might
> not have given that remark much thought.
> 
> After a bit more reflection, I feel lintian is currently working in
> three different areas (to simplify matters a lot).
> 
>   1) Support on Debian packaging files.
>      - You have a comma in `Architecture`, which is space separated
>      - The `foo` license in `d/copyright` is not defined
>      - The order of the `Files` stanzas is probably wrong.
>      - The `Files` stanza in `d/copyright` references `foo` but that file
>        is not in the unpacked source tree.
> 
>      => This should *not* require an assembled package to get these
>         results and should provide (near) instant feedback directly
>         in your editor. This area should be designed around interactivity
>         and low latency as a consequence.
> 
>   2) Checking of upstream source.
>      - Missing source checks
>      - Source files with known questionable licenses
>      - Here are some dependencies that might need to be packaged.
>      - The upstream build system seems to be `waf` so you should be
>        aware of Debian's stance on `waf`, etc.
>      - Maybe: "Advice for how to approach this kind of package".
>        (like "This seems like a python package; consider looking at $TOOL
>        for an initial debianization. The python packaging team might be
>        relevant for you if you are a new maintainer", etc.)
> 
>      => This should *not* require an assembled package to get these
>         results. However, it will take some time to chew through all
>         of this. It would be a "before initial packaging" and maybe
>         on major upstream releases (or NEW checks).  It will be a batch
>         process but maybe with support for interactivity.
> 
> 
>   3) Checking of assembled artifacts.
>      - Does the package place the systemd service in the right place?
>      - There is a trigger for shared libraries but no shared libraries.
>        (etc.)
> 
>      => This (by definition) is for assembled packages. It will be a
>         batch process.
> 
> 
> Part 1) is something I feel would belong in a tool that provides on-line
> / in-editor support (see my postscript for details). This is partly why I
> expanded `debputy` into this field. You having a 2½ hour feedback
> loop here is crazy - the `acl2` one having 9+ hours is complete madness.
> 
> Part 2) is ideally something you would run before attempting to package
> a new upstream source tree. Many of these things have a high impact on
> whether you want to continue with the packaging (oh, I need to package
> 20 dependencies first. It has non-free content, etc.). The fact that you
> need to build a package only to discover that your package cannot be
> distributed seems backwards to me. I feel this workflow should work from
> the basis of:
> 
>    $ git clone $UPSTREAM source-dir # (or tar xf ...)
>    $ check-upstream-code source-dir
> 
> Note: This is not an area I am going to tackle. But if I was going into
> it, that would be my "vision" for the starting point.
> 
> Part 3) is where I feel lintian still has an area to be in (which also
> matches its mission statement). It could also include a subset of the
> results from part 1+2 as an "all-in-one-inclusive" wrapping to simplify
> archive-wide QA or sponsoring checks. But as I see it, most
> (non-sponsor) users would ideally get their 1) and 2) feedback from the
> more specialized tools.
> 
> These are the ballparks I would split lintian into given infinite
> developer time and resources. Ideally not a lot "smaller" than this, to
> avoid drowning people in the "Run these 1000 tools" problem and a
> repeat of `check-all-the-things`. This is also why I am not against
> lintian aggregating from the other areas. For some of its users (such as
> sponsors) it will be a useful feature that they can just run one tool
> and get the relevant results.
> 
> > As I type this email I am building an update to
> > qtwebengine-opensource-src.  So far, lintian has caught two problems with
> > this release that I would have otherwise missed.  I admit that I am fairly
> > new as a Debian Developer, and perhaps as I gain greater experience I would
> > get to the point where lintian never catches things I miss.  But I don’t
> > personally expect that to ever
> > happen, because there are too many corner cases or opportunities for typos
> > that computers are much better at catching than humans.
> 
> Even with my years of experience I make mistakes that Lintian catches to
> this day. As an example, when I did `debputy`, I had mistakes in
> `d/copyright` with not having the "full text" of the `GPL-2+`. This went
> through NEW so either the FTP masters missed it too or they went "It's
> fine, it is a native package with one contributor and everyone knows
> what GPL-2+ means". Though, my key grief here is that this kind of
> problem should (as said) never come with a 2½ hour feedback cycle.
> 
> Or even a 9+ hour one for the acl2 package ...
> 
> To be precise, this kind of feedback belongs in the millisecond to
> second range (seconds for spellchecking of changelogs, exceptionally
> long deb822 files, etc.) in my view.
> 
> > [...]
> > 
> > I must admit that I have been sorely tempted to get involved with
> > maintaining lintian because I feel it is so important.  So far, I have
> > resisted that temptation because I am already involved in a decade-long
> > effort to clean up Qt WebEngine in Debian and get it to the point where it
> > has proper security support.  I haven’t wanted to spread myself too thin
> > and end up accomplishing nothing because I tried to do too much.  But if
> > lintian’s need increases or if my existing commitments decrease I would be
> > happy to find myself involved with lintian maintenance.
> > 
> > Soren
> > 
> > [...]
> 
> If lintian is important to you, I strongly recommend that you do put
> *some* of your volunteer time into it. I have had some 5+ years of
> lintian maintenance from around 2011 onward. At its peak, we were at
> most two people actively contributing to lintian and that was not enough
> to keep my motivation going.
> 
> Especially with lintian being in the state it is right now, I think we
> would need several people to stabilize it.
> 
> For me, most of the problems I have are better solved with near instant
> feedback and I do not see lintian ever "getting there". The
> architectural design of lintian has it locked into "high latency
> diagnostics-only feedback with no quick fixes" to reduce it into a
> "one-liner". In my book that should not be a newcomer facing tool.
> Accordingly, I am investing my volunteer time on a different approach to
> scratch my itch.
> 
> Best regards,
> Niels
> 
> 
> -- PS on better newcomer tooling support --
> 
> I also feel we have failed to look beyond linting for assisting
> newcomers. Batch linting is our go-to tool for all problems, but clever
> use of on-line documentation and completion support helps newcomers as
> well.  My current example problem is the synopsis part of the
> `Description` field. I recall that as being something many people
> struggled with when I was new to Debian. The lintian tool can spot some
> common cases and go "You probably want to try again", which is not great
> but better than nothing.
> 
> For comparison, in the on-line hover docs feature of `debputy`, I
> special cased the hover docs for the synopsis to provide contextualized
> help. Like "This is how your Synopsis would appear in search results
> from `apt search` or `apt-cache search`". It also takes the synopsis and
> inserts it into the sentence below:
> 
>   This package provides [a|an|the] <SYNOPSIS>.
> 
> That was the test sentence I learned to use for writing the package
> synopsis when I started contributing. This feature is available to the
> contributor regardless of whether there is a mistake, which enables them
> to refine their synopsis at all times.
> 
> That is obviously not the kind of tool that lintian is, but that is what
> I feel we should provide as "first line" support tools for newcomers.


-- 
Soren Stoutner
soren@debian.org
