
Re: RFC: (ab)using autopkgtest for benchmarking



Hello Christian,

Christian Kastner [2020-05-01 22:09 +0200]:
> I would like to initiate a discussion on -devel on the possible merits of
> packages shipping benchmark tests, specifically in the form of autopkgtests.

It's of course fine to ship them in the autopkgtest format. That lets you
re-use its machinery, and maintainers' knowledge of how to run them.

> Such tests being beyond the scope of autopkgtest, I would like to
> solicit the maintainers' feedback before doing so.
> 
> Specifically, I would propose using the following new restrictions
> (unknown to autopkgtest, and therefore skipped by default):
> 
>    benchmark-task       A typical task for which the package is used
>    benchmark-io         An I/O intensive task
>    benchmark-network    A task that requires connectivity

That's the bit that I'd strongly recommend against. Declaring these as
restrictions doesn't fit the spirit of restrictions, which is to define testbed
capabilities [1]. Something like "benchmark-io" is at the same time too generic
(what exactly does it mean or guarantee from the point of view of the runner,
which has to check for and provide this?) and too specific (what if I need to
benchmark slightly related, but not identical things, like graphics I/O
performance?).

These cannot be well-defined in autopkgtest's specification, and adding them
to your package just means that these tests will be skipped on the CI infra.
Maybe that's what you want, but it seems that most benchmark tests should at
least be able to *run* on our CI without failing. Of course the numbers they
produce there are not very useful.

If you want to go in the "Restrictions" direction to prevent running these on
CI, what comes to my mind is to generalize that to an "isolation-exclusive-hw"
restriction (bad name, I know), which means that the test can only be run on
real iron and thus avoids noisy neighbors and virtualization jitter. This can
then be used with e.g. the ssh runner's --capability option and run on
developers' infra, or maybe one day we get some CI lab in Debian for these
purposes.
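
To illustrate the difference between an unknown restriction and a
capability-style one, here is a rough Python sketch. It is a simplified model
and not autopkgtest's actual code: the KNOWN set is deliberately partial,
skip_reason() is a made-up helper, and "isolation-exclusive-hw" is only the
hypothetical name from above.

```python
# Simplified model, NOT autopkgtest's real implementation.
# Partial, illustrative set of restrictions the runner understands.
# "isolation-exclusive-hw" is hypothetical and does not exist in autopkgtest.
KNOWN = {
    "needs-root",
    "isolation-container",
    "isolation-machine",
    "isolation-exclusive-hw",
}

# Restrictions that the testbed has to advertise as a capability.
CAPABILITY_RESTRICTIONS = {
    "isolation-container",
    "isolation-machine",
    "isolation-exclusive-hw",
}


def skip_reason(restrictions, capabilities, known=KNOWN):
    """Return None if the test may run, else a short reason for skipping."""
    for name in restrictions:
        if name not in known:
            # Anything the runner does not know is skipped unconditionally.
            return f"unknown restriction: {name}"
        if name in CAPABILITY_RESTRICTIONS and name not in capabilities:
            return f"testbed lacks capability: {name}"
    return None


# An unknown label is skipped everywhere, whatever the testbed offers.
print(skip_reason(["benchmark-io"], {"isolation-machine"}))
# A capability-style restriction runs where the testbed declares it.
print(skip_reason(["isolation-exclusive-hw"], {"isolation-exclusive-hw"}))
```

With a capability-style restriction, the person running the tests decides
whether a testbed qualifies, instead of the label being skipped everywhere.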

An orthogonal approach is to use these benchmark-* labels as "Features:"
entries. Then they will run normally in CI, but if someone has some machinery
that is suited for doing networking tests, they can look for tests with the
benchmark-network feature and run them.

Martin


[1] I know that recent additions like "flaky" or "superficial" also don't match
this definition. I am also convinced that it was wrong to introduce them as
such; these should have been features, not restrictions.

