Re: ROCm CI: Scheduling for experimental active again

To: Christian Kastner <ckk@debian.org>, Debian ROCm Team <debian-ai@lists.debian.org>
Subject: Re: ROCm CI: Scheduling for experimental active again
From: Cordell Bloor <cgmb@slerp.xyz>
Date: Wed, 10 Jan 2024 13:35:39 -0700
Message-id: <5c9b1154-8d2f-5e2f-2e7f-094a5fa97353@slerp.xyz>
In-reply-to: <6bc6ffcc-ad85-449e-ac52-9565be9b9a3a@debian.org>
References: <6bc6ffcc-ad85-449e-ac52-9565be9b9a3a@debian.org>

Hi Christian,

On 2024-01-06 14:22, Christian Kastner wrote:

I just released a new version of the scheduler that should now correctly
schedule tests in experimental. The scheduler was run, and new jobs have
already been submitted.

In particular, the following will now trigger tests:
   * Upload of foo_1.0~exp1 to experimental will trigger tests in an
     unstable+experimental environment, with foo pinned to foo_1.0~exp1.
     This means
     -> foo_1.0~exp1 is tested, with all dependencies from unstable
        (or experimental, if they don't yet exist in unstable)
     -> All reverse dependencies of foo_1.0~exp1 are tested with their
        versions from unstable
   * With bar_1.0 in unstable, upload of one of its dependencies (say,
     gcc-13) to experimental will trigger a test of bar_1.0 in unstable
     with the dependency pinned to experimental
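
The two rules above can be sketched roughly as follows. This is only an
illustration, not the actual scheduler code: the Job type, the
schedule_experimental_upload function, the tracked_deps mapping (tracked
package -> set of its transitive dependencies) and the gcc-13 version
string are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    package: str      # package whose tests are run
    base_suite: str   # suite providing everything that is not pinned
    pin: tuple        # (name, version) taken from experimental


def schedule_experimental_upload(name, version, tracked_deps):
    """Return the jobs triggered by an upload of name_version to experimental.

    tracked_deps maps each tracked package to the set of packages it
    depends on (assumed here to be already transitively resolved).
    """
    pin = (name, version)
    jobs = []
    # Rule 1: the uploaded package itself is tested, if we track it.
    if name in tracked_deps:
        jobs.append(Job(package=name, base_suite="unstable", pin=pin))
    # Rules 1 and 2: every tracked package depending on the upload is
    # tested at its unstable version, with the upload pinned.
    for pkg in sorted(tracked_deps):
        if pkg != name and name in tracked_deps[pkg]:
            jobs.append(Job(package=pkg, base_suite="unstable", pin=pin))
    return jobs


if __name__ == "__main__":
    deps = {"foo": {"libc6"}, "bar": {"foo", "gcc-13"}}
    for job in schedule_experimental_upload("foo", "1.0~exp1", deps):
        print(job)
    # "13~exp1" is a made-up version string for illustration.
    for job in schedule_experimental_upload("gcc-13", "13~exp1", deps):
        print(job)
```

In this sketch an upload of gcc-13 does not test gcc-13 itself, because it
is not in the tracked set; only bar is scheduled, with gcc-13 pinned.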

Curiously enough, this required a massive overhaul of the scheduler. The
scheduler was conceived as a very rough prototype, and had grown
organically to the point where implementing the above was just not
feasible due to some early design decisions.

This version also fixes a subtle bug [1] that could lead to the wrong
version of a dependency triggering a test, e.g. gcc-13 triggering for
bookworm (gcc-13 does not exist there). Oddly, it seems that this hasn't
manifested itself in an obvious way so far.

This has been going for a few days and I must say that it is an enormous
improvement. This is already starting to help in making the decisions as
to whether a package is suitable to be migrated from experimental to
unstable.

Finally, I removed tracking linux-signed-amd64 from all tests. This
triggered tests on uploads of linux-signed-amd64, but that didn't make
sense because tests would need to wait for the new kernel to become
active in the testbed, which would frequently be too late even for daily
rebuilt QEMU worker images. I'll have to think of a new process for that.


Sounds good.

There are other ways we might want to prune the triggers as well. We seem
to be using the package build dependencies as the basis to determine if
the tests are triggered, but the root of the dependency walking should
really be the binary dependencies of the autopkgtest.

For example, we're rerunning the rocsparse autopkgtest for a new
experimental version of kmod. This is because kmod is a dependency of
rocminfo and rocminfo is a dependency of hipcc. However, hipcc is only a
build dependency of rocsparse-test, not a runtime dependency. So, when
using a pre-built copy of rocsparse-test as we do in the autopkgtest,
the updated kmod cannot affect the outcome of the test, and the test is
redundant.

I could imagine that at some point we might add an autopkgtest that does
depend on hipcc, but it would be nice if the triggering of each
Test-Command could be filtered based on the Depends tree for that
command. I think it would dramatically reduce the CI utilization without
affecting the CI coverage, which will be necessary as we continue to
scale up the number of packages we're testing.
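
To make the idea concrete, here is a minimal sketch of the filtering I have
in mind. It is not how the scheduler is actually implemented: the function
names and the depends mapping are made up for illustration, "some-runtime-lib"
is a placeholder, and only the kmod -> rocminfo -> hipcc chain is taken from
the example above.

```python
def depends_closure(roots, depends):
    """Return every package reachable from roots via runtime Depends."""
    seen = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(depends.get(pkg, ()))
    return seen


def should_trigger(updated, roots, depends):
    """True if an update of `updated` can affect a test with these roots."""
    return updated in depends_closure(roots, depends)


# Runtime Depends only (hypothetical, heavily simplified data).
depends = {
    "hipcc": ["rocminfo"],
    "rocminfo": ["kmod"],
    "rocsparse-test": ["some-runtime-lib"],  # placeholder dependency
}

# Rooted at the source package's build dependencies, kmod is reachable.
build_roots = ["hipcc", "rocsparse-test"]
# Rooted at what the Test-Command actually installs, it is not.
test_roots = ["rocsparse-test"]

if __name__ == "__main__":
    print(should_trigger("kmod", build_roots, depends))
    print(should_trigger("kmod", test_roots, depends))
```

With the walk rooted at the test's own Depends, the kmod upload would no
longer schedule the rocsparse test, while an update to anything the
pre-built rocsparse-test really pulls in at runtime still would.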


Sincerely,
Cory Bloor
