
Bug#1092155: release.debian.org: Proposal for arch-qualification rule: Limit max build time to X hours



Santiago Vila:
On 5/1/25 at 14:04, Niels Thykier wrote:
Package: release.debian.org
Severity: wishlist
X-Debbugs-Cc: niels@thykier.net

Hi,

I just filed a bug about the build times of riscv possibly being too slow (#1092153). Taking a more general view, I think it would make sense for Debian to require that build times on buildds stay below an upper limit for an architecture to qualify. My concrete proposal would be something like 48 hours.

I have not validated whether all architectures can meet the proposed 48-hour limit, so take that proposal with a grain of salt. Nevertheless, this metric should be fairly easy to extract automatically from the buildd/wanna-build database.

As for the rationale, slow build times beyond 48 hours affect Debian in ways such as:

  * Delays deployment of security fixes in all suites.

  * Delays testing migration (and thereby RC bug fixes) for packages with
    autopkgtests. After 5 days of build-time, even packages without
    autopkgtests get delayed.

If I were a user of a "slow" architecture, I would much prefer to have delayed security fixes than to have no security fixes at all.


Indeed, this is a trade-off.

Framing my request in a different way, we have numerous explicit and implicit criteria (or expectations) for a release architecture. Currently, the expectations for build times are not explicit. However, there are implicit expectations for them.

To expand a bit on my testing migration delay example from above:

   The testing migration bounty feature (currently, autopkgtests) is
   built around motivating maintainers to raise the QA of the package
   in exchange for faster migration times. However, if you cannot get
   faster migration times because the build takes 5 days, then the
   incentive is lost. So here, there is an implication of the build
   taking at most 2 days for a full benefit with degrading returns
   up to 5 days where we finally get 0 benefit.
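
As a rough illustration of the arithmetic in the paragraph above, here is a minimal sketch. It assumes a simplified model in which a package migrates once it has both reached the required age and been built on the slowest architecture. The 2-day and 5-day figures come from the text; the function names and the model itself are assumptions for illustration, not how the actual migration software computes anything.

```python
# Simplified model: a package can migrate once it is old enough AND
# its slowest build has finished. All names here are hypothetical.
DEFAULT_AGE_DAYS = 5.0  # default migration delay mentioned in this mail
BOUNTY_AGE_DAYS = 2.0   # reduced delay implied for passing autopkgtests


def migration_wait(build_days, required_age_days):
    """Days until migration: whichever of the build or the age takes longer."""
    return max(build_days, required_age_days)


def bounty_benefit(build_days):
    """Days saved by the bounty for a given build time on the slowest arch."""
    without_bounty = migration_wait(build_days, DEFAULT_AGE_DAYS)
    with_bounty = migration_wait(build_days, BOUNTY_AGE_DAYS)
    return without_bounty - with_bounty


if __name__ == "__main__":
    for days in (1, 2, 3, 4, 5, 7):
        print(f"build takes {days} days -> bounty saves {bounty_benefit(days)} days")
```

Under these assumptions the saving is the full 3 days for builds of up to 2 days, shrinks linearly between 2 and 5 days, and is zero from 5 days onwards.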

To me, it is clear that slow build times have a negative impact on the project in various ways. What I was hoping to discuss here is not a boolean "Do we accept slow build times?" but rather "How slow can build times be in general before it becomes a problem?"

No matter what range/limit we pick, there will be a negative impact for some. However, at least we would have an established baseline, so we would not have to have the discussion about whether X is too slow each time.

We have had past discussions about the speed of various architectures. I believe mips(el) and armel have been common sources of frustration due to their slowness for certain packages. So this is by no means a new problem.

(For the arm case, it was the arm architecture without a floating point unit. I think that was armel, but I honestly do not remember.)

Regarding testing migration, remember that we started with 10 days.


I feel it is a stretch to use 10 days as a baseline for this discussion. First, it is more than 11 years since 5 days became the default (the "medium" urgency being the default):

  https://lists.debian.org/debian-devel-announce/2013/11/msg00007.html

Secondly, we have started to use even faster migrations as a motivation for improving certain types of QA (as mentioned above), where we get diminishing returns past the 2nd day mark.

Therefore, in my view, the discussion baseline should be in the area of 2-5 days.

 > I don't think that waiting 7 days for a migration in some exceptional cases is a big problem.


I agree that exceptional cases are exceptional and should not count.

Let's take my recent riscv-related bug #1092153.

 * The 13-day gcc-14 build was exceptional. The maintainer had disabled
   parallel builds to debug something and we only had a few builds
   with this problem. (Concretely, one. I am using "a few" here to
   signal that a few such builds might be OK.)

 * However, the average build time for gcc-14 seems to consistently
   hit the 5-7 day range with a few occasional outliers for successful
   builds. Therefore, I do not feel that it is exceptional in its
   current state.

With that in mind, I would say that riscv currently has a 5-7 day build time for its worst package and that this is *not* an exceptional case.

Maybe the criterion needs a bit more nuance, with packages like gcc-X and llvm-toolchain-X being allowed a somewhat longer build time. They are not common targets for security bugs, and the RT already has special rules for them (such as toolchain freezes, etc.). In that sense, we might consider those packages "exceptional", where we are willing to accept slower build times.

But currently, my focus is on the general baseline, and that can be limited to "the average package". I think finding a set of exceptional packages, and what their concrete baseline should be, can be left until after we agree on the general baseline and start looking into the implementation phase. As in, I would be fine phrasing this discussion as:

  What is the max build time for an "average package" on a given
  architecture before either the package or the porters need to
  investigate the problem?

  With "average package", we mean any package that the RT has not
  given a special exemption for a higher build time, based on any
  criteria they have chosen, so that this limit is implementable in
  practice for the general case without pushing an unreasonable burden
  onto porters and package maintainers.
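
To make the phrasing above concrete, a check along these lines could be sketched as follows. This is only an illustration under stated assumptions: the input is a plain mapping from package name to successful build durations in hours (how that would be exported from the buildd/wanna-build database is not covered here), the 48-hour limit is the value proposed in this bug rather than an agreed one, the exemption set is hypothetical, and using the median to ignore one-off outliers is a choice made for this sketch. The sample numbers are made up.

```python
from statistics import median

LIMIT_HOURS = 48.0  # limit proposed in this bug; not an agreed value


def packages_over_limit(build_hours, limit_hours=LIMIT_HOURS, exempt=frozenset()):
    """Return {package: typical build hours} for non-exempt packages over the limit.

    build_hours maps a package name to a list of successful build
    durations in hours on one architecture (hypothetical input format).
    The median is used so that a single exceptional build does not
    flag an otherwise fast package.
    """
    flagged = {}
    for package, durations in build_hours.items():
        if package in exempt or not durations:
            continue
        typical = median(durations)
        if typical > limit_hours:
            flagged[package] = typical
    return flagged


if __name__ == "__main__":
    # Made-up sample data for one architecture, in hours.
    sample = {
        "gcc-14": [130, 150, 312],
        "pkg-a": [3, 4, 60],
        "pkg-b": [50, 55, 70],
    }
    print(packages_over_limit(sample, exempt={"gcc-14"}))
```

With the sample data, only the hypothetical "pkg-b" would be flagged: "pkg-a" has one slow outlier but a low median, and "gcc-14" is skipped because it is on the exemption list.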



For the normal "keeping up" cases, I assume that slow architectures should have
more autobuilders available to compensate for their slowness.

So, I hope we can think of other solutions before dropping an architecture for being "slow".


Again, my goal was to make the expectation explicit and establish it as an architecture qualification criterion.

Historically, existing architectures have been given leeway to "catch up" when an architecture qualification criterion changes. I see no reason why this proposal would be any different from that de facto process.

Relatedly, if I had still been a member of the RT, the update that the riscv maintainers gave on #1092153 would have been sufficient for me to support a temporary exemption for riscv from this rule for Trixie.

(Bcc:Niels, please reply to the bug address).

Thanks.


Thanks for the (B)CC. I am subscribed neither to the bug nor to the list, so that was much appreciated.

Best regards,
Niels


