Re: architecture-specific release criteria - requalification needed

On Thu, Sep 22, 2005 at 02:15:57PM +0200, Ingo Juergensmann wrote:

> > > I still believe this definition is far too strict (without being precise).
> > > You can't say that a port has to be 98% up to date without saying what
> > > you understand by "being up to date". As already outlined during the last
> > > discussion: when all m68k buildds are building packages, that can easily be
> > > more than 110 packages marked as building and therefore missing as installed
> > > (given a total of 5500 packages).
> > Once again, what I'm hearing here is a plea for latitude towards certain
> > ports because they're slow, instead of an acknowledgement that being slow
> > (and therefore, failing to keep up) is what causes extra work for the
> > release team.

> No. Looking at s390, which is not supposed to be a slow arch, there are
> currently 90 packages in Needs-Build, 116 in Building, 6 Failed, 39 Dep-Wait,
> 1 Failed-Removed and 28 Not-For-Us. So, a total of 251 (+29) packages marked
> as not Installed. That's >2% as well.

Sure it is.  Do you think it matters *how* the arch falls below 98%?  The
impact on the release team is the same, whether it's due to the architecture
being slow, or due to unprocessed build failures and unsigned .changes files
(s390 seems to actually have plenty of both of these right now).

> That's why I asked how this number will be obtained. 

Grab the total number of source packages; exclude those packages which
should be excluded for the architecture (ideally, this can be done using
P-a-s only); count how many of these packages don't have binary packages
matching the current source version in unstable.  Once this has been done
for all archs and we have some historical data, calibrate our cutoff.
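
As a rough illustration of that count (a sketch only, not the release
team's actual tooling): the function name, the dictionary inputs and the
sample package names and versions below are all made up for the example,
and in practice the exclusion set would be derived from P-a-s.

```python
def uptodate_fraction(source_versions, binary_versions, excluded):
    """Fraction of applicable source packages whose binaries on one
    architecture match the current source version in unstable.

    source_versions: {source package: current version in unstable}
    binary_versions: {source package: version the arch's binaries were built from}
    excluded:        set of source packages not meant for this arch
    (All three inputs are hypothetical stand-ins for real archive data.)
    """
    # Drop packages that should not be built on this architecture.
    candidates = {pkg: ver for pkg, ver in source_versions.items()
                  if pkg not in excluded}
    if not candidates:
        return 1.0
    # A package is out of date if it has no binaries at all, or if its
    # binaries were built from a different source version.
    out_of_date = sum(1 for pkg, ver in candidates.items()
                      if binary_versions.get(pkg) != ver)
    return 1 - out_of_date / len(candidates)


# Made-up sample data: "d" is excluded, "b" is stale, "c" was never built.
sources = {"a": "1.0-1", "b": "2.0-1", "c": "3.0-2", "d": "1.1-1"}
binaries = {"a": "1.0-1", "b": "1.9-1"}
fraction = uptodate_fraction(sources, binaries, excluded={"d"})
print(f"{fraction:.1%} up to date")
```

An arch would then be compared against whatever cutoff the historical
data suggests; the 98% figure under discussion is one candidate.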

Whether this causes porters to evaluate the speed of their buildd hardware,
the availability of their buildd admins, or the amount of attention they pay
to build failures on their archs, it's a win for release predictability
because it means the release team doesn't have to be in the business of
micromanaging ports.

> > > Currently m68k has ~650 packages listed that are not in state Installed (203
> > > Needs-Build, 142 Building, 180 Failed, 123 Dep-Wait (+ 5 Failed-Removed + 25
> > > Not-For-Us)). That's roughly 6% of all packages. 
> > Yes, and a week ago the m68k porter lists were informed that the current
> > state of m68k is unacceptable and that if significant progress wasn't made
> > soon, we would ask for m68k to be ignored for all testing propagation.

> Yes, and although there are at least two buildds that are not running
> currently (because the local admin was very busy in the past), and one
> of them even has a broken boot disk, the port is keeping up fairly well now.

Ok, fair enough; but AIUI the rebound happened when new buildds were brought
on-line, so the port *didn't* have excess buildd capacity beforehand, and
still doesn't until that local admin is available again.

> > If an architecture fails the release candidate criteria for whatever
> > reason, and is demoted to a non-release arch, I believe the sensible course
> > of action is to give the porters a fixed two-month period to remedy the
> > lapses before being re-evaluated by the release team.  That leaves the
> > porters free to focus on fixing whatever the issues are instead of scurrying
> > to get re-qualified ASAP, and it also ensures the release team's time isn't
> > wasted re-approving a port which qualifies at the instant but immediately
> > becomes a liability again after being approved.

> When a port isn't keeping up, the release team is already free to decide
> whether to release that port or not.

Is it?  Then why are we having this conversation about whether 98% is a
proper line to draw for "not keeping up"?

The reality is that historically, there have been several occasions where
I've felt that one architecture or another has been behind to the point that
it was making it harder to do release work, but I haven't been comfortable
bouncing the arch due to lack of clear precedent.  This is as much about
setting expectations as it is about anything else, so that we don't have to
have a flamewar (or people holding grudges) any time an arch gets ignored by
britney.  (Instead, we front-load the flamewars and grudges in the interest
of efficiency.)

> > graphics/gem_1:0.90.0-17: Dep-Wait by buildd_m68k-kiivi [optional:out-of-date]
> >   Dependencies: libjack0.80.0-0 (>= 0.99.0)
> >   Previous state was Building until 2005 Aug 10 08:44:01
> > This is a sampling of screwy dep-waits I was able to find by glancing
> > through the buildd.debian.org webpages, and excludes those that I've already
> > specifically asked the m68k porters to remove in the recent past because
> > they were holding up transitions.

> So, what is better: to set a dep-wait and maybe get it wrong, or to set
> no dep-wait at all and leave the package in Building state for weeks?

Which is more likely to get attention from a porter who happens to be in a
position to correctly diagnose the build failure:  a package which is listed
as "Building" with a failure build log to be looked at, or a package which
has already had its build log processed and marked as Dep-Wait?

Maybe the issue is that neither gets much attention; maybe that's why wrong
Dep-Waits happen in the first place. ;)  Then again, maybe the m68k buildd
maintainers do have time to periodically review stale dep-waits that they've
set, to check them for correctness; that would be a pleasant surprise.
Either way, it's only an issue for *me* when I notice it before someone else
does. :)  And I know it takes me longer to notice a wrong dep-wait than it
takes me to notice a maybe-failed package that could be requeued.

Anyway, this isn't end-of-the-world stuff, it's just a simple observation
about how having more people involved does bring a corresponding cost of
team coordination.

Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
vorlon@debian.org                                   http://www.debian.org/
