
Re: architecture-specific release criteria - requalification needed



On Tue, Sep 20, 2005 at 11:41:13PM +0200, Andreas Barth wrote:

> Now, looking more into details, the criteria are:
> |  * Availability:
> |     The architecture needs to be available for everybody, i.e.
> The reason for this should be obvious

The requirement of "available as new" has been dropped? Good. 

> |     it must be available without NDAs and
> Same for this. We're about free software.
> |     it must be possible to buy machines on the market.
> The reason should be obvious: Our users should be able to use the
> architecture.

And it doesn't matter if you can buy that piece of hardware in shops or on
ebay? Good.

> |  * Developer availability: The architecture must have a
> |    developer-available (i.e. debian.org) machine that contains the
> |    usual development chroots (at least stable, testing, unstable).
> This criterion is there so that any developer can actually find out what the
> issue is if his package fails to work on a specific architecture.  Of
> course, when adding a new architecture, there will be a time without a
> stable release, and there will be some special arrangement how such a
> machine can be provided without having even some packages in testing.  But
> that's not meant as a no-go, as long as we are quite optimistic that adding
> the new machine will actually work in time.

Well, for established ports, that shouldn't be a big deal, right?
For new ports this could be a chicken/egg problem, but you mentioned "some
special arrangements", so I guess, that's no problem as well. 

> |  * Users: The architecture needs to prove that developers and users
> |    are actually using it. Five Developers needs to certify in that
> |    they're actively developing on this architecture, and it needs to
> |    be demonstrated that at least 50 users are using the platform. We
> |    are counting users, not machines; e.g., one s390-installation
> |    with 50,000 users fulfils the user criterion just fine.
> As already discussed multiple times, the "50 users" really means "50
> individuals using that architecture".  Both criteria are there to make
> sure that an architecture gets just enough usage so that
> architecture-specific bugs are found in time.

Although it was discussed several times, I still have no idea how those
users should be counted.
Who has to show those numbers? The users? The porters?
Is it intended that a user should mail debian-release to say "Hi! I'm using
port X!"? I doubt it.
So a little more info is needed on how those numbers are counted.
Are users just meant to be ordinary users, or does that include developers?
Are developers meant to be DDs, or do other developers count as well?
For example, Roman Zippel does a lot of development work on m68k and even
pointed Wouter Verhelst to a solution for the gcc 4.0 ICE bugs, but he's
not a DD.

> |  * Installer: The architecture must have a working, tested installer.
> Obviously, we need an installer. Though that doesn't say "debian
> installer", we think that our users expect that there are not too many
> different ways for them to install the released version of Debian etch one
> day.

Some obscure bootloader and a tarball of a mini-installation would be fine
as well?

> |  * Porters and Upstream support: There is support by the porters and
> |    upstream. This is especially true for the toolchain and the
> |    kernel.
> Obviously, we cannot keep a port alive if there is nobody doing support for
> it.  Of course, it is quite possible that Debian and upstream support is
> done by the same persons.  And our experiences with support of gcc-4.0
> on m68k have shown that it is possible to get such issues fixed, if the
> porters are notified in time and are really interested in their port (and
> if there are enough porters).

Uh, well, I hope that slower archs will be given a longer time frame to fix
things than faster archs? It would be unfair to give just a week's time to
fix a problem when the recompilation of the package would take 10 days,
wouldn't it? ;)

> |  * Archive coverage: The architecture needs to have successfully
> |    compiled the current version of the overwhelming part of the
> |    archive, excluding architecture-specific packages.
> Our back-of-the-envelope number for this criterion is 98%.  As pointed
> out multiple times during recent discussions, we don't have a good way
> to measure an architecture's compliance with this yet, but we'll work on
> figuring that out; of course we will exclude hardware-specific packages and
> buggy optional/extra packages with severe portability issues, but
> porters must take responsibility for working with maintainers to fix
> portability issues.

I still believe this definition is far too strict (without being precise).
You can't say that a port has to be 98% up to date without saying what you
understand by "being up to date". As already outlined during the last
discussion: when all m68k buildds are busy building packages, that can
easily be more than 110 packages marked as Building and therefore not
counted as Installed (given a total of 5500 packages).
Currently m68k has ~650 packages listed that are not in state Installed (203
Needs-Build, 142 Building, 180 Failed, 123 Dep-Wait (+ 5 Failed-Removed + 25
Not-For-Us)). That's roughly 6% of all packages.
And when does that percentage need to be reached? After the freeze or at any
time before a release?
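
To put numbers on the above, here is a rough sketch in Python. The function
names and the way the 98% mark is checked are purely illustrative assumptions
of mine, not how the release team or the buildd infrastructure measure
coverage (the announcement itself says there is no agreed way to measure it
yet). The state counts are the m68k figures quoted above; the package total
is left as a parameter, because which total the percentage should be taken
against is exactly what is undefined.

```python
# Illustrative sketch only: names and the threshold check are assumptions,
# not an existing Debian tool or an official definition of "coverage".

def not_installed(states):
    """Sum the packages over all states that are not 'Installed'."""
    return sum(states.values())


def meets_threshold(total, missing, percent=98):
    """Return True if (total - missing) is at least `percent` of total.

    Integer arithmetic is used so that borderline cases are exact.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= missing <= total:
        raise ValueError("missing must be between 0 and total")
    return (total - missing) * 100 >= percent * total


# m68k state counts as quoted in this mail.
m68k_states = {
    "Needs-Build": 203,
    "Building": 142,
    "Failed": 180,
    "Dep-Wait": 123,
}
missing = not_installed(m68k_states)

# The "110 packages marked as Building out of 5500" case from above.
busy_only_ok = meets_threshold(5500, 110)
one_more_ok = meets_threshold(5500, 111)

if __name__ == "__main__":
    print("not installed:", missing)
    print("110 of 5500 missing meets 98%:", busy_only_ok)
    print("111 of 5500 missing meets 98%:", one_more_ok)
```

With a total of 5500 packages, 110 packages that are merely being built
already use up the whole 2% margin, so a single additional package in any
other state would put the port below the mark.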

> |  * Autobuilder support: 
> |    The architecture is able to keep up with unstable
> This is obviously needed. If the architecture cannot keep up, there is no
> way to support it in a stable release.

This is current policy as well. 

> |    with not more than two buildds,
> That is one of the most discussed criteria. As mentioned previously [2],
> there is a nontrivial cost to each buildd, which increases super-linearly;
> there have been cases in the past where this resulted in ports with many
> autobuilders slacking when updates were necessary.

*sigh*

> When reviewing the past however, m68k as the architecture with the most
> autobuilders isn't performing too bad regarding the availability of the
> autobuilders.  So, there is the chance for m68k to get grandfathered in
> for this clause.  However, we expect that they explain why the higher
> numbers of buildds they use are not as bad increasing the maintenance
> overhead.

I think (and believe that many DDs will agree) that m68k, although being one
of the slowest archs, is one of the most responsive ports within Debian and
that having that many buildds is nothing negative at all. 
The m68k port's autobuilder infrastructure is highly redundant and robust
against failures. Its secret lies in its team-driven effort. Whereas other
ports might be handled by just one or a few persons, the m68k buildd admins
are a highly cooperative and communicative team. Therefore there's no big
maintenance overhead. Every buildd admin is only responsible for a small
number of buildds, but can easily jump in if another buildd admin is on
holiday or absent for other reasons.

I think the good performance of m68k (when it is not currently hit by some
gcc ICEs ;) is proof enough of why it is good to have a "high number of
buildds, large number of buildd admins" approach.

> |    has redundancy in the autobuilder network,
> This is the "it needs to have N+1 buildds" - just in case some buildd has
> hardware failures or whatever else.  History told the release team that
> redundancy is really necessary.  No one in the release team wants to be 
> in the position of tracking where a box in Europe is just located, and
> prodding some developer in that country to pick the box up, because that
> box has become the largest blocker of the next stable release.

This one surprises me the most.
History has already shown that n+1 redundancy might not be enough either.
For example, mips was hit by that last year. There were 3 buildds that
were broken at that time.
And when you mention redundancy, why don't you mention redundancy of
buildd admins as well? History has shown this problem very clearly more
than once, too.
No offense, but when you speak about redundancy of buildd hosts, you should
also think about redundancy of buildd admins. It doesn't make much sense to
have n+1 redundancy on machines when the only person who can do the work is
ill and can't do it.

> |    keeps their autobuilders running for 24x7,
> Of course, autobuilders can have hardware maintenance.  But the
> autobuilders need to be able to run 24x7, and they need to be generally
> up all the time (and thanks to the redundancy above, there should always
> be an autobuilder currently running).

The more buildds, the higher the chances that at least one buildd is up and
running. 24/7 is just normal for a buildd. 
Sometimes it seems problematic to find a porting machine, though, because
db.debian.org/machines.cgi is not always up to date.
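
The claim that more buildds raise the chance of at least one being up can be
sketched as follows. This is a simplified model under loudly stated
assumptions: every buildd is up with the same probability and failures are
independent of each other. The function name and the 0.9 uptime figure are
hypothetical and only serve as illustration; they are not measured values
for any port.

```python
# Simplified model, assuming independent failures and equal uptime per buildd.

def p_at_least_one_up(n_buildds, p_up):
    """Probability that at least one of n_buildds machines is up.

    Under the independence assumption this is 1 - (1 - p_up) ** n_buildds.
    """
    if n_buildds < 0:
        raise ValueError("n_buildds must not be negative")
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("p_up must be between 0 and 1")
    return 1.0 - (1.0 - p_up) ** n_buildds


if __name__ == "__main__":
    # 0.9 is an arbitrary example uptime, not a measurement.
    for n in (1, 2, 3, 8):
        print(n, "buildds:", p_at_least_one_up(n, 0.9))
```

The independence assumption is the weak point: the mips case mentioned
earlier in this mail, where 3 buildds were broken at that time, shows that
failures can coincide, so real availability may be lower than this model
suggests.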

> |    has autobuilders acceptable for security support.
> If we want to do security support, that of course needs to be there.

Again, is it required that those security buildds are different machines
from normal buildds or how should that be handled? And for what dists is
this required? For stable security, testing-security, etch-secure?

> 
> So, of course the question is:  Which of the current architectures do
> fulfill this set of requirements?  To get this answer, and because we
> know there are currently architectures which do *not* meet the
> requirements, all architectures will need to be requalified.

How shall this requalification be handled? Who has to be contacted to
request a requalification?

> We added an overview page about the release criteria on
> http://release.debian.org/etch_arch_criteria.html and on the
> requalifying on http://release.debian.org/etch_arch_qualify.html.
> 
> Porters, please feel free to prove us compliance of your architecture with
> the remaining issues (that are all as of today :) - or rather, please start
> on this, as this needs to be done soon.  We will follow up when we made some
> progress, so that you all see what parts and architectures are more
> problematic before any decision is done.  We hope to finish this in the
> next two months.

Ok, let's have a look at http://release.debian.org/etch_arch_qualify.html:

For m68k: 
- availability: yes, still available 
- devs available: yes
- 5 devs (porters): think so, yes: cts, adconrad, smarenka, smurf, rnhodek,
wouter, schmitz, younie
- 50 users: guess so, but need to be counted somehow... 
- installer: yes
- porters & upstream support: yes
- archive coverage: well, could be better atm due to toolchain and
transition problems
- buildds keep up with unstable: usually yes
- buildd N<=2: no, not without crosscompiling, requesting exception
- buildd redundancy: best redundancy of all :)
- buildd security: think so, yes
- buildd 24/7: of course

> And of course, as always: If there are any issues that you think aren't
> addressed properly, please feel free to contact
> debian-release@lists.debian.org - but remember: that address is not a
> discussion list, so if you rather want to discuss, select a more
> appropriate place for that.

Done. Limiting to d-d... 


-- 
Ciao...                //        Fon: 0381-2744150 
      Ingo           \X/         SIP: 2744150@sipgate.de

gpg pubkey: http://www.juergensmann.de/ij/public_key.asc
