
Re: Why is only the latest unstable package considered for testing?



Bjorn Stenberg wrote:
> Bob Proulx wrote:
> > Just minor searching through the archive turned these up with relevant
> > discussion. 
> 
> These posts, like your reply in debian-testing, concern packages that are not
> Valid Candidates.

But they did concern how testing operates, and they give insight into
the design of the pipeline from unstable through testing to stable.
There has been a lot of discussion in general about how to improve
testing so that we can get to a release faster.  Isn't that what you
are trying to do here too?  Improve testing so that we can generate
high quality releases more quickly?

> My question concerns perfectly working packages that are suitable
> for testing, yet are never considered.

Okay, let me give it another try.

> I have been looking at the "excuses" page[0] and was struck by how
> very old some packages are in testing, yet only the very latest
> bleeding edge version from unstable appears to be considered for
> inclusion.

The version in unstable is not _supposed_ to be bleeding edge.  It
frequently is, but it is not designed to be.  Packages there are
supposed to be staging for testing, which in turn is staging for
release.  If a maintainer thinks a package is bleeding edge then it
should not even go into unstable.  However, since you sometimes cannot
tell whether a package has a problem until it meets the world, and you
have to start somewhere, unstable is the best place to start.

> Am I misunderstanding something, or does this approach "punish"
> projects that adhere to the Open Source motto to release often?

Releasing often is a good thing for research and development but bad
for integration and testing.  Therefore a package needs at least some
amount of stability so that we can be sure it has made it through
minimal testing.

The 10 day wait in unstable is much shorter than "often" and still
leaves plenty of opportunity for rapid updates.  A project that
releases monthly, for example, should have no trouble at all with the
10 day cooking period in unstable.  That would still be considered
updating often.  You are not "punished" unless your release cycle is
shorter than the waiting period.  And it is not really punishment,
just an impedance mismatch.

The reason there is a waiting period is to give the package some
exposure.  Other packages will build against it.  People will get a
chance to try it out.  How much time is enough to ensure adequate
testing?  If it is a very popular package then a couple of days is
enough.  But if it is obscure then perhaps even several months is
really not enough.  But surely some non-zero amount of exposure time
is always required.
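
To make the waiting period concrete, here is a minimal sketch of the
kind of age check involved.  The 10 day threshold for low urgency is
the figure discussed above; the other numbers, the constant and the
function name are my own illustration, not the actual testing scripts.

    from datetime import date

    # Hypothetical thresholds; the real scripts take these from configuration.
    # The 10 day figure for "low" urgency is the waiting period discussed above.
    MIN_AGE_BY_URGENCY = {"low": 10, "medium": 5, "high": 2}

    def old_enough(uploaded_on, urgency, today):
        """Return True once the package has cooked in unstable long enough."""
        required = MIN_AGE_BY_URGENCY.get(urgency, 10)
        return (today - uploaded_on).days >= required

    # A package uploaded with urgency=low on the 1st is not a candidate
    # on the 8th, but is on the 11th.
    print(old_enough(date(2002, 9, 1), "low", date(2002, 9, 8)))   # False
    print(old_enough(date(2002, 9, 1), "low", date(2002, 9, 11)))  # True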

If you are releasing really often, say a daily CVS snapshot build, then
those snapshots are probably not suitable for zillions of people to be
installing daily.  Small ripples in the upstream current can create
huge waves crashing against the far shore accompanied by inland
flooding.  That type of release is really more suitable for people who
have adopted the package as one that they care about and are willing
to find and fix bugs in.  I will claim right here and now that there
are different types of "releases" and while the daily build is a very
useful one it is a different type and for a different purpose than a
full distribution release.  And of course there is a whole range of
sizes in between.  You have to know your target audience and match
things appropriately.

> Hypothetical example:
> 
> Project X makes an effort to prepare a solid release, squashing all RC
> bugs and making sure each target builds flawlessly. They pack it up,
> label it "3.0" or whatever and release it. The package goes into
> unstable and, being a non-critical update, needs 10 days to become a
> "Valid Candidate"[1] for testing.
> 
> For a while, people have been working on a big patch to move project X
> from gnome to gnome2. This was submitted to the project but was
> delayed until after 3.0. Now that 3.0 is out the door and the users
> have a stable version to work with, the gnome2 patch goes in and a new
> version, 3.1, is released only a few days later. This version is not a
> valid testing candidate, since gnome2 is not yet included in testing,
> but it's a welcome update for those running gnome2/unstable.
> 
> Now the catch: Since the testing scripts only consider the latest
> unstable version, the testing-ready 3.0 version is never
> considered. Instead, the 3.1 version is rejected (due to depending on
> gnome2) and the old 2.0 version is kept.

The DD who uploaded the 3.0 version to unstable should work to make
sure it enters testing.  Until it does they should not upload a less
stable version.  They should drive the process and not let the process
drive them.

Since, as your first paragraph describes, they spent a significant
amount of time preparing the package, squashing all the bugs and
getting it to build on all platforms, they should not throw that work
away and overwrite it with a less well tested or possibly more buggy
version, or they will have to start the process again.  Just because
upstream released another version does not mean that it must go
immediately into unstable.  They may need to manage the project.
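
To illustrate why the 3.1 upload in the example is rejected, here is a
rough sketch of a dependency check against what testing already
contains.  The package names and the function are hypothetical; the
real scripts also deal with versioned dependencies, alternatives, and
groups of packages that have to migrate together.

    # Sketch of the dependency side of the "Valid Candidate" decision: a new
    # version is held back if it depends on something testing does not ship.
    def missing_in_testing(candidate_deps, testing_packages):
        """Return the dependencies that testing cannot satisfy."""
        return [dep for dep in candidate_deps if dep not in testing_packages]

    testing = {"libc6", "gnome-libs", "xlibs"}          # illustrative contents

    print(missing_in_testing({"libc6", "gnome-libs"}, testing))
    # []               -> version 3.0 would be installable in testing
    print(missing_in_testing({"libc6", "gnome2-libs"}, testing))
    # ['gnome2-libs']  -> version 3.1 is held back until gnome2 migrates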

> Is there a good reason for this? Would it not be better to track all
> versions of a package and include the latest (if any) that fulfills
> all requirements? It seems to me that the current system leaves
> testing with older versions than necessary.

Mike Stone wrote:
> So who's testing the older version that's superseded in unstable
> and not present in testing?

And this points out one of the criteria: the package must be available
and in use for some period of time.  Once it is removed from unstable
it is no longer available, and it is no longer a Valid Candidate for
moving into testing.
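
In other words, only the version that is currently sitting in unstable
is even eligible; a superseded snapshot simply drops out of
consideration.  A minimal sketch of that rule, with names of my own
choosing:

    def is_candidate_version(version, unstable_version):
        """A version superseded in unstable is no longer there to migrate."""
        return version == unstable_version

    print(is_candidate_version("3.0-1", "3.1-1"))  # False: 3.0-1 was replaced
    print(is_candidate_version("3.1-1", "3.1-1"))  # True: still in unstable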

Bjorn Stenberg wrote:
> Good point.
> 
> Is the consideration history of a package recorded anywhere?
> 
> As an example, take curl[0]. The testing version is 7.9.5-1, released
> 2002-03-13. Since then, 13 versions have been released. The latest is not
> considered since it depends on gcc-3.2, but is there any way to find out why
> the previous 12 versions were not considered?
> 
> [0] http://packages.qa.debian.org/c/curl.html

If *any* snapshot were a Valid Candidate then that would effectively
eliminate the waiting period entirely.  A package could be accepted
into unstable, immediately replaced with another, and the first could
still move into testing.  That creates a problem.

Why was it replaced immediately?  Perhaps the maintainer found a bad
bug late and caught it and updated immediately.  Now that first
package has a bad bug.  But since no one will ever see it there will
be no bug reports against it.  Since it was replaced so quickly no
other packages will ever be built against it in unstable.  There will
be no testing whatsoever.  But by the records it would look good.  No
bugs, 10 days, so move it into testing, where suddenly the problems
become apparent and testing sees the bug.  That would be bad.  How
would you prevent this problem case?
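
One way to see why the current behaviour avoids this: the scripts track
only the version now in unstable and the date it arrived, so each new
upload restarts the exposure clock and the replaced version disappears.
A small illustrative sketch, not the real code:

    from datetime import date

    class UnstableEntry:
        def __init__(self, version, uploaded_on):
            self.version = version
            self.uploaded_on = uploaded_on

        def upload(self, version, on):
            # Replacing the package discards the old version and resets the clock.
            self.version = version
            self.uploaded_on = on

        def age(self, today):
            return (today - self.uploaded_on).days

    pkg = UnstableEntry("3.0-1", date(2002, 9, 1))
    pkg.upload("3.0-2", date(2002, 9, 3))    # bad bug found, fixed two days later

    print(pkg.version, pkg.age(date(2002, 9, 12)))
    # 3.0-2 9  -> the buggy 3.0-1 is gone, and 3.0-2 still needs another day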

Bob
