
Re: Potato now stable



On Wed, Aug 23, 2000 at 06:42:39PM +1000, Drake Diedrich wrote:
> > The structure Jason and I have been discussing is slightly different,
> > and has some really nice properties. The archive layout would be
> > something like:
> > 	pool/main/libf/libfoo/libfoo-dev_1.2-1_i386.deb
> > That is:
> > 	* separate pools for each component (main, contrib, and non-free,
> > 	  and presumably non-US/main, non-US/contrib, and non-US/non-free
> > 	  also)
>    The design I was working on would have these all as separate archives
> consulting a common database.

I don't really understand what you mean by this. Can you give a sample
layout? What do you mean by "archive"?

(Here's a glossary as I understand it:
	Component : main, contrib, non-free, etc
	Distribution : stable, woody, potato, etc
	Architecture : binary-i386
)

> Constraints can be (and currently are)
> applied to ensure that there are no conflicting overlaps between archives
> (same path, different checksum). It's not much more complexity in pinstall
> (a few lines in the hash function) to add greater detail, but with
> multiple-archive capabilities the need is less than in the dinstall case.
> There's also no reason two archives can't be on the same machine.  Sharing
> the pool though would be a bit more complex and defeat the point of
> separating them.  pool/main and pool/non-free could both be in the same
> directory though, they'd just need separate incoming directories to feed
> them.

Similarly, I don't really see what you mean by "different paths", or
why different archives might need to be on different machines, or what
you mean by sharing the pool, or why any of this requires different
incoming directories...

> > 	* hashing based on either the first character of the source
> > 	  package, or the first four characters, if it's a library source
> > 	  (to keep it fairly evenly distributed)
>    It's certainly possible to do this (just coded, untested), but I question
> whether it's actually necessary.  A few thousand source directories right in
> pool isn't all that difficult to handle by the filesystem code, especially
> if it's relatively static and cached.

Well, it's mainly so people with ftp clients have *some* chance of
navigating through it all. If we wanted, we could just dump every .deb
ever straight in pub/debian/pool and expect the filesystem to cope with
that too.
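
As a rough sketch of the hashing described above (the function names
here are made up for illustration, and are not taken from dinstall or
pinstall), the path for a .deb could be derived like this:

```python
def pool_prefix(source: str) -> str:
    """Hash directory for a source package.

    Library sources ("lib...") use their first four characters so they
    don't all pile up under "l"; everything else uses the first character.
    """
    if source.startswith("lib") and len(source) > 3:
        return source[:4]
    return source[:1]


def pool_path(component: str, source: str, binary: str,
              version: str, arch: str) -> str:
    """Build the pool path from the five things needed to find a package."""
    return (f"pool/{component}/{pool_prefix(source)}/{source}/"
            f"{binary}_{version}_{arch}.deb")


print(pool_path("main", "libfoo", "libfoo-dev", "1.2-1", "i386"))
# -> pool/main/libf/libfoo/libfoo-dev_1.2-1_i386.deb
```

The point of the prefix is only to keep each directory listing short
enough to browse; the filesystem itself doesn't need it.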

>    Separating main/contrib/non-free is problematic still though, as the .dsc
> files do not list a section, and source files are installed in the first
> pass before the .debs are (completely avoids the case where .debs have no
> source).

It's the .changes file dinstall looks at to decide where things go,
though.  So you either get an upload to a "Distribution: non-free",
or the files get marked to go into "non-free/blah".

As far as existing .debs go, all you need to do is look at what directory
they're located in right now.

> Source packages that want to generate both free and non-free
> packages are also problems, but if they're written so that their behavior
> can be controlled at build time to create one or the other then the
> identical source could be uploaded to both archives under the same hash. The
> only easy alternative to this I can think of is banning free/non-free source
> packages (insisting that they be split or duplicated).  Looking in the .debs
> ahead-of-time to decide where the source goes would be difficult.

Having sources that build both non-free and main packages isn't possible: if
the license for the source is DFSG-free, you can build free binaries. If
it's not, the source shouldn't go in main in the first place. DFSG-free
sources that build DFSG-free packages, only some of which depend on
non-free software, are more plausible, but we're already tending to demand
they be split anyway, AIUI.

> > This means to find a package you need to know five things:
> > 	its name
> > 	the version you want
> > 	the architecture you want it for
> > 	the name of its source package
> > 	the component it's in
> > It also means it's trivial to not mirror non-free if you want.
>    Separate archives would also be trivial.  There are a few cases where it
> would be advantageous to mirrors to allow overlaps, such as license
> interpretation changes, dual free/non-free source packages, personal
> non-distributable CDs, ...  2nd Law: it's easier to mix separate things than
> to unmix combined things.

A license interpretation change means the package should *not*
be in the component it was previously in, not that it should
suddenly be in two components. Dual free/non-free sources aren't
reasonable. Non-distributable things can't be packaged.
 
> CREATE TABLE pool (
>         distribution INT4 NOT NULL REFERENCES distribution,
>         deb     INT4 NOT NULL REFERENCES deb,
>         arch    INT4 NOT NULL REFERENCES arch,
>         section TEXT,
>         install TIMESTAMP
> );

This really isn't helpful at all. All the integers don't interest me
at all. What exactly is all this stuff, what are the tables that are
apparently referenced, what are the primary and secondary keys, and why
did you split it like you have? These are the questions that you need
to answer if you want the SQL stuff you've written to be taken seriously.

Having code's all very well, but if all you're going to say beyond that is
"take it or leave it", it's probably just going to be left.

> > It's probably suboptimal to have to have separate incoming queues.
> > The above layout basically just means you have to construct a new
> > "Component:" field for the all-packages-in-the-pool table.
>     Which means new uploads of every package?  This wouldn't be required
> with separate upload queues and separate archives - just don't process the
> wrong set of packages when preloading each archive pool with the old
> archives.

Huh? It's trivial to work out which component each .deb is in right
now: you just look at its path, or the path of the Packages files that
reference it.
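
A minimal sketch of that lookup, assuming the existing layout is
dists/<distribution>/<component>/binary-<arch>/... with non-US
components nested one level deeper (the function name and the exact
layout are assumptions for illustration, not dinstall code):

```python
COMPONENTS = ("main", "contrib", "non-free")


def component_from_path(path: str) -> str:
    """Work out a file's component from its location under dists/."""
    parts = path.strip("/").split("/")
    if "dists" not in parts:
        raise ValueError(f"not under dists/: {path}")
    # Skip "dists" itself and the distribution name (potato, woody, ...).
    rest = parts[parts.index("dists") + 2:]
    if rest and rest[0] == "non-US":
        if len(rest) > 1 and rest[1] in COMPONENTS:
            return "non-US/" + rest[1]
    elif rest and rest[0] in COMPONENTS:
        return rest[0]
    raise ValueError(f"no recognisable component in: {path}")


print(component_from_path(
    "dists/potato/main/binary-i386/devel/foo_1.0-1_i386.deb"))
# -> main
```

Running something like this over the existing tree would fill in the
"Component:" field without anyone having to re-upload anything.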

> > If experimental .debs are to be included in the pool, the above would
> > probably imply we'd end up with:
> > 	dists/experimental/{main,contrib,non-free}
> > which probably isn't a bad thing.
>    I'm currently including them in the main pool, they just only get listed
> in the dists/experimental/*/Packages.gz files and nowhere else.  It probably
> won't happen often, but experimental packages might occasionally end up
> being good enough for the stable track.

Experimental packages supposedly operate under the constraint that the
version of the package in experimental is strictly greater than the
corresponding version in unstable; when that's violated, the package should
disappear from experimental.
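
That rule (an experimental package stays only while its version is
strictly greater than the one in unstable) could be sketched as below.
The names are hypothetical, the version comparison is left to the
caller, and keeping packages that have no counterpart in unstable is an
assumption. The comparator in the example is a toy stand-in that only
handles dotted integers; it is not real Debian version ordering.

```python
from typing import Callable, Dict


def prune_experimental(experimental: Dict[str, str],
                       unstable: Dict[str, str],
                       newer: Callable[[str, str], bool]) -> Dict[str, str]:
    """Keep experimental packages strictly newer than their unstable version.

    Packages with no counterpart in unstable are kept (an assumption).
    """
    return {name: version
            for name, version in experimental.items()
            if name not in unstable or newer(version, unstable[name])}


def toy_newer(a: str, b: str) -> bool:
    """Toy comparison for dotted-integer versions only, e.g. "1.10" > "1.9"."""
    return tuple(map(int, a.split("."))) > tuple(map(int, b.split(".")))


kept = prune_experimental({"foo": "2.0", "bar": "1.0"},
                          {"foo": "1.5", "bar": "1.0"},
                          toy_newer)
print(kept)
# -> {'foo': '2.0'}
```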

>    Under the current implementation dists/ still holds all of the old
> dinstalled .debs and sources.  New uploads (and all of the .changes files)
> go to pool/.  New Packages/Sources/Contents files are generated and placed
> in dists/ as well.  Eventually dists/ will be emptied of .debs and sources
> with no flag day.

Yeah, well, that goes without saying. Remirroring the whole archive in
a day isn't reasonable.

Cheers,
aj

-- 
Anthony Towns <aj@humbug.org.au> <http://azure.humbug.org.au/~aj/>
I don't speak for anyone save myself. GPG signed mail preferred.

  ``We reject: kings, presidents, and voting.
                 We believe in: rough consensus and working code.''
                                      -- Dave Clark
