Re: non-free/contrib policy
On 21 Jul 1997, Rob Browning wrote:
> 7. Huge symlink farms can be trouble for people trying to use
> mirror to capture a portion of the distribution (say
> main/source, contrib/source, non-free/source, and
> non-US/source). AFAIK the way you set up mirror to do this
> flattens all the symlinks which could (if files are symlinked to
> multiple locations) result in a large number of duplicate files.
> We need a smarter mirror that can generate symlinks if and only
> if one copy of the file is already in (or about to be in) the
> mirror tree.
Firstly, I infer from your mention of main/source, contrib/source,
etc. that you are envisioning a different directory tree structure
than I had tried to describe. I was thinking of something like this:
/pub/debian                            # or somewhere -- the distribution root
  /codename                            # e.g., hamm
    /packages                          # real files -- __all__ the packages
      /source, /binary-i386, etc.
    /main                              # symlink farm -- Official Debian only
      /source, /binary-i386, etc.      # symlinks into ../../packages
    /main+contrib                      # symlink farm -- Official Debian & contrib
      /source, /binary-i386, etc.      # symlinks into ../../packages
    /main+noex-us                      # symlink farm -- OD & noex-us
      /source, /binary-i386, etc.      # symlinks into ../../packages
    /main+noex-us+contrib              # symlink farm -- OD & noex-us & contrib
      /source, /binary-i386, etc.      # symlinks into ../../packages
    /main+contrib+noex-us+non-free     # another symlink farm
      /source, /binary-i386, etc.      # symlinks into ../../packages
    etc.
Given this, if I understand you correctly (I'm not a mirror user. I'm
taking "flatten the symlinks" to mean downloading the target file itself
instead of duplicating the link), the sources for main and contrib
could be obtained by mirroring those links from the main+contrib
tree and flattening the links.
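
Flattening in the sense taken above might look roughly like this in
Python. It is only a sketch and not how the mirror program works:
`flatten_links` and its arguments are made-up names. It also applies the
idea from the quoted point 7 of creating a symlink only when a copy of
the file is already in the local tree, so a file linked under two names
is copied once.

```python
import os
import shutil


def flatten_links(farm_dir, dest_dir):
    """Copy the targets of the symlinks in farm_dir into dest_dir.

    Each distinct target file is copied once. A later link that resolves
    to an already-copied target becomes a relative symlink to that copy.
    Returns a dict mapping each real target path to its first copy.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = {}
    for name in sorted(os.listdir(farm_dir)):
        link = os.path.join(farm_dir, name)
        target = os.path.realpath(link)  # follow the link to the real file
        if not os.path.isfile(target):
            continue  # this sketch skips subdirectories and dangling links
        dest = os.path.join(dest_dir, name)
        if target in copied:
            # A copy already exists in dest_dir: link to it instead.
            os.symlink(os.path.basename(copied[target]), dest)
        else:
            shutil.copyfile(target, dest)
            copied[target] = dest
    return copied
```

Calling `flatten_links("main+contrib/source", "local/source")` (paths
hypothetical) would leave one real file per distinct package in
`local/source`, plus symlinks for any extra names.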
What you describe, as I take it, would replace those symlink farms with:
/main # symlink farm -- Official Debian only
/contrib # symlink farm -- contrib only
/noex-us # symlink farm -- noex-us only
/noex-dutch # symlink farm -- noex-dutch only
/non-free # symlink farm -- non-free only
etc.
That's more like the present structure. It also requires smaller
symlink farms, and probably fewer of them as well (one per
"area" rather than one per downloadable combination of same.)
Downloaders would then download all the separate areas they wanted.
As you point out, though, it would seem to be necessary to avoid
categorizing packages so that they fall into more than one area, in
order to avoid duplicating downloaded files. With this restriction, this
scheme just duplicates the current division by directory in symlink
farms. There's probably little benefit to be had from doing this.
However, that restriction against duplications is a problem. We
currently avoid duplications between non-us and main, non-free, contrib
by saying (1) main, non-free, contrib include no non-us packages; and
(2) non-us makes no determination about whether packages located
there would fall into the main, contrib, or non-free areas. That would
be difficult to do between, for example, noex-us and noex-dutch. The
governments of both the U.S. and the Netherlands could conceivably
decide to prohibit the export of crypto software. The Netherlands
could also conceivably prohibit the export of packages which we would
otherwise want to place in the main distribution (e.g., gcc). (The
Netherlands is used here only as an example; it could happen elsewhere
too.)
I think the scheme I suggested deals better with such situations
than the current scheme. However, even having suggested it, I'm
not really happy with it myself. Two objections to it which did
not occur to me when I suggested it are:
8. It's inflexible, requiring central determination of what
distribution configurations are to be offered (e.g.,
main+noex-US+contrib, main+noex-dutch+contrib, ...).
9. It would need __lots__ of symlinks and, consequently, __lots__
of inodes on the ftp sites and mirrors.
/cdrom/bo/binary-all: 232
/cdrom/bo/binary-i386: 1035
/cdrom/bo/disks-i386: 34
/cdrom/bo/msdos-i386: 1034
/cdrom/bo/source: 2233
Leaving out the current msdos-i386 symlink farm, that's 3534
nodes in this one main distribution tree.
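
The arithmetic can be checked, and roughly extended to several farms,
with a few lines of Python. The counts are the ones listed above. The
per-farm multiplication is an assumption for illustration only: it
treats every combination farm as being the same size as main, which
understates farms that also link contrib or non-free packages.

```python
# Entry counts copied from the listing above.
counts = {
    "binary-all": 232,
    "binary-i386": 1035,
    "disks-i386": 34,
    "msdos-i386": 1034,
    "source": 2233,
}

# Total for one tree, leaving out the msdos-i386 symlink farm.
per_farm = sum(n for name, n in counts.items() if name != "msdos-i386")


def total_links(farms):
    """Rough lower bound on symlinks needed for `farms` farms of main's size."""
    return per_farm * farms


def max_farms(optional_areas):
    """Upper bound on combination farms: main plus any subset of the others."""
    return 2 ** optional_areas
```

Here `per_farm` comes to 3534, matching the figure above. The five farms
shown in the tree earlier would need at least `total_links(5)` links,
and offering every combination of three optional areas would mean
`max_farms(3)` farms.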
I think both of these are pretty serious objections.
I think I got carried away with my multi-point suggestion. It tried
to do too much. I think I'd like to break it down into smaller parts:
A. Points 1-3. Put the primary distribution site outside
the U.S., where export restrictions are not an issue.
Put all uploaded packages on that site, and feed the world
from there. Local mirrors might mirror all the packages,
or some subset.
B. Point 4. Add a required control file "Distribution" field,
providing a way for maintainers to target uploads at particular
distributions (e.g., bo, hamm; or possibly release-1.3,
release-2.0).
C. Point 5. Add an optional control file field to specify
(restricting?) characteristics of a package. I called this
"area" earlier, after the current areas such as "contrib" and
"non-free". However, I now think this field should be called
something else.
Perhaps this would be the "Flaws" field which someone suggested
earlier. This field would contain keywords having agreed and
documented meanings (e.g., "non-DFSG" for DFSG violators,
"noex-dutch" for Netherlands export-restricted, possibly
"orphan" for orphaned packages, etc). This would allow
maintainers to flag policy-defined "Flaws". Packages could
then be placed appropriately on the primary FTP site based on
these maintainer-flagged "Flaws". The "Flaws" information
would also be reported by `dpkg --info` and `dpkg --status`.
D. Points 5 (the second one; I misnumbered) and 6. Restructure
the geometry of the package tree(s) and symlink farm(s).
(But we first need to agree on the objectives of the restructuring,
and on a plan that meets those objectives.)
I think A is ripe for current consideration. B and C are probably
ripe for discussion. D needs more thought.
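
To make B and C a little more concrete, here is a minimal Python sketch
of reading the proposed "Distribution" and "Flaws" fields from a control
file and letting a mirror skip packages with flaws it does not want to
carry. Everything here is an assumption for illustration: the helper
names are made up, the fields are only proposals, and the keyword
separator (whitespace or commas) has not been specified anywhere.

```python
def parse_control(text):
    """Parse 'Field: value' lines into a dict keyed by lower-cased field name."""
    fields = {}
    for line in text.splitlines():
        if not line.strip() or line[0].isspace():
            continue  # this sketch ignores blank and continuation lines
        key, _, value = line.partition(":")
        fields[key.strip().lower()] = value.strip()
    return fields


def flaws_of(fields):
    """Return the set of keywords in the (optional, hypothetical) Flaws field."""
    return set(fields.get("flaws", "").replace(",", " ").split())


def mirror_accepts(fields, excluded_flaws):
    """True if the package carries none of the flaws this mirror excludes."""
    if "distribution" not in fields:
        # Point B proposes making this field required.
        raise ValueError("missing required Distribution field")
    return not (flaws_of(fields) & set(excluded_flaws))
```

For example, a package whose control file says
`Flaws: non-DFSG noex-dutch` would be skipped by a mirror that excludes
`non-DFSG`, while a package with no "Flaws" field would be accepted by
any mirror.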