Re: Bug#93612: Support for new archive structure
[Sorry for the terrible delay. It's just that I'm completely overloaded and
am also trying to do my thesis work at the same time, so I'm only now catching
up with the older discussions.
Actually, I had scheduled some time yesterday to spend on this, and was
planning to write some large and fundamental document from various
perspectives, but then the DHCP server of my cable ISP broke down and whatever
I tried on my end I couldn't get any packet through. So much for my planning.
Maybe I'll do the "big work" later (if ever..), for now just one aspect.]
On Sat, 14 Apr 2001, Jason Gunthorpe wrote:
> On Sat, 14 Apr 2001, J.A. Bezemer wrote:
> > Generalizing: I think the authentication sequence should be like this for
> > _any_ medium (cdrom,file,ftp,http,whatever):
> >
> > Release/Release.gpg
> > |
> > Packages.complete
> > |
> > Packages
> > |
> > .deb file
> >
> > with the special case that Packages.complete may be omitted if it is
> > exactly the same as Packages. I know this is a "radically" different way to
>
> Yes, it will wave its magic wand and determine that Packages.complete and
> Packages are in fact, the same.
I think I didn't make myself very clear. There's no magic involved at all.
On the Debian FTP mirrors, we have the following files, just like we have
now:
Release(.gpg)
Packages
*.deb
On CDs, we have
Release(.gpg)
Packages.complete (or whatever name)
Packages
*.deb
But instead of making the CDs a special case, we consider the FTP sites a
special case (but in a more general sense). What I mean is we have (i.e. APT
has) this pseudo-code for _all_ access methods:
    rel_fd = open("Release");
    relgpg_fd = open("Release.gpg");
    if (rel_fd == -1 || relgpg_fd == -1) { sorry, can't check; exit(-1) }
    if (!gpg_check(rel_fd, relgpg_fd)) { check failed!; exit(-1) }
    pack_fd = open("Packages");
    packcompl_fd = open("Packages.complete");
    if (packcompl_fd == -1)
        packcompl_fd = pack_fd;  /* Magic? Don't think so.. */
    if (pack_fd == -1) { sorry, no packages; exit(-1) }
    if (!check_md5(packcompl_fd, rel_fd)) { check failed!; exit(-1) }
    if (!genuine_subset(pack_fd, packcompl_fd)) { check failed!; exit(-1) }
    add_to_available_packages(pack_fd);
Here open() stands for the method's specific file access function, like
download_via_HTTP_to_/var/state/apt/lists/_and_open() for the HTTP
method.
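
To make the check_md5() step above a bit more concrete, here is a minimal
shell sketch. It assumes the Release file lists each index as a line of the
form " <md5> <size> <path>", and that a Packages.complete on a CD is
byte-identical to the full Packages file on the FTP site, so it is looked up
under that file's path. The function name and the example path are
illustrative only, not APT's actual code.

```sh
#!/bin/sh
# check_md5 FILE RELEASE NAME
# Succeeds only if RELEASE has a line " <md5> <size> NAME" whose hash and
# size match FILE. Assumption: Release uses this three-column layout.
check_md5() {
    [ -r "$1" ] && [ -r "$2" ] || return 1
    sum=$(md5sum < "$1" | cut -d' ' -f1)
    size=$(wc -c < "$1" | tr -d ' ')
    awk -v sum="$sum" -v size="$size" -v name="$3" '
        $1 == sum && $2 == size && $3 == name { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$2"
}

# pick_authenticated DIR
# Prints the file to authenticate: Packages.complete when present,
# otherwise Packages itself (the "no magic" fallback).
pick_authenticated() {
    if [ -r "$1/Packages.complete" ]; then
        echo "$1/Packages.complete"
    else
        echo "$1/Packages"
    fi
}
```

A call might look like
check_md5 "$(pick_authenticated .)" Release main/binary-i386/Packages
and would come after the signature check, which could be done with
"gpg --verify Release.gpg Release".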
So it's the responsibility of the CD builder (or partial-mirror maintainer)
to have a Packages.complete where needed; APT will seamlessly ignore it if not
present, just like it does now with the small binary-*/Release files.
Is this description clear enough?
If so, are there any reasons why this scheme wouldn't work?
The genuine_subset() is very fast and efficient when Packages is sorted
the same as Packages.complete (which is nicely sorted alphabetically now),
because a single sequential pass over both files is then enough.
Below are two small shell scripts, one to build the Packages file given the
Packages.complete and a list of packages to be included, and the other
actually implementing genuine_subset(). They already work nicely (and can
also be useful on non-Debian systems), but perl or C versions will be much,
much faster.
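
As an illustration of a faster variant, here is a hedged sketch that does the
subset test with awk in paragraph mode. Note that it is a different technique
from the sequential scan described above: it keeps every stanza of the
complete file in memory and looks each Packages stanza up, so it does not
depend on the sort order. The function name simply mirrors the pseudo-code.

```sh
#!/bin/sh
# genuine_subset PACKAGES COMPLETE
# Succeeds if every stanza of PACKAGES occurs, byte for byte, in COMPLETE.
# Diagnostics go to stderr. If both arguments name the same file, the
# check trivially succeeds.
genuine_subset() {
    awk '
        BEGIN { RS = "" }                  # paragraph mode: one record per stanza
        FILENAME == ARGV[1] { seen[$0] = 1; next }   # first file: COMPLETE
        !($0 in seen) {
            split($0, line, "\n")
            print "not in complete list: " line[1]
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$2" "$1" >&2
}
```

For example, genuine_subset Packages Packages.complete returns 0 when
Packages only contains unmodified entries from Packages.complete.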
> If they aren't then it does some equally
> magical thing that has all kinds of opportunity to introduce flaws from
> the user's perspective - 'It is listed in Packages but APT won't use it!!
> It's a bug!!!'
In the scheme I'm describing, the Packages file only lists packages that are
actually available (just like it does now). Always. No exceptions.
(=> apt-cdrom/apt-get update do _not_ have to stat every .deb file)
It's the Packages.complete that _may_ (if present) list packages that are not
available. Okay, this might be confusing to the end user, but we need to have
that file anyway because we want to authenticate it, right? But another name
might reduce confusion, like
Packages.of.which.only.a.subset.is.available.here
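
To show what the difference between the two files means in practice, here is
a small sketch that prints the packages named in Packages.complete but not in
Packages, i.e. the ones that are authenticated but not available on this
medium. The function names are made up for this example.

```sh
#!/bin/sh
# package_names FILE: print the sorted, unique package names in FILE
package_names() {
    sed -n 's/^Package: //p' "$1" | LC_ALL=C sort -u
}

# not_available_here PACKAGES COMPLETE
# Prints the names listed in COMPLETE that are missing from PACKAGES.
not_available_here() {
    names_here=$(mktemp) && names_all=$(mktemp) || return 1
    package_names "$1" > "$names_here"
    package_names "$2" > "$names_all"
    # comm -13 keeps only the lines unique to the second file
    LC_ALL=C comm -13 "$names_here" "$names_all"
    rm -f "$names_here" "$names_all"
}
```

Running not_available_here Packages Packages.complete on a CD would list
exactly those packages a user must not expect to find there.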
Regards,
Anne Bezemer
---------------------- make-Packages-subset
#! /bin/sh
# USAGE: make-Packages-subset list < Packages-all > Packages
#
# To be run during CD image creation instead of dpkg-scanpackages
#
# "list" must hold one package name per line.
#
# Takes about 1 min 45 sec on K6/450 for realistic list
IFS=" "
while read -r PACKHEADER PACKNAME ; do
  if [ "$PACKHEADER" != "Package:" ] ; then
    echo "Missing Package: header on standard input, exiting..." 1>&2
    exit 1
  fi
  # Match the whole line literally (-x -F), so that "libc6" does not also
  # select "libc6-dev" and a "." in a name is not taken as a wildcard
  if grep -qxF -e "$PACKNAME" "$1" ; then
    # Package must be included, copy up to next empty line
    PLINE="$PACKHEADER $PACKNAME"
    while [ -n "$PLINE" ] ; do
      echo "$PLINE"
      IFS="" read -r PLINE
    done
    echo ""
  else
    # Package must be omitted, skip up to next empty line
    PLINE="whatever"
    while [ -n "$PLINE" ] ; do
      IFS="" read -r PLINE
    done
  fi
done
----------------------
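
The list argument of make-Packages-subset has to come from somewhere. One
possible way, sketched here under the assumptions that the list holds one
bare package name per line and that the .deb files on the CD follow the usual
name_version_arch.deb naming, is to derive it from the file names. The
function name is hypothetical.

```sh
#!/bin/sh
# list_from_debs DIR
# Prints one package name per line for every .deb file found under DIR.
list_from_debs() {
    # Strip the directory part, then everything from the first "_" on
    find "$1" -type f -name '*.deb' | sed 's|.*/||; s|_.*||' | LC_ALL=C sort -u
}
```

The output could be saved to a file and passed as the first argument of
make-Packages-subset.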
---------------------- check-Packages-subset
#! /bin/sh
# USAGE: check-Packages-subset Packages Packages-all
#
# To be run anytime user wants to.
#
# Takes about 25 sec on K6/450 for realistic example
IFS=""
(
  # Read next package in Packages
  while read -r PACKFIELD ; do
    # Search package in Packages-all
    PACKFIELDALL=""
    while [ "$PACKFIELD" != "$PACKFIELDALL" ] ; do
      if ! read -r PACKFIELDALL <&3 ; then
        echo "Error: extra package in $1 ($PACKFIELD)" 1>&2
        exit 1
      fi
      if [ "$PACKFIELD" != "$PACKFIELDALL" ] ; then
        # Skip other fields
        PLINEALL="whatever"
        while [ -n "$PLINEALL" ] ; do
          read -r PLINEALL <&3
        done
      fi
    done
    # Both files now at same package, Package: header checked already.
    # Check description line by line
    PLINE=something
    PLINEALL=else
    while [ -n "$PLINE" ] || [ -n "$PLINEALL" ] ; do
      read -r PLINE
      read -r PLINEALL <&3
      if [ "$PLINE" != "$PLINEALL" ] ; then
        echo "Error: mismatch in $1 ($PACKFIELD)" 1>&2
        exit 1
      fi
    done
    # Both files now at empty line after package
  done
  echo "Packages file $1 is a genuine subset of $2."
  exit 0
) <"$1" 3<"$2"
----------------------