Re: Bug#93612: Support for new archive structure

To: Jason Gunthorpe <jgg@debian.org>
Cc: debian-cd@lists.debian.org, Anthony Towns <aj@azure.humbug.org.au>
Subject: Re: Bug#93612: Support for new archive structure
From: "J.A. Bezemer" <costar@panic.et.tudelft.nl>
Date: Wed, 9 May 2001 01:14:41 +0200 (CEST)
Message-id: <Pine.LNX.3.96.1010509000346.13079B-100000@panic.et.tudelft.nl>
In-reply-to: <Pine.LNX.3.96.1010414140715.414B-100000@wakko.deltatee.com>

[Sorry for the terrible delay. I'm completely overloaded and am also trying
to do my thesis work at the same time, so I'm only now catching up with the
older discussions.
Actually, I had scheduled some time yesterday to spend on this, and was
planning to write a large, fundamental document covering various
perspectives, but then the DHCP server of my cable ISP broke down and whatever
I tried on my end, I couldn't get any packet through. So much for my planning.
Maybe I'll do the "big work" later (if ever..); for now, just one aspect.]

On Sat, 14 Apr 2001, Jason Gunthorpe wrote:
> On Sat, 14 Apr 2001, J.A. Bezemer wrote:

> > Generalizing: I think the authentication sequence should be like this for
> > _any_ medium (cdrom,file,ftp,http,whatever):
> > 
> >   Release/Release.gpg
> >     |
> >   Packages.complete
> >     |
> >   Packages
> >     |
> >   .deb file
> > 
> > with the special case that Packages.complete may be omitted if it is
> > exactly the same as Packages. I know this is a "radically" different way to
> 
> Yes, it will wave its magic wand and determine that Packages.complete and
> Packages are in fact, the same.

I think I didn't make myself very clear. There's no magic involved at all.
On the Debian FTP mirrors, we have the following files, just like we have
now:
  Release(.gpg)
  Packages
  *.deb

On CDs, we have
  Release(.gpg)
  Packages.complete   (or whatever name)
  Packages
  *.deb

But instead of making the CDs a special case, we treat the FTP sites as the
special case: the one where Packages.complete would be identical to Packages
and is therefore omitted. What I mean is that we have (i.e. APT has) this
pseudo-code for _all_ access methods:

  rel_fd = open("Release");
  relgpg_fd = open("Release.gpg");
  if (rel_fd == -1 || relgpg_fd == -1) { /* sorry, can't check */ exit(-1); }

  if (!gpg_check(rel_fd, relgpg_fd)) { /* check failed! */ exit(-1); }

  pack_fd = open("Packages");
  if (pack_fd == -1) { /* sorry, no packages */ exit(-1); }
  packcompl_fd = open("Packages.complete");
  if (packcompl_fd == -1)
    packcompl_fd = pack_fd;   /* Magic? Don't think so.. */

  if (!check_md5(packcompl_fd, rel_fd)) { /* check failed! */ exit(-1); }

  if (!genuine_subset(pack_fd, packcompl_fd)) { /* check failed! */ exit(-1); }

  add_to_available_packages(pack_fd);

With open() substituted by the method's specific file access function, like
download_via_HTTP_to_/var/state/apt/lists/_and_open() for the HTTP
method.

So it's the responsibility of the CD builder (or partial-mirror maintainer)
to have a Packages.complete where needed; APT will seamlessly ignore it if not
present, just like it does now with the small binary-*/Release files.
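
To make the fallback step a bit more concrete, here is a rough shell sketch
of the two non-GPG parts of the sequence above: deciding which file plays the
role of Packages.complete, and looking up the checksum that Release records
for it. The function names (pick_complete, expected_md5, check_md5) are made
up for illustration and are not APT code. The sketch also assumes that
Release has an "MD5Sum:" section with lines of the form
" <md5> <size> <path>", and that md5sum is installed.

```shell
# Hypothetical helpers, not part of APT.

pick_complete() {
  # $1 = directory that holds Packages (and perhaps Packages.complete).
  # Prints the file to check against Release; fails if Packages is missing.
  [ -r "$1/Packages" ] || return 1
  if [ -r "$1/Packages.complete" ] ; then
    echo "$1/Packages.complete"
  else
    echo "$1/Packages"        # FTP-style case: Packages is the complete list
  fi
}

expected_md5() {
  # $1 = Release file, $2 = path as listed in Release.
  # Assumes an "MD5Sum:" section of " <md5> <size> <path>" lines.
  awk -v want="$2" '
    /^MD5Sum:/           { in_md5 = 1; next }
    /^[^ ]/              { in_md5 = 0 }
    in_md5 && $3 == want { print $1; exit }
  ' "$1"
}

check_md5() {
  # $1 = file, $2 = expected hex digest (assumes md5sum is available).
  actual=$(md5sum < "$1" | cut -d' ' -f1)
  [ -n "$2" ] && [ "$actual" = "$2" ]
}
```

A caller would run something like
check_md5 "$(pick_complete .)" "$(expected_md5 Release main/binary-i386/Packages)"
after the GPG check on Release has passed; the path is only an example.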

Is this description clear enough?
If so, are there any reasons why this scheme wouldn't work?

The genuine_subset() check is very fast and efficient when Packages is sorted
the same way as Packages.complete (which is nicely sorted alphabetically now),
because a single forward pass over both files is then enough.
Below are two small shell scripts: one builds the Packages file given the
Packages.complete and a list of packages to be included, and the other
actually implements genuine_subset(). They already work nicely (and can
also be useful on non-Debian systems), but Perl or C versions would be much,
much faster.

>    If they aren't then it does some equally
> magical thing that has all kinds of opportunity to introduce flaws from
> the user's perspective - 'It is listed in Packages but APT won't use it!!
> It's a bug!!!'

In the scheme I'm describing, the Packages file only lists packages that are
actually available (just like it does now). Always. No exceptions.
(=> apt-cdrom/apt-get update do _not_ have to stat every .deb file)

It's the Packages.complete that _may_ (if present) list packages that are not
available. Okay, this might be confusing to the end user, but we need to have
that file anyway because we want to authenticate it, right? But another name
might reduce confusion, like
Packages.of.which.only.a.subset.is.available.here

Regards,
  Anne Bezemer

---------------------- make-Packages-subset
#! /bin/sh

# USAGE:  make-Packages-subset list < Packages-all > Packages
#
# To be run during CD image creation instead of dpkg-scanpackages
#
# Takes about 1 min 45 sec on K6/450 for realistic list

IFS=" "

while read -r PACKHEADER PACKNAME ; do
  if [ "$PACKHEADER" != "Package:" ] ; then
    echo "Missing Package: header on standard input, exiting..." 1>&2
    exit 1
  fi

  # -x -F: match the whole line literally, so "apt" does not select "apt-utils"
  if grep -qxF -- "$PACKNAME" "$1" ; then
    # Package must be included, copy up to next empty line
    PLINE="$PACKHEADER $PACKNAME"
    while [ -n "$PLINE" ] ; do
      echo "$PLINE"
      IFS="" read -r PLINE
    done
    echo ""

  else
    # Package must be omitted, skip up to next empty line
    PLINE="whatever"
    while [ -n "$PLINE" ] ; do
      IFS="" read -r PLINE
    done
  fi

done
----------------------
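
For comparison, a rough single-pass variant of the script above, using awk
instead of one grep call per package. This is only a sketch under the same
assumptions as the script (the list has one package name per line, and each
stanza starts with a "Package: " line and ends with an empty line). The
function name make_packages_subset is made up for illustration.

```shell
make_packages_subset() {
  # $1 = list of package names, one per line; stdin = complete Packages file
  awk -v list="$1" '
    BEGIN        { while ((getline name < list) > 0) want[name] = 1 }
    /^Package: / { keep = (substr($0, 10) in want) }   # name follows "Package: "
    /^$/         { if (keep) print ""; keep = 0; next } # end of stanza
    keep         { print }
    END          { if (keep) print "" }  # last stanza had no trailing empty line
  '
}
```

Usage would be the same as for the script:
make_packages_subset list < Packages-all > Packages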
---------------------- check-Packages-subset
#! /bin/sh

# USAGE:  check-Packages-subset Packages Packages-all
#
# To be run anytime user wants to.
#
# Takes about 25 sec on K6/450 for realistic example

IFS=""

(
  # Read next package in Packages
  while read -r PACKFIELD ; do

    # Search package in Packages-all
    PACKFIELDALL=""
    while [ "$PACKFIELD" != "$PACKFIELDALL" ] ; do
      if ! read -r PACKFIELDALL <&3 ; then
        echo "Error: extra package in $1 ($PACKFIELD)" 1>&2
        exit 1
      fi
      if [ "$PACKFIELD" != "$PACKFIELDALL" ] ; then
        # Skip other fields
        PLINEALL="whatever"
        while [ -n "$PLINEALL" ] ; do
          read -r PLINEALL <&3
        done
      fi
    done

    # Both files now at same package, Package: header checked already.

    # Check the remaining fields line by line
    PLINE=something
    PLINEALL=else
    while [ -n "$PLINE" ] || [ -n "$PLINEALL" ] ; do
      read -r PLINE
      read -r PLINEALL <&3

      if [ "$PLINE" != "$PLINEALL" ] ; then
        echo "Error: mismatch in $1 ($PACKFIELD)" 1>&2
        exit 1
      fi
    done

    # Both files now at empty line after package

  done

  echo "Packages file $1 is a genuine subset of $2."
  exit 0
) <"$1" 3<"$2"
----------------------
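
The same subset check can probably be done much faster by letting awk read
whole stanzas at a time. The following is a minimal sketch, assuming (as the
script above does) that both files list their packages in the same order and
separate stanzas by empty lines. The function name check_packages_subset is
made up for illustration; it reports the result only through its exit status.

```shell
check_packages_subset() {
  # $1 = Packages (subset), $2 = Packages-all
  # Returns 0 if every stanza of $1 appears verbatim, in order, in $2.
  awk -v all="$2" '
    BEGIN { RS = "" }                      # paragraph mode: one stanza per record
    {
      found = 0
      # Advance through Packages-all until the same stanza turns up
      while ((getline stanza < all) > 0) {
        if ((stanza "") == ($0 "")) { found = 1; break }
      }
      if (!found) { bad = 1; exit }        # extra, changed or out-of-order stanza
    }
    END { exit bad + 0 }
  ' "$1"
}
```

Because Packages-all is never rewound, a stanza that is missing, modified or
out of order in Packages makes the check fail, just as in the script.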
