
Re: RFC interaction with external dependency solver: "APT" state



[ sorry for the delay ]

On Wed, May 19, 2010 at 07:57:08PM +0200, David Kalnischkies wrote:
> 2010/5/19 Stefano Zacchiroli <zack@debian.org>:
> > I see a solution that still preserves the "one pass" approach, not
> > adding any need for ping-pong between package manager and solver. The
> > idea is that in the answer from the solver, you get package stanzas as
> > follows:
> >
> >  package: foo
> >  version: 1.2.3-4
> >  id: adgf31452135hkashdfa
> >  installed: yes
> 
> So let us assume the user has installed <awesome,1,Packages-1>.
> In the next request, which id will <awesome,1,Packages-1> have
> compared to <awesome,1,status-file>?
> Are these versions merged or not?

Right, this is the question :) The underlying problem is that, AFAICT,
dpkg does not preserve any kind of information on the *origin* of the
packages that get installed. Is that correct? It seems to me that the
proper solution would be to have an ID in the dpkg metadata database
which can be cross-referenced with APT lists. Not that I think we
should/can fix that for the purposes being discussed here, but I'm
dwelling a bit more on the analysis just to be sure that I've hit the
nail on the head.

I guess that for packages coming from APT, apt can in principle invoke
dpkg with an extra cmdline argument specifying, for each package, its
MD5 sum (or equivalent), telling dpkg to store it. That would solve 99%
of the occurrences of this problem, I guess. Packages without an
assigned checksum would be, at worst, as they are now, i.e. impossible
to cross-reference with their external origin. Are you aware of any bug
report on dpkg about that? (I've skimmed through the list w/o finding
anything apparently relevant thus far.)


If that is the case (i.e. dpkg keeps no origin information), the only
solution is that ids are completely determined by the available package
metadata (and in particular by those metadata that will land in
/var/lib/dpkg/status); that is the only way we can later on recognize an
installed package as coming from a given package list. A hash, which as
I understand it is what APT currently implements (no wonder :)), is a
particular case of that.
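
To make the "id determined by metadata" idea concrete, here is a minimal
sketch. It is not APT's actual hashing code: the field selection
(ID_FIELDS), the use of SHA-1 and the stanza_id helper are all
assumptions of mine, chosen only because those fields appear both in
APT lists and in the status file.

```python
import hashlib

# Hypothetical field choice: only fields present both in APT lists and in
# /var/lib/dpkg/status, so that both sides can compute the same id.
ID_FIELDS = ("Installed-Size", "Pre-Depends", "Depends")

def stanza_id(stanza):
    """Derive an id from a package stanza given as a dict of fields."""
    h = hashlib.sha1()
    for field in ID_FIELDS:
        # Normalise whitespace so purely cosmetic differences do not matter.
        value = " ".join(stanza.get(field, "").split())
        h.update(f"{field}:{value}\n".encode("utf-8"))
    return h.hexdigest()

# The same package as seen in a Packages list and in the status file.
from_list = {"Package": "awesome", "Version": "1",
             "Installed-Size": "120", "Depends": "libc6 (>= 2.7)",
             "Filename": "pool/main/a/awesome/awesome_1_all.deb"}
from_status = {"Package": "awesome", "Version": "1",
               "Status": "install ok installed",
               "Installed-Size": "120", "Depends": "libc6  (>= 2.7)"}
# A local rebuild with the same version string but different dependencies.
rebuilt = dict(from_status, Depends="libc6 (>= 2.7), libfoo1")
```

With such a scheme the list entry and the installed entry get the same
id, while the local rebuild gets a different one even though its version
string is identical.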

> > The idea is that "id" is an optional property (defaulting to "") and
> > that the triple <package, version, id> uniquely identify a package for
> > APT; i.e.: it will be able to discriminate among multi-arch,
> > locally-rebuilt packages, and packages coming from different APT lists.
> 
> Just to be sure, we still talk about all package managers in this
> thread here - or is it really about APT alone? I tried to be relatively
> generic until now… - you later say "friends", this could mean
> rev-depends like aptitude and co. - but also smart/apt2/cupt/…
> I at least hope we are not deadly enemies, but friends… ;)

Yes, absolutely, I'm still talking about all such friends, even though
I'm a bit worried too by the lack of reaction from others :-)

> Also, i don't see why multi-arch is in the same list as locally-rebuilt
> packages. They are completely different problems:

Yes, they are, but from the point of view of how to uniquely identify
packages, the ID scheme I've been proposing addresses both scenarios. I
didn't mean to imply anything more than that.

> While the Packages files have checksums of the deb files the status file
> has not. As i said already APT tries to "fix" this by hashing installsize and
> the list of dependencies, so a version in APT is <numberstring, hashvalue>
> full info e.g. in #574956 and #574072 in which you can also see what
> happens if two version merger disagree (in this case human brain vs. APT)
> as well as this situations are not completely academic…

So, it seems to me that a sane solution would be to:

1) accept that the world is evil, and use a hash-based solution as the
   sole trustworthy package unique id (still in "my" sense though,
   i.e. package ids will be triples <name, version, id>; then, in
   addition, you'll have the side effect that id is "your" hash, which
   incidentally is enough to uniquely identify the package)

2) agree on a fixed way, shared by everyone interested in using the
   external solver API, to compute that hash out of the metadata
   available in APT lists (which AFAICT are anyhow a common ground for
   all concerned package managers)

Would you consider that acceptable and not too constraining wrt the
actual implementation of the hashing?

If yes, when receiving the output of the solver, you will just project
it to a list of package hashes that the solver tells you must be
installed in a satisfactory solution.  Out of that, I presume the solver
will already have a lookup function from hashes to actual packages.


Note that this does not mandate the use of hashes: if other package
managers trust the name/version pair enough, they can simply ignore the
id.
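
For illustration, the projection step could look roughly like the sketch
below. It assumes the solver answer is a sequence of blank-line-separated
stanzas with the lowercase fields used in the example above (package,
version, id, installed); parse_stanzas, to_install and the sample data
are hypothetical names of mine, not part of any existing API.

```python
import re

def parse_stanzas(text):
    """Split a solver answer into a list of {field: value} dicts."""
    stanzas = []
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in block.splitlines():
            # Split on the first ':' only, so epochs like "1:2.3" survive.
            key, _, value = line.partition(":")
            fields[key.strip().lower()] = value.strip()
        if fields:
            stanzas.append(fields)
    return stanzas

def to_install(answer):
    """Project the answer to the set of <package, version, id> triples
    that the solver wants installed; a missing id defaults to ""."""
    return {(s["package"], s["version"], s.get("id", ""))
            for s in parse_stanzas(answer)
            if s.get("installed") == "yes"}

answer = """package: foo
version: 1.2.3-4
id: adgf31452135hkashdfa
installed: yes

package: bar
version: 2.0-1
installed: no
"""

selected = to_install(answer)
```

A package manager that trusts name/version alone can drop the third
component of each triple; one that relies on hashes can use it as the
key of its own lookup table.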

Cheers.

-- 
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
zack@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..|  .  |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...........| ..: |.... Je dis tu à tous ceux que j'aime
