Re: RFC: The Future of Solving Dependency Problems in APT (SAT, CUDF)
On Do, 2010-12-23 at 13:47 +0100, David Kalnischkies wrote:
> Ho, ho, ho all :)
>
> On Wed, Dec 22, 2010 at 18:32, Julian Andres Klode <jak@debian.org> wrote:
> > as discussed on IRC; the time is ready for APT to gain some more
> > advanced facilities in the area of solving dependency problems.
>
> (as said on IRC, i am a bit low on time, but some comments anyway)
>
>
> > The first file that I attach is solver.h. This outlines my proposal for
> > a new solver API. It contains the interface of an abstract base class
> > APT::Solver that provides an interface for solving changes, inspired by
> > discussions with some members of the community.
>
> Passing a depCache into an abstract resolver seems a bit strange as
> it is the key part of the internal APT resolver including a lot of stuff which
> will properly not be used by another resolver (okay, aptitude, but picosat?),
> e.g. State of dependencies, autoremove flags and so on.
>
> What we want is exporting only the pkgCache to the resolver to save them
> from the hassle of parsing the Packages files on their own. Resolvers who
> can handle a pkgCache will be happy, others need an intermediate level to
> reformat it to some other format (e.g. {C,D}UDF).
> (thinking of pkgCache as a really big {Package,Version}Set)
>
> Further more, i would like to have a Hold(VersionSet) to tell the resolver
> e.g. about dpkg holds and maybe even a Keep(PackageSet) to tell the
> resolver to never remove this set of packages (but upgrade is fine) which
> could be handy for essentials and co.
We do not really need hold, that's completely identical to marking the
package for installation, although it might be useful just for the
semantics. Does Keep(PackageSet) work with multi-arch where an install
pkg:all becomes architecture-dependent? I merged the changes into the
header.
>
> In return i would expect a set of versions of packages to install and to
> remove.
That's what Solve() gives you.
> Solver::Rules is a mystery for me. Most complains about the resolver in
> APT evolve from the fact that it doesn't switch candidate versions if it
> has to (the popular car v2 with wheels v3 and v2 example comes to mind).
> How to transport the meaning of candidate after all?
> For me, candidate is an (important) implementation detail of the APT resolver.
> Another resolver should be able to choose freely if he wants to put on wheel
> v3 or v2 and therefore doesn't need a concept of candidate version.
I removed all options except for the candidate one (which I switched,
it's IGNORE_CANDIDATE_STATUS now), and the one for ignoring conflicts;
as the latter one is really useful for building Debian images - for
example, debimg has it's own solver mainly in order to ignore conflicts.
>
> I would say that every version i tell the solver about should be a candidate
> for him to choose from -- why should i tell him about versions i don't want
> in the first place after all? So in the end pkgCache is maybe the wrong
> choice and we should give the solver a (big) VersionSet as problem space.
No. VersionSet costs too much space from my experience. That's one
reason why I use mostly sets of pointers in my sat-solver prototype
(made multiple MB difference or something like that) - possibly because
it's a binary tree and allocates empty iterator objects for unused
nodes.
> The problem with these multiple candidates would be that while an experimental
> package can be a candidate, if at all possible stable should be preferred
> which leads us to think about how we exposure pkgPolicy to the resolver world.
> Wasn't in an old thread yet-another-description language mentioned for
> exactly this task?
The current prototype prefers candidate versions of packages over
non-candidate versions. That is a sane choice in my opinion. I chose to
use pkgDepCache because it brings a cache and a GetCandidateVer(), but
we don't really need it and can use:
Solver(pkgCache &cache, pkgPolicy &policy, unsigned int rules = 0);
instead.
>
> > It shall be the base for two new classes: SatSolver and CudfSolver. The
> > first class implements a solver based on the boolean satisfiability
> > problem, the second one an interface to an external solver with data
> > exchange happening via the CUDF format[1]. We'd also be willing to merge
> > aptitude's solver into this new infrastructure if useful.
>
> I would divide Solvers into Internal (i can pass pkgCache/VersionSet directly
> into it) and External (i need to reformat pkgCache/VersionSet) and subclass
> it for {D,C}UDF and maybe other formats of other resolvers.
> After all, how a solver solves isn't really interesting for the implementation
> of an interface to it - and hard to define or in what category belongs
> APT/aptitude?
All solvers need to reformat the problem. The CUDF solver transfers it
to CUDF, the SAT solvers translate everything to CNF. Both takes more
time than solving the problem itself (200 ms for the picosat solver). I
don't think we need to differentiate between Internal and External
solvers within the class hierarchy.
> Beside that a CUDFSolver could be a SATSolver internally (noone forbids
> writing a picosat solver based on cudf data)…
I know, I want to write a CUDF solver using picosat. But a CUDF solver
is named CudfSolver because it talks using CUDF.
> > The second attachment, sat-solver.cc, is a proof of concept stand-alone
> > SAT solver on APT, utilizing the SAT solver 'picosat' for the actual
> > solving process. The version of 'picosat' in Debian is built without
> > trace support, if you link the program against a version with trace
> > support, it can tell which dependencies could not be satisfied (but it
> > requires a bit more memory then, mostly for storing which clause belongs
> > to which dependency).
>
> If i see it correctly your proof of concept doesn't use the depCache -
> beside for the sake of simulating more what APT does - so that would
> be a proof by experiment for my theory above. :)
It uses GetCandidateVer(), but that just forwards the call to pkgPolicy,
so yes, it's useless.
>
> Integrating the proof into apt-get directly should be relatively simple but i
> would like to see the abstraction to be in first so that APT doesn't need to
> be linked against picosat (and all the others which will come maybe).
> Some sort of plugin system maybe…
> The version system (and a bit of other stuff) in APT has some sort of
> built-in plugin system, if we could make that more dynamic…
> (yes, i am dreaming a bit now)
Dynamically loading solvers is a bit more complicated and it's easier to
link it in. We still have external solvers using CUDF for other cases.
>
>
> > I also have a proof of concept CUDF writer written in Python, but
> > decided that we do not really need it now (it only writes the universe
> > to stdout, no request). And yes, pure CUDF, no Debian-specific version
> > thereof.
>
> You forgot in your implementation that Debian has a lot of implicit
> dependencies (or conflicts if you like) as it doesn't allow two different
> version of the same package to be installed (minor, but to consider)
> and i think there were other obstacles mentioned in the older threads
> which make it helpful at least to use a debian specific format if possible,
> but i am not sure and again i can't find it right now.
Self-conflicts are added in lines 93-94. I took care of all things
listed at <http://www.mancoosi.org/cudf/debian/>.
>
>
> > If there are no objections, I will create a new branch, and implement a
> > SatSolver using picosat therein (as that's fairly easy and produces good
> > results). I would also like to create the CudfSolver, but I'd need a
> > solver to test with. If anyone knows such solvers, please write.
>
> Regarding picosat, feel free to go on, i am interested to see what will
> happen and it could "simplify" abstraction if we have two solvers to
> "hide" under a common interface instead of just the APT beast as we
> have only one try to get it right before the front-ends will (ab)use it…
>
> On the other hand, i would like to claim CUDF for me as i am currently
> in the process of setting up a stay at "mancoosi headquarter" around
> April to work on teaching APT how to read/write CUDF and doing
> something meaningful in between. :)
I thought mancoosi ends in February - the website states "The project
has started February 1st, 2008, and will have a duration of 3 years.".
Would you license your solution under the LGPL-2.1+ then, so I can
incorporate it in the APT2 project as well? OK, I already have most of
what I need locally, but I like sharing stuff.
> After all an interesting year 2011 is upcoming, but until then:
> Merry package management days… -uh, i mean Christmas :)
I am attaching the updated proposal. Maybe we should make this whole
class internal as APT::SolverImpl and expose a real class APT::Solver in
the public API that creates an APT::SolverImpl and forwards method calls
to it - that's very efficient in terms of ABI stability, and we would
get a nice constructor like:
Solver(string name, pkgCache cache, pkgPolicy policy);
(replace objects with constant references to them)
--
Julian Andres Klode - Debian Developer, Ubuntu Member
See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/.
/* solver.h - Base class for dependency problem solvers in APT.
*
* Copyright (C) 2010 Julian Andres Klode <jak@debian.org>
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation files
* (the "Software"), to deal in the Software without restriction,
* including without limitation the rights to use, copy, modify, merge,
* publish, distribute, sublicense, and/or sell copies of the Software,
* and to permit persons to whom the Software is furnished to do so,
* subject to the following conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*/
class pkgDepCache;
namespace APT {
class VersionSet;
/**
* \class APT::Solver
* \brief Abstract base class for new-style dependency problem solvers.
*
* This base class defines the interface for the new dependency problem
* solvers introduced in APT 0.9. It works by preparing sets of changes,
* passing the set of changes to the solver, and then calling Solve() to
* solve the problem.
*/
class Solver {
public:
/**
* \enum Flag
* \brief Flags that change the behavior of solvers.
*
* These flags remove constraints from the solver and may allow
* it to produce (better) solutions under some circumstances.
*/
enum Flag {
/**
* \brief Allow the installation of non-candidates by default.
*
* By default, only candidates are installed automatically, all
* other solutions require a confirmation. This flag allows the
* solver to choose non-candidate versions automatically without
* prior confirmation.
*/
IGNORE_CANDIDATE_STATUS = (1 << 0),
/**
* \brief Ignore dependencies like "Breaks" or "Conflicts".
*
* This flag is useful for building sets of packages that are
* distributed together, such as installation discs.
*/
IGNORE_NEGATIVE_DEPENDENCIES = (1 << 4)
};
/**
* \brief Creates new solver.
*
* Creates a new solver. Please note that cache should remain the
* same for the lifetime of the solver, as the solver may cache
* stuff, such as the conversion from APT to its native format.
*
* \warning Only one instance of Solver may exist at the same
* time.
* \param cache The cache describing the package universe.
* \param policy The policy describing package pins.
*/
Solver(const pkgCache &cache, const pkgPolicy &policy);
/**
* \brief Gets the flags for the solver.
*
* \return The flags, an OR combination of APT::Solver::Flag values.
*/
unsigned int GetFlags() const;
/**
* \brief Sets the flags for the solver.
*
* \param flags An OR combination of values of the APT::Solver::Flags enum.
*/
void SetFlags(unsigned int flags);
/**
* \brief Marks all versions in the set for installation.
*
* Marks the given version for installation if they are not
* installed, or tells the solver to prevent their removal
* if they are installed.
*
* \param install A set of versions that shall be installed.
*/
virtual bool Install(const VersionSet &install) = 0;
/**
* \brief Marks all versions in the set for removal.
*
* Marks the given version for removal if they are installed,
* or tells the solver to prevent their installation if they
* are not installed.
*
* \param remove A set of versions that shall not be installed.
*/
virtual bool Remove(const VersionSet &remove) = 0;
/**
* \brief Prevents removal of the given packages.
*
* Prevents the removal of the given packages, and is thus similar
* to calling Install() on the installed versions, with the exception
* that upgrades are still allowed.
*
* \param keep A set of packages that shall not be removed.
*/
virtual bool Keep(const PackageSet &keep) = 0;
/**
* \brief Prevents changes of the given package versions.
*
* Causes the solver to not change the given package versions, and
* is thus likely to do the same thing as calling Install() on all
* installed versions and Remove() on all not-installed versions in
* the set.
*
* \param hold A set of versions that shall not be changed.
*/
virtual bool Hold(const VersionSet &hold) = 0;
/**
* \brief Tells the solver to mark all upgradable packages for upgrade.
*
* Marks every installed package for which a newer candidate version
* exists for upgrade. To upgrade only specific packages, pass their
* candidate versions to Install().
*/
virtual bool Upgrade() = 0;
/**
* \brief Finds a solution to the problem and returns the needed changes.
*
* Finds a solution to the problem if possible, and then returns such
* solution in the sets given to the function by reference. After the
* return, the solver removes all knowledge about the jobs and may be
* used again, as if it was freshly created.
*
* This method may also be used to run an incremental solving process,
* by passing the result of install to Install() and the result of
* remove to Remove(), doing some more changes and then calling
* Solve() again. Doing this on external solvers is not recommended,
* as this way of operation is too slow in that case.
*
* \param install A reference to a set that will contain all
* versions that shall be installed on the system.
* \param remove A reference to a set that will contain all
* versions that shall be removed from the system.
* \param automatic A reference to a set that will be a subset of
* the union of install and remove listing only
* those changes that are not requested by the user.
*/
virtual bool Solve(VersionSet &install, VersionSet &remove,
VersionSet &automatic) = 0;
protected:
/**
* \brief The cache containing the universe
*
* This cache defines the current status of the system ("the universe")
* and is read by the solver.
*/
const pkgCache &depcache;
/**
* \brief Package pinning
*
* This policy reference allows the solver to respect the pins
* the user set.
*/
const pkgPolicy &policy;
/** \brief An OR combination of values from APT::Solver::Flag. */
unsigned int flags;
};
}
Reply to: