Re: RFC: implementation of package pools

To: Anthony Towns <aj@azure.humbug.org.au>
Cc: Debian Developers <debian-devel@lists.debian.org>
Subject: Re: RFC: implementation of package pools
From: Eray Ozkural <erayo@cs.bilkent.edu.tr>
Date: Mon, 23 Oct 2000 20:36:46 +0300
Message-id: <39F4772E.315B462A@cs.bilkent.edu.tr>
Reply-to: erayo@cs.bilkent.edu.tr
References: <Pine.LNX.3.96.1001019133353.7396G-100000@wakko> <39EF5892.7AFCFB25@cs.bilkent.edu.tr> <20001020065938.D26097@azure.humbug.org.au> <39EF77B4.CD8D8468@cs.bilkent.edu.tr> <20001020090633.E26097@azure.humbug.org.au> <39F0A266.D515F875@cs.bilkent.edu.tr> <20001021072049.A11953@azure.humbug.org.au> <39F29AC7.489DB304@cs.bilkent.edu.tr> <20001023110318.B9170@azure.humbug.org.au> <39F3B9E4.A59D4A9E@cs.bilkent.edu.tr> <20001023170228.A12686@azure.humbug.org.au>

Anthony Towns wrote:
> 
> On Mon, Oct 23, 2000 at 07:09:08AM +0300, Eray Ozkural wrote:
> > No, I'm saying "does it support automatic upload and removal?"
> 
> Exactly to the extent that's desired, yes it does. As discussed in the
> second last mail of mine in this thread.

Excellent. Then people surely won't wait for 1 or 2 weeks before their
new packages appear in the archive.

> > it. You could at least tell me about the algorithm that you used;
> > did you do random-restart hill climbing, or do you resort to a heuristics?
> 
> The first thing you'll need to note is that it's *not* the SAT
> problem. There's a trivial "installable" situation that satisfies all the
> "givens": don't try to install anything. Once you install one thing, that
> invalidates a bunch of other statements, which it then tries to correct.
> 

Yeah, I noticed that. Also in many cases it seemed to me that the
sentence becomes something simple like
[notation: ! is negation, /\ conjunction, \/ disjunction ]

P /\ Q /\ !R .... /\ Z

(I know you don't like my ASCII art, but no logic symbols on the keyboard)

which is just conjunction of propositional symbols or their negations.

In this case, it's already known which values the symbols would take.

In many other cases we have simple conjunctive forms of the sort

(P \/ Q) /\ (S \/ !T) /\ X

in which there's little or no overlap of propositional symbols among
conjunctions. [makes the instance quite easy to solve]

and correct me if I'm wrong, but the propositional sentences are
unrestricted CNF (conjunctive normal form), so it really is SAT.
the sentences in the database might be easy instances on the other hand.

> It's largely brute force, with a few heuristics to make things
> quicker. But mostly there's just a large degree of care taken in the
> backtracking.
> 

Yea, I think you're using some stack of list-of-unexpanded notes.
When such things are done iteratively, it becomes a bit confusing
for a reviewer. I personally don't understand iterative backtracking
code I wrote some time ago. :) The preprocessor macros make it harder
to read, too. :) I now have some trust in optimizing compilers, and simply
write anything recursive really recursive.

> It's unknown what cases it's actually exponential for, if it's reasonable
> to expect those cases to occur within Debian, or if they can be easily
> avoided. It's fast enough for the moment (it process a weeks updates in a
> few minutes, and all the updates since potato released in half an hour or
> so), but that's mainly because the machine it's running on is obscenely
> fast, and it also seems significantly slower than it was in June.
> 

I see. Most of the sentences are easy instances anyway.

> > May I kindly request you to tell me whether this search is complete?
> > And does it perform good because average |clauses|/|proposition_symbols|
> > is ... low? I'd be pleased if you can point out what to check next.
> 
> Well, think about it: in the *general* case, most packages have fairly
> straightforward dependencies, and conflicts tend not to come into play
> too much.
> 
> The main speed problem is simply that there are 5000 packages on each
> of six different architectures, all of which interdepend to a greater
> or lesser extent. That's up to 30k SAT problems that need to be checked
> every time a new package is checked for inclusion in testing, and there
> are usually a dozen of those a day [0], plus any leftovers of which
> there are often lots.
> 

Hmm, but I'm not asking this. 

I agree that most of the instances are fairly trivial on the other hand.

So, I'm asking you this: does this algorithm converge to an exhaustive
search over all possible valuations of propositional letters if the
heuristics don't work well? Does it guarantee finding a solution?
[ie, you can write a search algorithm for this problem that hasn't
a good memory and keeps running in circles without ever halting]

A professor summarized how a complete A* algorithm should work like
 "Suppose that your heuristic is too optimistic and believes that
the solution is in the next node to expand, in that case A* would
converge to BFS or DFS that are complete." Something like that, I'd
liked this simple explanation a lot when I was confused about a
heuristic.

This would be the only current problem with the code if it isn't
complete.

If you're using a heuristic, there's the possibility of getting stuck
at local minima, which you should jump out of with some technique.
For this problem, random-restart is okay. 

> I benchmarked it some time ago, which is why it works at an adequate speed
> at the moment. IIRC, I had problems getting a profiler to see through
> the perl harness I'm using, so I haven't any recent, or particularly
> accurate data.
> 

I think that's fair enough. So it won't have serious performance flaws
for some time. If it does, then we can implement GSAT which might work
better for this problem. [Aren't we fascinated with NP-complete problems
all around us? :(]

> One thing we definitely favour is less talk, more action.
> 
> You can keep thinking up nifty projects 'til the cows come home, but no
> one really benefits from them until someone gets around to implementing
> them. And I can assure you that everyone else on this list is too busy
> with their own projects to magically do something you've thought up and
> talked about.
> 

No, I'm after a concrete thing for debian. I'll survey ontology and
text categorization methods, and then implement a set of tools that
make it easier to deal with categorization of text repositories such as
debian package descriptions. I plan to have a working version in 2 months.

This is what I would like to see most in debian, so I believe I have the
motivation for it.

> No, you've got the most well commented, up to date version right there in
> your browser. But for someone well versed in algorithmic theory, and who's
> happy to rewrite dpkg at the drop of a hat, it shouldn't be a bother.
> 

All right. I'm not bothered, I looked at the code even though I missed
my class today (I slept over it :) I'll be glad to help with anything,
if I may.

Thanks,

-- 
Eray (exa) Ozkural
Comp. Sci. Dept., Bilkent University, Ankara
e-mail: erayo@cs.bilkent.edu.tr
www: http://www.cs.bilkent.edu.tr/~erayo

Reply to:

Follow-Ups:
- Re: RFC: implementation of package pools
  - From: William Lee Irwin III <wli@holomorphy.com>
- Re: RFC: implementation of package pools
  - From: Anthony Towns <aj@azure.humbug.org.au>

References:
- Re: RFC: implementation of package pools
  - From: Jason Gunthorpe <jgg@ualberta.ca>
- Re: RFC: implementation of package pools
  - From: Eray Ozkural <erayo@cs.bilkent.edu.tr>
- Re: RFC: implementation of package pools
  - From: Anthony Towns <aj@azure.humbug.org.au>
- Re: RFC: implementation of package pools
  - From: Eray Ozkural <erayo@cs.bilkent.edu.tr>
- Re: RFC: implementation of package pools
  - From: Anthony Towns <aj@azure.humbug.org.au>
- Re: RFC: implementation of package pools
  - From: Eray Ozkural <erayo@cs.bilkent.edu.tr>
- Re: RFC: implementation of package pools
  - From: Anthony Towns <aj@azure.humbug.org.au>
- Re: RFC: implementation of package pools
  - From: Eray Ozkural <erayo@cs.bilkent.edu.tr>
- Re: RFC: implementation of package pools
  - From: Anthony Towns <aj@azure.humbug.org.au>
- Re: RFC: implementation of package pools
  - From: Eray Ozkural <erayo@cs.bilkent.edu.tr>
- Re: RFC: implementation of package pools
  - From: Anthony Towns <aj@azure.humbug.org.au>

Prev by Date: Re: Proposed: task-secure-system package
Next by Date: Re: Books to teach myself 'programming infrastructure'
Previous by thread: Re: RFC: implementation of package pools
Next by thread: Re: RFC: implementation of package pools
Index(es):
- Date
- Thread