Bug#645558: Improve britney2 auto-hinter

To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: Bug#645558: Improve britney2 auto-hinter
From: Niels Thykier <niels@thykier.net>
Date: Mon, 17 Oct 2011 01:14:34 +0200
Message-id: <20111016231434.16493.67594.reportbug@mikazuki.thykier.net>
Reply-to: Niels Thykier <niels@thykier.net>, 645558@bugs.debian.org

Package: release.debian.org
Severity: wishlist
User: release.debian.org@packages.debian.org
Usertags: britney

Hi,

I have written a patch to improve britney2's auto-hinter.  I noticed that britney2 does not quite cope with complex cases.

The tree-circle-dependencies-huge-graph-no-hint[1] test case demonstrates this issue.  The current britney cannot solve this if the "DEPTH" and "WIDTH" (in hooks/post-setup) are greater than 3.  With the patch it solved the the 20x20 variant[2].

The basic algorithm for the auto-hinter in this patch is to transform the graph of valid candidates into a DAG[3].  Using the DAG it will calculate which components must migrate together.

The runtime complexity of this algorithm should be in O(|V| + |E|)[4], where |V| is the number of validate candidates and |E| is the dependencies between them (vertexes and edges in the graph, respectively).

~Niels

[1] http://release.debian.org/~nthykier/britney-tests/t/tree-circle-dependencies-huge-graph-no-hint/

[2] In all my tests DEPTH and WIDTH were equal to each other.

[3] Using Strongly Connected Components, see

http://en.wikipedia.org/wiki/Directed_acyclic_graph#Relation_to_other_kinds_of_graphs

[4] I am assuming (on average) constant time hash-look up, append to lists, linear time iteration over hashes.  This should be a valid assumption. :)


-- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (1, 'experimental')
Architecture: i386 (x86_64)

Kernel: Linux 3.0.0-1-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_DK.UTF-8, LC_CTYPE=en_DK.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

>From c4f7c8b28cd0e1aa0bef74bf6dda749894b4cb05 Mon Sep 17 00:00:00 2001
From: Niels Thykier <niels@thykier.net>
Date: Sun, 16 Oct 2011 20:02:18 +0200
Subject: [PATCH] Improve auto-hinter to handle harder cases

The auto-hinter will first generate a directed acycle graph (DAG) out
of the valid candidates[1].  Using the DAG the auto-hinter will then
calculate which of the components must migrate together and generate
easy hints for those.

With this patch, britney2[2] can solve the "huge-graph" test case[3].

[1] Any graph can be condensated into a DAG by calculating the
strongly connected components and looking at the egdes between these
components.  See:

http://en.wikipedia.org/wiki/Directed_acyclic_graph#Relation_to_other_kinds_of_graphs

[2] Using --auto-hinter (and --compatiable)

[3] tree-circle-dependencies-huge-graph-no-hint
---
 britney.py |  137 +++++++++++++++++++++++++++++++++++++-----------------------
 1 files changed, 84 insertions(+), 53 deletions(-)

diff --git a/britney.py b/britney.py
index 351651a..822604f 100755
--- a/britney.py
+++ b/britney.py
@@ -2840,60 +2840,91 @@ class Britney:
         """
         self.__log("> Processing hints from the auto hinter", type="I")
 
-        # consider only excuses which are valid candidates
-        excuses = dict([(x.name, x) for x in self.excuses if x.name in self.upgrade_me])
-
-        def find_related(e, hint, circular_first=False):
-            if e not in excuses:
-                return False
-            excuse = excuses[e]
-            if e in self.sources['testing'] and self.sources['testing'][e][VERSION] == excuse.ver[1]:
-                return True
-            if not circular_first:
-                hint[e] = excuse.ver[1]
-            if len(excuse.deps) == 0:
-                return hint
-            for p in excuse.deps:
-                if p in hint: continue
-                if not find_related(p, hint):
-                    return False
-            return hint
-
-        # loop on them
-        candidates = []
-        for e in excuses:
-            excuse = excuses[e]
-            if e in self.sources['testing'] and self.sources['testing'][e][VERSION] == excuse.ver[1]:
-               continue
-            if len(excuse.deps) > 0:
-                hint = find_related(e, {}, True)
-                if isinstance(hint, dict) and e in hint and hint not in candidates:
-                    candidates.append(hint.items())
-            else:
-                items = [ (e, excuse.ver[1]) ]
-                for item, ver in items:
-                    # excuses which depend on "item" or are depended on by it
-                    items.extend( [ (x, excuses[x].ver[1]) for x in excuses if \
-                       (item in excuses[x].deps or x in excuses[item].deps) \
-                       and (x, excuses[x].ver[1]) not in items ] )
-                if len(items) > 1:
-                    candidates.append(items)
-
-        to_skip = []
-        for i in range(len(candidates)):
-            for j in range(i+1, len(candidates)):
-                if i in to_skip or j in to_skip:
-                    # we already know this list isn't interesting
+        candidates = dict([(x.name, x) for x in self.excuses if x.name in self.upgrade_me])
+        low = {}
+        stack = []
+        components = []
+        commap = {}
+        comdeps = {}
+        comrdeps = {}
+        hints = []
+        roots = []
+
+        # First we generate the strongly connected components in the
+        # graph using Tarjan's algorithm.  (Tarjan's is order O(|V| + |E|))
+        #
+        def visit (e, candidate):
+            if e in low: return
+            num = len(low)
+            low[e] = num
+            stack_pos = len(stack)
+            stack.append(e)
+
+            for suc in candidate.deps:
+                if not suc in candidates:
                     continue
-                elif frozenset(candidates[i]) >= frozenset(candidates[j]):
-                    # j is a subset of i; ignore it
-                    to_skip.append(j)
-                elif frozenset(candidates[i]) <= frozenset(candidates[j]):
-                    # i is a subset of j; ignore it
-                    to_skip.append(i)
-        for i in range(len(candidates)):
-            if i not in to_skip:
-                self.do_hint("easy", "autohinter", candidates[i])
+                visit(suc, candidates[suc])
+                low[e] = min(low[e], low[suc])
+
+            if num == low[e]:
+                component = tuple(stack[stack_pos:])
+                del stack[stack_pos:]
+                components.append(component)
+                for i in component:
+                    low[i] = len(candidates) + 1
+                    commap[i] = component;
+                comdeps[component[0]] = {}
+                comrdeps[component[0]] = {}
+
+        for e in candidates:
+            candidate = candidates[e]
+            # Skip if e is already in testing? (copy-wasted from the previous code)
+            if e in self.sources['testing'] and self.sources['testing'][e][VERSION] == candidate.ver[1]:
+               continue
+            visit(e, candidate)
+
+        # Now that the strongly connected components are generated, we
+        # generate a DAG using a single element from each component as
+        # a represent of the component.
+        #
+        # y in condeps[x] => x depends on y
+        # y in conrdeps[x] => x is an rdeps of y
+
+        for component in components:
+            for e in component:
+                for d in candidates[e].deps:
+                    if not d in candidates: continue
+                    if d in component: continue
+                    dc = commap[d]
+                    if not dc[0] in comdeps[component[0]]:
+                        comdeps[component[0]][dc[0]] = 1;
+                        comrdeps[dc[0]][component[0]] = 1;
+            # If this component does not depend on anything (outside
+            # itself) then it is a good starting point :)
+            if len(comdeps[component[0]]) < 1:
+                roots.append(component)
+
+        # We look at each of the "bottom" components and add all
+        # reverse-dependencies of said component
+        for component in roots:
+            hint = []
+            seen = {}
+            stack.append(component)
+            seen[component[0]] = 1
+            while len(stack):
+                top = stack.pop()
+                for rdep in comrdeps[top[0]]:
+                    if rdep in seen: continue
+                    stack.append(commap[rdep])
+                    seen[rdep] = 1
+                hint.extend(top)
+            hints.append(hint)
+
+        for hint in hints:
+            # Hints only works if there 2 or more packages
+            if len(hint) < 2:
+                continue
+            self.do_hint("easy", "autohinter", [(x, candidates[x].ver[1]) for x in hint])
 
     def old_libraries(self):
         """Detect old libraries left in testing for smooth transitions
-- 
1.7.6.3

Reply to:

Follow-Ups:
- Bug#645558: Improve britney2 auto-hinter
  - From: Niels Thykier <niels@thykier.net>

Prev by Date: Re: Please drop partman-nbd/kfreebsd-* from testing
Next by Date: Re: quassel translations (was: Re: Bug#641062: pu: package quassel/0.6.3-2)
Previous by thread: Bug#645498: marked as done (nmu: libalien-wxwidgets-perl_0.53+dfsg-1)
Next by thread: Bug#645558: Improve britney2 auto-hinter
Index(es):
- Date
- Thread