[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: Report on personal contribution to deity



Hi,
>>"Raul" == Raul Miller <rdm@test.legislate.com> writes:

Raul> Manoj Srivastava <srivasta@datasync.com> wrote:
>> Any comments appreciated.

Raul> Forgive me, but I just skimmed over your post.  I think I'd want
Raul> to isolate the topological sort from the grungy details of what
Raul> the topology represents.

	Hmmm, Ok.

Raul> A topological sort may be obtained by transitive closure on
Raul> dependencies. Here's a fairly trivial algorithm (at the expense
Raul> of O(n^2) memory, and O(n^3.5) time:

	Whooeee. The algorithms I suggested use O(V + E) memory and
 O(V +E) time as well. (V==packages, E==relationships). Linear vs
 N^3.5 --- and this algorithm is not much simpler to implement. Also,
 N is likely to get large (we are already at about 1200), and
 scalability is an issue.

Raul> create an NxN bit matrix, if bit[packagea][packageb] is set,
Raul> then package b must be installed for package a to be installed.

	Unfortunately, that is not a scalable design, espescially for
 such a sparse graph as we have. Therefore, the representation choosen
 is an adjacency list rather than an adjacency matrix (for 2000
 package the matrix has 4 million elements, mostly empty)

Raul> You can get better memory use for the typical case by
Raul> representing dependencies on a linked list.
	
	right.

Raul> Here, you get one
Raul> list per package (corresponds to first dimension in bit matrix
Raul> -- presence of a package in list corresponds to 1 in the second
Raul> dimension). bit[x][y] becomes ismemberof(x, y), and |= becomes
Raul> addifnotpresent(). To assign numerics you count how many are on
Raul> list.

Raul> [Alternatively, you can get an 8x space improvement by using
Raul> packed bits and a slightly more complicated indexing scheme,
Raul> dunno if that's worth it if we're heading towards multi-thousand
Raul> package installs].

	Yes, this is not practicall. One has to determine the point of
 diminishing returns.

Raul> The depth-first algorithm is probably the most sparing on memory
Raul> [just need linear space to catch loops], but unless you adopt
Raul> some representative data structure you'll wind up continually
Raul> analyzing package descriptions to extract dependencies. This
Raul> seems like it could get rather slow... [that's off the cuff,
Raul> I've not thought about this one very seriously].

	You're right about off the cuff bit ;-) The idea is to extract
 the package dependencies into memory (mmapped file, for efficiency),
 and look at the in memory structures from that point. Even my Perl
 based pkg-order does that. Reading from file for all data <shudder>

	manoj

-- 
 How many NASA managers does it take to screw in a lightbulb?  "That's
 a known problem... don't worry about it."
Manoj Srivastava  <srivasta@acm.org> <http://www.datasync.com/%7Esrivasta/>
Key C7261095 fingerprint = CB D9 F4 12 68 07 E4 05  CC 2D 27 12 1D F5 E8 6E


--
TO UNSUBSCRIBE FROM THIS MAILING LIST: e-mail the word "unsubscribe" to
debian-devel-request@lists.debian.org . 
Trouble?  e-mail to templin@bucknell.edu .


Reply to: