Re: toppological sorting and datastructures
Hi,
>>"Jason" == Jason Gunthorpe <jgg@gpu.srv.ualberta.ca> writes:
Jason> On 13 Nov 1997, Manoj Srivastava wrote:
Jason> It's not so bad to build, it just means I need to add another
Jason> structure which is a bit of work. Also consider the size
Jason> impact, their is going to be > 2000 of these new items, there
Jason> are only 3000 dependancies though! With 3000 deps and 1000
Jason> packages that is an average of 3 deps per package, I know some
Jason> have more, ie 10. I'm not convinced any speed concerns can be
Jason> applied here.
I don't understand. Why are there going to be >2000 new items?
Are you talking about linked list headers?
Speed concerns: Applying a list selection function, runs
O(|V|), applying an function for all members of the long list
is O(|V| + |E|). As |E|>|V|, there is some extra overhead (whether it
is significant ot not is another matter; since the constants involved
are totally different). (this is not a very telling argument)
I think we need to nail down which algorithm we shall be using
for ordering to see how important multiple list structures are going
to be. As I said, I'd prefer multiple linked lists, but only thinking
the ordering process shall tell us whether it is critical.
======================================================================
Algorithm a: (for packages being installed, removal is similar)
# First generate the indegree of all package-versions
1 For each package-versions which are candidates for installation; do
2 for each dependency, predependency, and packages we are
replacing, and, optionally, packages that have recommended or
suggested us. set indegree
3 package::indegree++;
4 done
5 done
6 For each package-versions which are candidates for installation; do
7 if indegree == 0; then
8 push on todo list
9 endif
10 done
If there is nothing in the todo list, we had a cycle!! (handle
it)
11 while there is a todo list; do
12 pop package off a to do list
13 if we are in uncleared predependent list; then
14 insert a break in processing marker on ordered list
15 clear uncleared predependent list
16 endif
17 push onto ordered list
18 remove from candidates list
19 for each target (reverse dependencies!) of depends and
predepends, and packages we suggest or recommend
20 reduce indegree of target
21 if indegree of target == 0;
22 push target on todo list
23 endif
24 done
25 done
26 If not empty candidates list; there is a cycle!
======================================================================
Algorithm B: Depth first search.
1 For each package-versions which are candidates for installation; do
2 Color each node white
3 done
4 For each package-versions which are candidates for installation; do
5 if color of node is white; then
6 Visit (Vertex)
7 endif
8 done
Visit:
9 color node Gray
10 for each dependency, predependency, and packages we are
replacing, and, optionally, packages that have recommended or
suggested us.
11 if dependency is predepends,
12 add node to list of predepends
13 endif
14 if color of dependency is white (else this was a cycle!)
15 visit (dependent)
16 endif
17 done
18 if we are in the list of predepends
19 add a break event on the ordered list
20 endif
21 add node to order list
======================================================================
What I like about algorithm B is that it is simpler to code
(but not by much), what I like about the first is that one has some
determination in how to break the cycle (search for the node with
minimum indegrtee, and snip a link, preferably a suggests link or a
recommends link). In the second algorithm one just follows predepends
first, then depends, and so on, and hopes for the best about cycle
removal.
Also, the second method does not call for reverse dependencies
as algorithm one does.
Jason> I don't know what you want to do with provides so IsSatisfied
Jason> dep might not be good enough.
Provides should satisfy any non versioned dependency (I think
provides should only provide virtual packages, but I'm not
sure this is ever required anywhere)
Jason> Notice you don't use a 'List of Packages Installed' that
Jason> information is already present in the cache. All of the lists
Jason> you have requested are already present in the cache ;>
I know. But a list of packages installed is an useful
abstraction so I can concentrate on the design without getting lost
in the implementation and getting coerced by implementation details
;-)
Any comments on the above algorithms shall be greatly
appreciated.
manoj
--
Scientific innovation sometimes sounds like poetry, and I would claim
that it is, at least in the earliest stages. The ideal scientist can
be said to think like a poet, work like a clerk, and write like a
journalist. Edward O. Wilson, "Biophilia"
Manoj Srivastava <srivasta@acm.org> <http://www.datasync.com/%7Esrivasta/>
Key C7261095 fingerprint = CB D9 F4 12 68 07 E4 05 CC 2D 27 12 1D F5 E8 6E
Reply to: