Algorithmica (1993) 9:357-381

© 1993 Springer-Verlag New York Inc.
Optimal Parallel Algorithms for Multiple Updates of Minimum Spanning Trees

Shaunak Pawagi¹ and Owen Kaser¹

Abstract. Parallel updates of minimum spanning trees (MSTs) have been studied in the past. These updates allowed a single change in the underlying graph, such as a change in the cost of an edge or an insertion of a new vertex. Multiple update problems for MSTs are concerned with handling more than one such change. In the sequential case multiple update problems may be solved using repeated applications of an efficient algorithm for a single update. However, for efficiency reasons, parallel algorithms for multiple update problems must consider all changes to the underlying graph simultaneously. In this paper we describe parallel algorithms for updating an MST when k new vertices are inserted or deleted in the underlying graph, when the costs of k edges are changed, or when k edge insertions and deletions are performed. For multiple vertex insertion update, our algorithm achieves time and processor bounds of O(log n · log k) and nk/(log n · log k), respectively, on a CREW parallel random access machine. These bounds are optimal for dense graphs. A novel feature of this algorithm is a transformation of the previous MST and k new vertices to a bipartite graph which enables us to obtain the above-mentioned bounds.
Key Words. CREW PRAM, Parallel algorithms, Optimality, Updates, Minimum spanning tree.
1. Introduction. Incremental graph algorithms deal with recomputing a solution to a computational problem on a graph after an incremental change is made to the graph, such as addition and deletion of vertices and edges, as well as changes in the costs or capacities (if any) associated with the edges of the graph. Such recomputations are also referred to as "updating" graph properties. The problem of updating a minimum spanning tree (MST) involves reconstructing the new MST from the current MST when a vertex, along with all its incident edges, is inserted or deleted from the underlying graph, or when an edge is inserted or deleted, or the cost of an edge (a tree edge or a nontree edge) has changed. These subproblems are referred to as the vertex update and the edge update problem, respectively. Sequential algorithms for updating MSTs have received considerable attention in the past [1], [5], [19]. In particular, Frederickson [5] describes an O(√m) sequential algorithm for the edge update problem, where m is the number of edges in the graph. Spira and Pan [19] and Chin and Houck [1] present O(n) sequential algorithms for updating the MST of an n-vertex graph when a new vertex is inserted into the graph. These algorithms are efficient when compared with the known start-over algorithms for MST construction: O(m log n) time for sparse graphs and O(n²) time for dense graphs. (A start-over algorithm does not assume existence of any previous MST.) Also, an O(n)-time algorithm for single vertex

¹ Department of Computer Science, State University of New York, Stony Brook, NY 11794, USA. Received August 2, 1990; revised February 15, 1991. Communicated by Kurt Mehlhorn.
insertion update is optimal, because any such algorithm must examine at least O(n) edges (n − 1 edges of the MST and n edges incident on the new vertex) in the worst case. Parallel algorithms for updating MSTs have also been studied [7], [13], [21], [23]. Pawagi and Ramakrishnan [14] were first to investigate this problem on a parallel random access machine that allows concurrent read but prohibits concurrent write (CREW PRAM). Their algorithms solve the edge and vertex update problems within O(log n) time using n² processors.² Tsin [21] has extended this work to vertex deletion updates. The processor bound for the single vertex insertion update has been improved by Varman and Doshi [23] to n processors. The processor-time product for their algorithm is O(n log n), which is a log n factor away from the optimal product of O(n). A parallel algorithm is optimal if its processor-time product achieves the lower bound for the same problem on a uniprocessor machine. Jung and Mehlhorn [7] modeled the single vertex insertion problem as a problem of evaluating an expression tree. Although they use the CRCW model, their approach yields an optimal algorithm on a CREW PRAM (O(log n) time and n/log n processors), since optimal algorithms for evaluating an expression tree on all models of PRAM are now known [3], [6]. The parallel and sequential incremental algorithms mentioned above allow a single change in the underlying graph but not multiple changes. A multiple update problem for MSTs is concerned with reconstructing the new MST from the previous one after more than one change is made to the underlying graph. A multiple change constitutes insertion/deletion of several vertices or changes in the costs of several edges. A combination of edge and vertex updates is also possible. When several new vertices are inserted, we define two problems.
In the first, referred to as multiple vertex insertion update, the new vertices are independent (nonadjacent); in the second problem, called graph insertion update, the new vertices may be adjacent to each other. The multiple update problem is particularly important in the context of parallel computing for the following reasons: even if several simultaneous changes are made to the graph, in some sense, a sequential algorithm has the option of considering one change at a time. However, for efficiency reasons, a parallel incremental algorithm should handle all changes simultaneously. Also, in the sequential case it sometimes does not matter whether all changes are considered simultaneously or one at a time; the total time required is the same. For instance, a multiple update problem may be solved using repeated applications of an efficient algorithm for the single update. Consider an instance of the multiple vertex insertion update, where k new vertices are added to the graph. Ideally, we should solve this in O(nk) time because there will be nk new edges to be examined in the worst case. We can achieve an O(nk) time bound in k steps, where we insert one vertex in each step, and each step requires O(n) time. However, after each vertex insertion step the size of the
² Throughout this paper we use log n to denote ⌈log₂ n⌉. Also, a processor bound is stated without the constant, since it can be absorbed in the constant for the time bound.
tree increases by 1, thereby requiring O(n), O(n + 1), O(n + 2), …, O(n + k − 1) time for the k steps, respectively. The total time required would then be O(nk + k²). However, if we can reduce the size of the MST to n after each insertion, the total time required would be O(nk). (In Section 2.3 we describe how to reduce the size of the tree to n.) We can achieve O(nk) time complexity even if we consider all vertex insertions simultaneously by applying Spira and Pan's [19] algorithm to multiple vertex insertions. For single vertex insertion, their algorithm examines n edges at each iteration. During an iteration each edge is examined a constant number of times. At the end of an iteration the size of the tree reduces to at most n/2. Therefore the time complexity of their algorithm is governed by the recurrence T(n) = T(n/2) + cn, where c is a constant. A solution to this recurrence relation is T(n) = O(n). Now when this technique is applied to the graph consisting of the previous MST and k new vertices (with incident edges) we require O(nk) time. The details are similar to the single vertex insertion algorithm and are omitted. The recurrence involved is T(n) = T(n/2) + cnk. A solution to this recurrence is T(n) = O(nk). However, when we attempt this problem on a PRAM, inserting one vertex at a time turns out to be inefficient. On a CREW PRAM, the k-vertex insertion update problem can be solved within O(k log n) time by using a single vertex insertion update algorithm k times. (We assume that the size of the tree is reduced to n after each insertion.) If k is not polylogarithmic in n, the time bound for this problem is not polylogarithmic and this approach does not give us a fast parallel algorithm. Another approach would be application of the start-over algorithm of Chin et al. [2] or Savage and Ja'Ja' [17] to the graph consisting of the old MST and k new vertices (assuming that graphs are dense). However, when k is very small, say 2, we need O(log² n) time and 2n processors.
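The cost accounting above can be checked with a short calculation (an illustrative sketch; the function names and the unit-cost model are ours):

```python
# Work for k sequential single-vertex insertions when the tree is allowed
# to grow: the i-th insertion costs on the order of (n + i) edge examinations.
def work_without_reduction(n, k):
    return sum(n + i for i in range(k))   # = nk + k(k-1)/2, i.e., O(nk + k^2)

# With the size reduction of Section 2.3 the tree is contracted back to
# n vertices after every insertion, so each step costs on the order of n.
def work_with_reduction(n, k):
    return n * k                          # O(nk), matching the lower bound

n, k = 1000, 100
print(work_without_reduction(n, k))       # 104950
print(work_with_reduction(n, k))          # 100000
```

The k²/2 excess term is negligible when k is small but dominates the comparison once k approaches n.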
In order to strike a balance between smaller and larger values of k, we must have time and processor bounds which vary smoothly with respect to k. As a first objective for k-vertex insertion update, we may attempt to achieve an O(log n · log k) time bound using nk processors.³ Moreover, the algorithm would be optimal if the processor requirement drops to nk/(log n · log k). Clearly, in order to obtain such bounds for the multiple update problem we need an altogether different approach. In this paper an optimal parallel algorithm for k-vertex insertion updates of MSTs is first presented. Next we discuss a parallel algorithm for k-edge insertion update, which enables us to solve the graph insertion update problem. We also consider parallel algorithms for k-vertex deletion, k-edge deletion, and various edge cost adjustments. Throughout, the CREW PRAM is our model of computation. Our algorithms for k-vertex insertion update and k-edge cost-decrease/insertion require O(log n · log k) time and use nk/(log n · log k) processors. Our algorithm for graph insertion is optimal given certain constraints on the number of edges involved. One of the major contributions of this paper is the transformation of the original MST and k new vertices to a bipartite graph, which leads to the above-mentioned
³ These bounds were achieved in an earlier version of this paper [12].
time and processor bounds. Another contribution is the formulation of k-update problems for MSTs. This paper is organized as follows: In Section 2 we define some graph-theoretic terms and discuss MST algorithms. The algorithm for k-vertex insertion update is presented in Section 3. Other MST update problems are described in Section 4. Concluding remarks appear in Section 5. The Appendix discusses several related problems that are required to establish the main result of this paper.
2. Background. We now present some graph-theoretic preliminaries, and discuss some results required in the following sections.
2.1. Definitions. Let G = (V, E) denote a graph where V is a finite set of vertices and E is a set of pairs of vertices called edges. Throughout this paper we assume that V = {1, 2, …, n}, |V| = n, and |E| = m. Let C: E → R denote a function that associates a cost with the edges of G. If two edges have identical costs, then ties in cost comparisons are resolved using lexicographic ordering. This implies that the MST of a graph is unique. We denote the undirected edge from a to b by (a, b). A simple path in G joining two vertices i0 and ik is defined as a sequence of vertices (i0, i1, i2, …, ik) such that all of them are distinct and for each p, 0 ≤ p < k, (ip, ip+1) is an edge of G. If i0 = ik, then the path is called a simple cycle. A path is nonsimple if some vertices of i0, …, ik are not distinct. A nonsimple cycle is defined similarly. If not specified, all paths and cycles are simple. We assume the standard definitions for connectivity, component, tree, forest, adjacency matrix, adjacency lists, and a minimum cost spanning tree of a graph. We assume that graphs are represented by their adjacency matrices. Trees are stored using an adjacency list representation. A rooted tree T has a distinguished vertex called the root. If the edges of a tree T are oriented from a vertex to its parent, then T is an inverted tree. Some computations on trees require that trees be in inverted form. Such a representation can be readily obtained from the adjacency list.

2.2. Discussion of Start-Over Algorithms. Most well-known algorithms for MST construction, e.g., [9] and [15], add, at every iteration, an appropriate edge to the partially computed minimum cost forest. The newly added edge never forms a cycle with the previously identified edges. Therefore, we characterize these algorithms as cycle-avoiding algorithms. We now describe another approach that starts out with the given graph and successively removes edges until the resulting graph is an MST.

LEMMA 2.1.
Let G be an undirected connected graph. If an algorithm deletes from G the maximum cost edge on every cycle of G, the resulting graph is an MST for G, irrespective of the order in which these edges are deleted (cycles are broken).

PROOF. Let Ga be the graph obtained after all cycles of G are broken (in any order) by deleting the maximum cost edge on them.
First, Ga is acyclic, as all the cycles are broken by this algorithm. For the sake of contradiction assume that Ga has several components. Consider the set of edges Ed defined below:

Ed = {e | e is a deleted edge of G that connects two components of Ga}.

Let em be the minimum cost edge among the edges in the set Ed. Now consider any cycle in G that contains em. Such a cycle must pass through at least two components of Ga. Therefore it must have at least two edges that are in Ed. It is easy to see that em cannot be deleted, as it is not a maximum cost edge on this cycle, which is a contradiction. Therefore there cannot be more than one component, implying that Ga is connected. Ga is a tree as it is acyclic and connected. The proof that Ga is of minimum cost follows. If Ga is not of minimum cost, then there exists another spanning tree T whose cost is minimum. Now Ga and T differ in at least one edge, say e. Let e be in T but not in Ga. Remove e from T. This splits T into two subtrees T1 and T2. Let G1 and G2 be the two subgraphs of G that are induced by the vertices of T1 and T2, respectively. By construction, e is the maximum cost edge on some cycle L of G. Now L must have another edge e' (other than e) that connects G1 and G2, but the cost of e' is less than that of e. Therefore, addition of e' to T and removal of e forms a new spanning tree whose cost is less than that of T, resulting in a contradiction. Therefore Ga must be of minimum cost. □

COROLLARY. An edge of G is in the MST for G iff it is not the maximum cost edge on some cycle of G.

REMARK 1. The above proof does not assume that the cycles are to be broken one at a time or in any specific order. We may consider several cycles simultaneously in each step, or all cycles may even be broken simultaneously in one step. The correctness of any cycle-breaking algorithm does not depend upon how many cycles are broken at each step.

REMARK 2.
As discussed earlier, all known sequential and parallel algorithms for MST construction are cycle-avoiding and work the opposite way. This approach happens to be efficient for the initial computation of an MST. Cycle-breaking start-over algorithms are inefficient because of the time required to detect a cycle, and because of the large number of cycles that need to be broken. To obtain the new MST after new edges/vertices are added to the old MST, we need only consider the graph that consists of the previous MST and the new vertices or edges. (See [1] for a proof.) Therefore, one approach to updating would be to obtain the new MST by applying a start-over algorithm to this graph. Another approach would be to break the cycles (induced by insertion of new edges/vertices) in the old MST. Usually, in such situations, the number of cycles in the graph under consideration tends to be small. Hence the cycle-breaking approach proves more efficient than the cycle-avoiding start-over algorithms. Incremental algorithms for MST edge and vertex insertion are cycle-breaking algorithms. Breaking one cycle at a time is obviously inefficient and should not be considered for any parallel algorithm. On the other hand, a parallel algorithm for updating an MST may not be able to break all cycles simultaneously, if the number of cycles is prohibitively large (e.g., exponential). An efficient parallel algorithm should consider several cycles at each step, and not all cycles in one step.
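Lemma 2.1 can be exercised on a small instance. The sketch below (ours, not from the paper) builds an MST both ways, by Kruskal's cycle-avoiding rule and by repeatedly deleting the maximum cost edge on some cycle, and checks that they agree; distinct edge costs stand in for the lexicographic tie-breaking of Section 2.1.

```python
import random
from itertools import combinations

def kruskal_mst(n, edges):
    """Cycle-avoiding construction: scan edges by increasing cost and
    skip any edge that would close a cycle (union-find)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    mst = set()
    for (u, v), _ in sorted(edges, key=lambda e: e[1]):
        if find(u) != find(v):
            parent[find(u)] = find(v)
            mst.add(frozenset((u, v)))
    return mst

def cycle_breaking_mst(n, edges):
    """Lemma 2.1: repeatedly delete the maximum cost edge on some cycle,
    in any order, until no cycle remains."""
    cost = {frozenset(e): c for e, c in edges}
    adj = {u: set() for u in range(n)}
    for (u, v), _ in edges:
        adj[u].add(v)
        adj[v].add(u)

    def find_cycle():
        color, parent = {}, {}
        def dfs(u, p):
            color[u] = 1
            for v in adj[u]:
                if v == p:
                    continue
                if color.get(v) == 1:          # back edge: cycle found
                    cyc = [u]
                    while cyc[-1] != v:
                        cyc.append(parent[cyc[-1]])
                    return cyc
                if v not in color:
                    parent[v] = u
                    found = dfs(v, u)
                    if found:
                        return found
            color[u] = 2
            return None
        for s in adj:
            if s not in color:
                found = dfs(s, None)
                if found:
                    return found
        return None

    while (cyc := find_cycle()) is not None:
        around = list(zip(cyc, cyc[1:] + cyc[:1]))
        u, v = max(around, key=lambda e: cost[frozenset(e)])
        adj[u].discard(v)
        adj[v].discard(u)
    return {frozenset((u, v)) for u in adj for v in adj[u]}

# A random complete graph on 8 vertices with distinct costs (so the MST
# is unique and no tie-breaking is needed).
random.seed(7)
pairs = list(combinations(range(8), 2))
edges = list(zip(pairs, random.sample(range(1000), len(pairs))))

mst_avoid = kruskal_mst(8, edges)
mst_break = cycle_breaking_mst(8, edges)
print(mst_avoid == mst_break, len(mst_avoid))
```

Note how the cycle-breaking routine spends most of its effort detecting cycles one at a time, which is exactly the inefficiency discussed above.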
2.3. A Sequential Algorithm for k-Vertex Insertion. In this section we prove that, in the sequential case, it does not matter whether k independent vertices are inserted one at a time or all at once; the time complexity remains the same. Recall that these vertices are not adjacent. First, we approach the problem by inserting one vertex at each step. As mentioned earlier, the desired time complexity, i.e., O(nk), is achievable by reducing the size of the tree to n after every insertion step. The details are given below. Note that such a reduction is possible because the inserted vertices are not adjacent. Let T be the old MST, and let T1 be the new MST after z1, the first new vertex, has been inserted in T. Since z1 is not adjacent to other new vertices we can eliminate z1 or one of its neighbors. Let v0, v1, …, vj be the vertices adjacent to z1. Let (z1, v0) be the minimum cost edge incident on z1. Clearly, the edge (z1, v0) is in the final MST. Hence, we need not consider it during the next vertex insertion steps. We remove this edge by merging z1 and v0. The vertices adjacent to v0 are now adjacent to z1. It is obvious that the size of the resulting tree is n. The final tree consists of the resulting MST after k insertions, with the addition of the k − 1 edges that were removed in the k − 1 size reduction steps.

2.4. Start-Over Algorithms on PRAMs. In this section the start-over algorithms for MSTs on PRAMs are discussed. Savage and Ja'Ja' [17] described the first parallel algorithm for MST construction. Their algorithm constructs an MST within O(log² n) time using O(n²) processors. These bounds were improved by Chin et al. [2]. Their algorithm determines an MST in O(log² n) time using O(n²/log² n) processors. Chin's algorithm is optimal for dense graphs, in the sense that its processor-time product achieves the sequential lower bound (i.e., n²). The basic technique used by both algorithms is essentially the same.
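The one-vertex-at-a-time scheme of Section 2.3, together with the fact from Section 2.2 that only the previous MST and the new edges matter, can be checked on a small instance. In the sketch below (ours; Kruskal's algorithm stands in for the O(n) update routines of [1] and [19]) two independent vertices are inserted one at a time, each step recomputing only over the current tree plus the new vertex's edges, and the result is compared with a start-over computation on the whole graph.

```python
import random
from itertools import combinations

def kruskal(vertices, edges):
    """Reference MST routine; edges are ((u, v), cost) pairs."""
    parent = {v: v for v in vertices}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    mst = set()
    for (u, v), c in sorted(edges, key=lambda e: e[1]):
        if find(u) != find(v):
            parent[find(u)] = find(v)
            mst.add(((u, v), c))
    return mst

random.seed(5)
old = list(range(6))                      # vertices of the old graph
new = [6, 7]                              # two independent new vertices
pairs = list(combinations(old, 2)) + [(u, z) for z in new for u in old]
cost = dict(zip(pairs, random.sample(range(10000), len(pairs))))

old_edges = [(p, cost[p]) for p in pairs if p[1] not in new]
tree = kruskal(old, old_edges)            # the old MST T

# Insert the new vertices one at a time; after each insertion only the
# current tree plus the new vertex's edges need to be considered.
inserted = []
for z in new:
    inserted.append(z)
    z_edges = [((u, z), cost[(u, z)]) for u in old]
    tree = kruskal(old + inserted, list(tree) + z_edges)

# A start-over computation on the whole graph agrees with the updates.
full = kruskal(old + new, [(p, cost[p]) for p in pairs])
print(tree == full, len(tree))
```

Each incremental step here works on at most n − 1 tree edges plus n new edges, which is the source of the O(n)-per-insertion bound cited above.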
In the Appendix we discuss an application of this technique to constructing an MST of a bipartite graph. The resulting bounds are O(log n · log k) and nk/(log n · log k) for time and processors, respectively, where n and k are the numbers of vertices in the two partitions. This result is required in Section 3. Throughout this paper we assume that graphs are dense. Parallel algorithms for MST construction, described in [10] and [18], use m processors instead of n². Thus, for sparse graphs, they use fewer processors than Chin's algorithm. Since we assume dense graphs, we have compared our incremental algorithms with that of Chin.

2.5. Single Vertex MST Insertion. In [7] Jung and Mehlhorn present the idea of modeling the single vertex insertion problem as a problem of evaluating an expression whose operators are small expressions involving maximum and minimum over edges (by cost). They show that there is such an expression for every vertex of the MST and, if the value of every expression is known, the updated MST can be determined. Their model of computation is the CRCW PRAM. In order to apply their technique on the CREW model, we extend Gazit's [6] method of expression evaluation. The Appendix sketches the approach of Jung and Mehlhorn.

2.6. Related Tree Problems. In the next section we need algorithms for several simple tree or forest problems. Specifically, we need to shift the root of a tree, identify the connected components in a forest, and determine the maximum cost edge on the path from a vertex in a tree to its root. It is well known that the first problem can be solved in O(n/p) time using p ≤ n/log n processors [20], given an optimal list-ranking algorithm as in [4]. The rest of the problems are discussed in the Appendix.
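For concreteness, the third of these tree problems can be sketched sequentially (the code and its input format are ours; the bounds quoted in the paper refer to the parallel solution of Lemma A.1):

```python
def max_edge_to_root(root, children, cost):
    """For every vertex, the maximum cost edge on its path to the root.
    children: vertex -> list of children; cost[(u, v)]: cost of the tree
    edge between u and its child v.  A top-down traversal suffices: the
    best edge for v is the heavier of v's parent edge and the best edge
    known for its parent."""
    best = {root: None}                   # no edge above the root
    stack = [root]
    while stack:
        u = stack.pop()
        for v in children.get(u, []):
            up = best[u]
            e = (u, v)
            best[v] = e if up is None or cost[e] > cost[up] else up
            stack.append(v)
    return best

# A path 0 - 1 - 2 - 3 with edge costs 5, 9, 2: the heaviest edge from
# vertex 3 up to the root 0 is (1, 2), of cost 9.
children = {0: [1], 1: [2], 2: [3]}
cost = {(0, 1): 5, (1, 2): 9, (2, 3): 2}
print(max_edge_to_root(0, children, cost))
# {0: None, 1: (0, 1), 2: (1, 2), 3: (1, 2)}
```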
3. Multiple Vertex Insertion. Let T be an MST for G and let ET be the set of edges of T. Let Vk = {z1, z2, …, zk} be the set of new vertices. Each new vertex may have at most n edges incident on it. Let G' be the graph after the vertices of Vk are added to G. Edges of G' are the edges of T and the new edges joining vertices of T to vertices of Vk. Throughout, G' is the new graph and T' is the new MST that we wish to compute. In order to obtain T', we break cycles of G' by deleting the maximum cost edge on each of them. To achieve the desired time and processor complexities, we break these cycles in two separate stages. First, k copies of T are made. A different new vertex is inserted in each copy using our algorithm for single vertex insertion update. This results in k different trees. These k trees are merged to construct a graph Gz such that an edge of the old MST T is in Gz if it is in all k trees. Every new edge retained in any tree is in Gz. The graph Gz may be viewed as a set of components (subtrees) of T that are held together by new edges incident on them. Using arguments similar to those in Lemma 2.1, we can prove that Gz is connected. Since it may still have cycles (if it is acyclic we are done!), these are broken by employing a novel technique. First Gz is transformed into a bipartite graph Gb. An MST Tb for Gb is then constructed by applying a start-over algorithm. Edges of Gb and Tb aid in identifying the edges of Gz that must be removed from Gz to obtain T'. In the next two subsections we describe the construction of Gz and its transformation to the bipartite graph Gb. The correctness of our algorithm is established by a series of lemmas proved in Section 3.3.

3.1. Construction of Gz.
The following steps are executed to construct Gz.
1. Make k copies of T, namely T1, T2, …, Tk. Since we can make k copies in constant time using nk processors, this can be done in O(log n · log k) time with nk/(log n · log k) processors.
2. For all i ≤ k, insert zi in Ti. By Theorem A.1 (A for Appendix), choosing p = n/(log n · log k), we require O(log n · log k) time for each Ti. Thus for k copies we need nk/(log n · log k) processors.
3. Construct a new graph Gz = (Vz, Ez) as follows. Vz = V ∪ Vk. For all i ≤ k, edges incident on zi that are present in Ti are in Ez. Edges of the old MST T that are retained in all Ti are also in Ez. No other edges are in Ez. Edges of T are selected by testing each edge with an AND operation over k flags, one for each copy of the edge. This requires O(log n · log k) time with nk/(log n · log k) processors
[2].

LEMMA 3.1. Construction of Gz requires O(log n · log k) time and nk/(log n · log k) processors.

PROOF. The proof is immediate from the discussion above. □

3.2. Transformation.
Before we describe the transformation we discuss the bipartite nature of Gz. The graph Gz has two partitions: one consists of components of the old MST T; the other partition is Vk. We define v(p) to be the number of new vertices on path p. Since Gz is obtained by merging k trees, any cycle c of Gz has v(c) ≥ 2. (If not, one of the trees contained a cycle.) These cycles have a bipartite flavor in the following sense. Any cycle of Gz can be described as an alternating sequence of new vertices and components (subtrees) of T. For instance, in Figure 3.1, z1, C2, z2, C1, and back to z1 is one such cycle. When we trace one such cycle the following pattern is observed: it starts at a new vertex, then it enters a component of T, traverses some edges of that component, leaves that component, visits a new vertex, and so on. A cycle enters a component at an entry vertex and leaves it via an exit vertex. Since these vertices can be used either for entry or exit, we call them e-vertices (e for entry/exit). An e-vertex is a vertex in T adjacent to any new vertex. For an e-vertex v, consider the (simple) path in Gz from a new vertex zi to v such that no other new vertex is on it. The first edge of this path is incident on zi. The rest of the edges belong to the same component of T that v does. Such a path from zi to v is referred to as an i-path (i for internal) of Gz. We transform Gz into a bipartite graph Gb, with two partitions Ve and Vk, where Ve is the set of e-vertices of T. The following procedure is executed to determine the edges of Gb. These steps are for every new vertex zi.
Fig. 3.1. The graph Gz with two new vertices and three components.
1. For every e-vertex v, determine the maximum cost edge on the i-path from zi to v. If such a path does not exist, then the edge (zi, v) is not in Gb; else
2. The edge (zi, v) is in Gb, and is labeled with the maximum cost edge computed in the first step.

Note that there is an obvious bijection between the edges of Gb and the i-paths of Gz. The computation above is done as follows. Recall that the components of T are held together by edges incident on the k new vertices. Clearly, no component of T has more than k edges incident on it. By Lemma A.2, the components of T can be identified in O(log n) time using n/log n processors.

(a) Identify all e-vertices. Since an e-vertex is adjacent to a new vertex, this can be done by performing an OR operation on selected rows of the adjacency matrix for Gz. Techniques given in [2] suffice to process these ORs in the desired bounds.
(b) For every zi, make a copy of each component of T and root it at the e-vertex (if it exists) adjacent to zi. This requires O(log n · log k) time with n/(log n · log k) processors (see Section 2.6). Since we have k new vertices, we make k copies of the components of T and reroot them. This requires O(log n · log k) time using nk/(log n · log k) processors.
(c) For every new vertex zi, form one tree from its copies of components by making zi the new root. This tree has at most n + 1 vertices. Determine the maximum cost edge on the path from every vertex to the root. By Lemma A.1, we can do this in O(log n · log k) time with n/(log n · log k) processors. The total time and processor requirements for step (c) are O(log n · log k) and nk/(log n · log k), respectively.

In this fashion we compute the maximum cost edge on the path from every new vertex to every e-vertex, i.e., on every i-path. We therefore have the following lemma.

LEMMA 3.2. We can construct the graph Gb from Gz in O(log n · log k) time using nk/(log n · log k) processors.

PROOF. The proof follows from the preceding discussion. □
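The labeling can be illustrated on a toy instance (entirely ours: one component {a, b, c} of T with two new vertices; a brute-force path search stands in for the parallel steps (a)-(c)):

```python
from itertools import product

# A toy Gz: one component {a, b, c} of the old MST with tree edges a-b
# and b-c, plus new vertices z1 (attached at a) and z2 (attached at c).
# Hence a and c are the e-vertices.  Names and costs are illustrative.
tree_edges = {('a', 'b'): 4, ('b', 'c'): 7}
new_edges = {('z1', 'a'): 3, ('z2', 'c'): 5}

adj, cost = {}, {}
for (u, v), c in {**tree_edges, **new_edges}.items():
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)
    cost[frozenset((u, v))] = c

new_vertices = {'z1', 'z2'}

def i_path_label(z, v):
    """Maximum cost edge on the path from new vertex z to e-vertex v that
    avoids every other new vertex (the label of edge (z, v) in Gb), or
    None if no such i-path exists."""
    stack = [(z, [z])]
    while stack:
        u, path = stack.pop()
        if u == v:
            steps = [frozenset(p) for p in zip(path, path[1:])]
            return max(steps, key=lambda e: cost[e])
        for w in adj[u]:
            if w in path or (w in new_vertices and w != z):
                continue
            stack.append((w, path + [w]))
    return None

for z, v in product(sorted(new_vertices), sorted({'a', 'c'})):
    lab = i_path_label(z, v)
    print(z, v, sorted(lab) if lab else None)
```

In this instance the i-paths from z1 to c and from z2 to a both carry the label (b, c), showing how several edges of Gb can be labeled by the same edge of Gz.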
We observe that several edges in Gb may be labeled by the same edge in Gz. Therefore we need to define the cost ordering ≥b on the edges of Gb. If two edges of Gb have different labels, then the original cost ordering (in Gz) of the two labels is preserved. If two edges have the same label, ties are broken by lexicographic ordering. We refer to all edges in Gb labeled with a given edge e as copies of e. For notational convenience, we write (x, y)m to refer to the edge in Gb that we consider the mth copy of edge (x, y). The edges of Gz are partitioned into three classes. The first class contains edges that have no copies in Gb. An edge is in the second class if it is the maximum cost edge on some cycle of Gz. An edge is in the third class if it has copies in Gb and it is not the maximum cost edge on any cycle of Gz. Observe that the edges of T' (the final MST) are the edges of the first class and the edges of the third class, because
none of them is a maximum cost edge on any cycle of Gz. Therefore, T' is obtained by deleting class two edges from Gz. In order to determine the edges of class two we compute an MST Tb for Gb. An edge is of class two iff its least cost copy is not in Tb. (Actually, no copy of a class two edge will be in Tb.)
3.3. Correctness. To prove the correctness of our algorithm the following statements about the three classes need to be proved:

(a) Edges of the first class are in the final MST (Lemma 3.3).
(b) Edges of the second class have no copies present in Tb (Lemma 3.5).
(c) The least cost copy of every edge of the third class is retained in Tb (Lemma 3.6).

LEMMA 3.3. If an edge of Gz has no copies in Gb, then that edge is in the final MST.
PROOF. Let (u, v) be an edge in Gz that has no copies in Gb. This implies that (u, v) is not the maximum cost edge on any i-path of Gz. Therefore, it is not the maximum cost edge on any cycle (which is formed by such paths) of Gz. By the corollary of Lemma 2.1, (u, v) is in the final MST. □

Any path p in Gb corresponds to a (possibly nonsimple) path xp in Gz. We refer to xp as the expansion of p. xp can be derived by replacing each edge on p by its corresponding i-path. We are now ready to prove a crucial lemma. The proof of correctness for our algorithm depends primarily on this lemma.

LEMMA 3.4. If (x, y) is the maximum cost edge on some simple cycle of Gz, then, for any copy (x, y)r in Gb, there is at least one simple cycle in Gb where (x, y)r has the greatest cost (and is the only copy of (x, y)).

PROOF. Let (x, y) be the maximum cost edge on cycle L of Gz. Without loss of generality, assume that cycle L is z1, C1, z2, …, z1, where the Ci's are components of T. For simplicity, assume that the Ci's are distinct. At the end of the proof we consider the more general case. If there are several simple cycles on which (x, y) has maximum cost, let L be one which minimizes v(L) (i.e., it has as few new vertices as possible). Let P be the part of L that is in C1. Without loss of generality, suppose (x, y) is either on P or is the edge from z1 to C1. Let the first and last e-vertices on P be v1 and v2, respectively. Note that v1 and v2 are respectively adjacent to z1 and z2. Figure 3.2 illustrates these and other vertices that are introduced in the proof. Let Lz be the part of L from z2 to z1 that does not contain P. Since L is simple, (x, y) does not appear on Lz. Therefore any edge on Lz has cost less than that of (x, y). Let Lb be a path in Gb whose expansion is Lz. To obtain Lb from Lz, in every sequence zi, Ci, zi+1, replace Ci by vi, where vi is any e-vertex in Ci on the path between zi and zi+1.
Clearly, Lb is simple and its edges have less cost than any copy of (x, y). Now suppose that (x, y)r is the edge (zu, vw) ∈ Eb. Its expansion in Gz is the i-path from some new vertex zu to some e-vertex vw. This i-path must contain (x, y) as
Fig. 3.2. Case 2 of Lemma 3.4. The component C1 of Gz and a corresponding portion of Gb. In Gz straight lines denote edges and jagged lines denote paths. In Gb the path from z1 to z2, which is relevant to the proof, is shown by thick lines.
the maximum cost edge. Since (x, y) is in C1 or incident on it, vw must be an e-vertex in C1. Now let P' denote this i-path from zu to vw. Note that Gb may have several copies of (x, y). Since our objective is to construct a simple cycle cb in Gb using L, copies of (x, y) are classified with respect to L in two ways. The first group contains copies that are incident on z1 or z2, and the second group contains the other copies. The proof considers these two cases separately, wherein we construct cb such that it contains (zu, vw) as its maximum cost edge and its only copy of (x, y). Recall that L has two parts: one is Lz and the other is a path from z1 to z2. As already explained, from Lz we obtain Lb. This path forms a part of cb. The rest of cb is a simple path from z1 to z2, and by construction (zu, vw) is on it. We show that this path in Gb has no other copy of (x, y) and that (zu, vw) is of maximum cost. We also show that it forms a simple cycle with Lb.

Case 1: zu is z1. (Thus, (x, y) is on the expansion of (z1, vw). The case where zu is z2 is similar, but is excluded by our earlier assumption about (x, y).) The required path in Gb is z1, vw, z2. Clearly, this path contains (z1, vw), which is (x, y)r. Now, consider (vw, z2). Its expansion cannot contain (x, y). To see this,
observe that (x, y) is on the i-path from z 1 to vw, and is on the i-path from z I to
v2 (since (x, y) is on P or incident on z1). Thus the path (within C1) from vw to v2 cannot contain (x, y). Note also that all edges on this path are in P ∪ P'. Thus they all have a smaller cost than (x, y). By our assumption, the edge (x, y) is not incident on z2. Thus, the edge (vw, z2) of Gb has a lower cost than (zu, vw), and is not a copy of (x, y). Thus, when the cycle cb is completed with z1, vw, z2, we will have (x, y)r as the maximum cost edge, and as the only copy of (x, y). Note that cb is simple by the assumption that the Ci's are distinct.

Case 2: zu is neither z1 nor z2. (Thus (x, y) is in C1. Since it must be on P, it cannot be incident on zu.) Let (zu, vs) be the edge of Gz joining zu to C1. Now note that removal of (x, y) from C1 splits it into two connected subcomponents, one connected to z1 and the other to z2. Let C1,1 be the former and let C1,2 be the latter. Without loss of generality, let vs be in C1,1. Therefore vw is in C1,2, since (x, y) is on the path between them. The path (within C1,2) from vw to v2 cannot contain (x, y). Since all the edges of this path are in P ∪ P', they all have a smaller cost than (x, y). Together, this path and (v2, z2) correspond to the edge (vw, z2) of Gb. Therefore, (vw, z2) has a lower cost than (zu, vw), and is not a copy of (x, y). Likewise, the path (within C1,1) from v1 to vs only has edges of P ∪ P', and does not contain (x, y). Therefore, all of these edges have lower cost than (x, y). Together, this path and edge (z1, v1) correspond to the edge (z1, vs) of Gb, which has a lower cost than (zu, vw). Likewise, it is not a copy of (x, y). Finally, note that the edge (vs, zu) of Gz has a lower cost than (x, y), since it is on the i-path from zu to vw which contains (x, y). Edge (vs, zu) in Gz is the (trivial) expansion of (vs, zu) in Gb. This edge of Gb thus has a lower cost than (zu, vw). Obviously, it cannot be a copy of (x, y). Our path in Gb will be z1, vs, zu, vw, z2.
It contains (zu, vw) and has no other edge with a larger cost. Since none of the edges in Lb is a copy of (x, y), or has a cost larger than (x, y)r, when we complete cb as specified, (x, y)r is the only copy of (x, y) and has the largest cost. Now, if zu is not on Lb, cb would be simple because the Ci's are distinct. Next we show that zu cannot be on Lb. Suppose, to the contrary, that it is. Thus Lb is z2, ..., zu, ..., z1 and cycle cb is z1, vs, zu, vw, z2, ..., zu, ..., z1. Now consider the cycle cb' = zu, vw, z2, ..., zu, which is a part of cb. Observe that on the expansion of cb', (x, y) occurs exactly once and it is the maximum cost edge. Note also that the expansion of cb' may not be a simple cycle in Gz. Regardless, if the nonsimple portions of the cycle are removed, we derive a simple cycle L' in Gz that contains (x, y) as its maximum cost edge. However, z1 is not on L', but is on L. Thus v(L') < v(L), a contradiction. Therefore zu cannot be on Lb. Thus we have analyzed both cases, and shown that in each case we can construct a simple cycle in Gb on which (x, y)r has maximum cost.

Now, suppose that the Ci's are not distinct. Recall that (x, y) is in, or incident on, C1. Consider the case where a component Ct occurs more than once on L. First, assume that Ct is not C1. Since L is simple, the vertices on each part of L passing through Ct are disjoint. Thus, by the construction of Lb, vertices from Vz occur at most once on Lb. It is also apparent that vertices from Vk may only appear once on Lb.
If C1 repeats, the cycle cb will still be simple. Suppose that L is z1, C1, z2, ..., zA, C1, z1. Recall that z1 is adjacent to only one e-vertex v1 of C1. Then the edge (v1, z1) occurs twice on L, contradicting our assumption that L is simple. Similarly, L cannot be z1, C1, z2, C1, z3, ..., z1. Thus, L must be z1, C1, z2, ..., zA, C1, zB, ..., z1. (C1 may occur more than twice, but the argument is easily generalized for this case.) Consider the portion of L (actually Lz) from zA to zB. It corresponds to a part of Lb. By the construction used for Lb, we know that some e-vertex representing C1 is on this part of Lb. Also, note that this e-vertex must be on the portion of Lz between zA and zB. When we complete cb from Lb by adding a path from z1 to z2, cb may become nonsimple. We show that if cb is not simple, we can find a simple cycle L' in Gz. L' has (x, y) as its maximum cost edge, and v(L') < v(L), a contradiction. We show this separately for each of the cases considered previously in the above proof. Case 2 of the proof is the more interesting case, so we consider it first. Suppose that cb is nonsimple. Recall that Lb is simple. Now one or both of vs and vw must be on Lb, since these were the vertices (of C1) that have been added to Lb to form cb. Suppose vw is on Lb. Since vw is in C1, Lb is z2, ..., zA, vw, zB, ..., z1. Thus cb is z1, vs, zu, vw, z2, ..., zA, vw, zB, ..., z1. Now consider the simple cycle cb' in Gb, which is vw, zB, ..., z1, vs, zu, vw. It is a part of cb and its expansion in Gz contains (x, y) precisely once, since (x, y) is on the expansion of (zu, vw) and nowhere else. Observe that (x, y) is also of maximum cost on the expansion of cb'. Note that the expansion may be nonsimple. However, if we remove the nonsimple portions, we have a simple cycle L' in Gz that contains (x, y) as its maximum cost edge. Now, note that L' does not have either z2 or zA (which are distinct).
While L' does contain zu, it contains no other new vertex not on L. Thus v(L') < v(L), which is the desired contradiction. Similarly, suppose vs is on Lb. The proof proceeds as for vw. We can find an L' that contradicts the minimality of v(L) by removing z1 and zB, and adding only zu. The details are omitted. For case 1, suppose that cb is nonsimple. This means that vw must appear on Lb. The proof proceeds as before. We find an L' contradicting the minimality of v(L) by removing z2. (Both L' and L will contain zu, since zu is z1.) Again, we omit the details. []
LEMMA 3.5. If an edge (x, y) of Gz is the maximum cost edge on some cycle of Gz, then no copy of (x, y) is in the MST Tb.
PROOF. Suppose a copy (x, y)r is in Tb. Now by the corollary of Lemma 2.1, there cannot be a cycle in Gb on which (x, y)r has the maximum cost. Since (x, y) is the maximum cost edge on some cycle of Gz, by Lemma 3.4 there must be at least one cycle in Gb such that (x, y)r is the maximum cost edge on it. This contradiction implies that no copy of (x, y) is in Tb. []

LEMMA 3.6.
If (u, v) is an edge of class three, its least cost copy (u, v)l is in Tb.
PROOF. Suppose that (u, v)l has been deleted. By the corollary of Lemma 2.1, it must have been the maximum cost edge on some cycle of Gb. Since (u, v)l is of least cost, there could have been no other copy of (u, v) on this cycle. Using arguments similar to those in the proof of Lemma 3.4, we can map such a cycle back to a cycle in Gz, where (u, v) has the maximum cost. (Delete nonsimple portions of the expanded cycle.) This contradicts membership of (u, v) in the third class, implying that the least cost copy of (u, v) is retained in Tb. []

THEOREM 3.1. Our algorithm computes the new MST T' for G' within O(log n · log k) time using nk/(log n · log k) processors.
PROOF. The whole process of computing the new MST T' involves determining cycles of G' and deleting the maximum cost edges from each of them. First, we make k copies of the old MST, and in each copy a different vertex is inserted. Every edge deleted in this step is the maximum cost edge on some cycle in some copy and cannot be in the new MST. Next, the graph Gz is obtained by merging the k trees. At this stage we need to detect cycles of Gz and delete the maximum cost edge on each of them. This is accomplished by transforming Gz to Gb and then computing an MST Tb for Gb. The transformation and the construction of Tb identify a set of edges (class two) that needs to be removed from Gz. Deletion of these edges from Gz yields the required MST T'. An edge is of the second class iff its least cost copy is not in Tb. For every edge e of Gz that is transformed, we examine Tb to see if the least cost copy of e is present. If not, e is deleted from Gz. (Computing the least cost copy of an edge is discussed in the Appendix.) Correctness of the procedure that determines edges of T' is established by Lemmas 2.1 and 3.3-3.6. The time and processor complexities follow from Lemmas 3.1, 3.2, and A.3. []
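The structure of this proof can be illustrated with a short sequential sketch. The sketch below is ours, not the paper's: a start-over Kruskal routine stands in both for the parallel per-copy vertex insertions and for the bipartite-transform computation of Tb, which by the uniqueness of MSTs yields the same final tree; all function names are hypothetical.

```python
def kruskal(n, edges):
    # start-over MST (Kruskal); edges: list of (cost, u, v) with unique costs
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    mst = []
    for cost, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            mst.append((cost, u, v))
    return mst

def multi_vertex_insert(n, k, tree_edges, new_edges):
    # tree_edges: old MST as (cost, u, v); new_edges[i]: list of (cost, u)
    # edges joining new vertex n + i to old vertices.
    # Step 1: insert each new vertex into its own copy of the old MST.
    trees = []
    for i in range(k):
        z = n + i
        candidate = tree_edges + [(c, u, z) for (c, u) in new_edges[i]]
        trees.append(kruskal(n + k, candidate))
    # Step 2: Gz is the union of the k updated trees; its MST is T'.
    gz = set()
    for t in trees:
        gz.update(t)
    return kruskal(n + k, sorted(gz))
```

The parallel algorithm performs step 2 via the bipartite graph Gb rather than a start-over computation; the sketch only mirrors the data flow.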
4. Other MST Update Problems. There are several other MST update problems that we consider in this section. These include multiple edge insertion/deletion, changes in the costs of several edges, multiple vertex deletion, and graph insertion. It is well known that the insertion or deletion of an edge can be treated as a cost update problem by assigning infinite cost to nonexistent edges [1]. Thus an edge insertion or edge deletion update can be modeled as a cost decrease or a cost increase update, respectively.

4.1. Edge Cost Updates. This problem deals with computing the updated MST T' after the costs of several edges in the underlying graph are increased or decreased. A more general problem involves a set of k edge cost updates that includes both increases and decreases. An edge occurs only once in the set. The general problem is readily solved by first performing the cost increase update, followed by the cost decrease update. The resulting underlying graph is the same regardless of the order of the updates. From the uniqueness of MSTs for graphs with unique edge costs, the final MST must be the same regardless of the order.
4.1.1. Decrease in Edge Costs. The single edge cost decrease update has been previously considered in [1] and [5]. If the edge (x, y) whose cost has been decreased is in MST T, then T' = T. Otherwise, T' can be obtained by one of the following two approaches. The first approach [5] is to add the edge to T, inducing a cycle. The new MST is obtained by deleting the maximum cost edge from this cycle. This cycle can be determined by computing the least common ancestor of x and y. Using known techniques [20], on a CREW PRAM this approach yields a parallel algorithm with time and processor bounds of O(log n) and n/log n, respectively. However, this approach does not lead to a resource-efficient algorithm for the k-edge cost decrease update. To see this, observe that the number of different cycles that need to be considered grows exponentially with k. It is difficult to select the appropriate set of k cycles that delete all the required edges. However, by generalizing the approach of Chin and Houck [1] we arrive at an algorithm that solves this problem in O(log n · log k) time with nk/(log n · log k) processors. Chin and Houck [1] show that an algorithm for single vertex insertion update can solve the single edge cost decrease problem. Let w be the new cost of the edge (x, y). Let ET be the edges in the existing MST T. If (x, y) is not in ET, we remove (x, y) from G, and replace it with a pair of edges e1 = (x, z) and e2 = (z, y), where z is a new dummy vertex of degree two. We set the cost of e1 to w, and that of e2 to -∞, where -∞ is some cost less than the cost of any edge in G. Observe that the resulting graph is a homeomorphic expansion of the original. Note also that the maximum cost edge on any cycle is the same as the maximum cost edge on the corresponding cycle in the original graph. Computing the MST for the expanded graph is essentially inserting z, with e1 and e2, into T. The required MST is obtained by fusing z and y.
Therefore, the single edge cost reduction update problem is no harder than single vertex insertion, and can be solved within the same bounds. The construction and method above can be extended to handle the k-edge cost decrease problem. Specifically, we replace every nontree edge whose cost has decreased with two new edges and one new dummy vertex. We assign one edge the decreased cost, and the other -∞. Using the multiple vertex insertion algorithm, we find an MST Tk. Fusing the new dummy vertices as above, we obtain T' from Tk. Correctness of this approach is again due to the fact that the modified graph is a homeomorphic expansion of the original, and that the maximum cost edge on every cycle is preserved.
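The homeomorphic-expansion construction can be sketched sequentially. In the sketch below (our names throughout), a start-over Kruskal stands in for the multiple vertex insertion routine; by MST uniqueness it yields the same tree, so only the expand/fuse bookkeeping is being illustrated.

```python
def kruskal(n, edges):
    # start-over MST (Kruskal); edges: list of (cost, u, v)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    mst = []
    for cost, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            mst.append((cost, u, v))
    return mst

def decrease_update(n, edges, tree, updates):
    # edges: {(u, v): cost} with u < v; tree: set of MST edge keys;
    # updates: {(u, v): decreased cost}.  Returns the updated MST edges.
    expanded = []
    fuse = {}          # dummy vertex z -> the endpoint it is fused back into
    z = n
    for (u, v), c in edges.items():
        new_c = updates.get((u, v), c)
        if (u, v) in updates and (u, v) not in tree:
            # homeomorphic expansion: the nontree edge becomes a path
            # u - z - v, with the decreased cost on (u, z) and -inf on (z, v)
            expanded.append((new_c, u, z))
            expanded.append((float('-inf'), z, v))
            fuse[z] = v
            z += 1
        else:
            expanded.append((new_c, u, v))  # tree edges keep their (new) cost
    mst = kruskal(z, expanded)  # stand-in for multiple vertex insertion
    result = []
    for c, a, b in mst:
        a, b = fuse.get(a, a), fuse.get(b, b)
        if a != b:              # drop the contracted -inf dummy edges
            result.append((c, min(a, b), max(a, b)))
    return result
```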
LEMMA 4.1. On a CREW PRAM, we can update the MST for graph G in O(log n · log k) time with nk/(log n · log k) processors if the costs of k edges are decreased.

PROOF. The proof follows from the preceding discussion.
[]
4.1.2. Increase in Edge Costs. Let (x, y) be the edge whose cost has increased. If (x, y) is not in ET, T' = T. Otherwise, remove (x, y) from T, splitting it into two components. We next find the minimum cost edge in G that joins these components, and whose addition forms T'. It has been shown in [19] that if no auxiliary data structures are maintained, O(n^2) operations are required to solve this problem for dense graphs, in the worst case. On a CREW PRAM this requires O(log n) time with n^2 processors [13], reduced to n^2/log n processors in [22]. This result is, in some senses, optimal for dense graphs. However, Frederickson [5] has found a data structure that allows a uniprocessor algorithm to update the MST repeatedly in O(√m) time, after some initial computation to set up the data structure. If auxiliary data structures are allowed, then we are far from the best processor-time product. The parallel approach we have just described can be generalized to solve the k-edge cost increase problem. Removing all the affected edges from T, we form O(k) components which we identify. Then, following the approach in [21], we find the minimum cost edges between each pair of components. Treating each component as a vertex, we find an MST (using the minimum cost edges) of the components and add these minimum edges back to form T'. All this can be done in O(log n + log^2 k) time with n^2/log n processors. A final pass using optimal Euler-tour techniques may be required to orient the edges of T' correctly and to reroot it.
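The three steps just described can be sketched sequentially (a sketch under our own naming: union-find replaces the parallel component labeling, and linear scans replace the parallel minimum computations).

```python
def increase_update(n, edges, tree, increased):
    # edges: {(u, v): cost} with u < v; tree: set of MST edge keys;
    # increased: {(u, v): increased cost}.  Returns the updated MST edges.
    costs = dict(edges)
    costs.update(increased)
    # Step 1: delete the affected tree edges and label the O(k) components.
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    kept = [e for e in tree if e not in increased]
    for u, v in kept:
        parent[find(u)] = find(v)
    # Step 2: minimum cost edge between every pair of components.
    best = {}
    for (u, v), c in costs.items():
        cu, cv = find(u), find(v)
        if cu == cv:
            continue
        key = (min(cu, cv), max(cu, cv))
        if key not in best or c < best[key][0]:
            best[key] = (c, u, v)
    # Step 3: MST of the component graph, over the minimum edges only.
    croot = {r: r for r in set(find(v) for v in range(n))}
    def cfind(x):
        while croot[x] != x:
            croot[x] = croot[croot[x]]
            x = croot[x]
        return x
    chosen = []
    for key in sorted(best, key=lambda e: best[e][0]):
        c, u, v = best[key]
        ru, rv = cfind(find(u)), cfind(find(v))
        if ru != rv:
            croot[ru] = rv
            chosen.append((c, min(u, v), max(u, v)))
    return sorted(chosen + [(costs[e], e[0], e[1]) for e in kept])
```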
4.2. Vertex Deletion Update. This problem involves finding an updated MST after one or more vertices are deleted from the underlying graph. Frederickson's edge-deletion algorithm provides a uniprocessor algorithm that has good performance if the degree of the vertex being deleted is small. Tsin [21] has provided a CREW algorithm that updates the MST after some vertex is deleted. This algorithm has a processor-time product significantly worse than that of Frederickson's uniprocessor algorithm, even for dense graphs, if the degree of the deleted vertex is small. Tsin's approach to the vertex deletion problem is an application of the k-edge cost increase algorithm. Edges incident on the deleted vertex are the edges for the k-edge cost increase update, but only edges that are in T need to be considered. The time and processor requirements then depend on degT(v), the degree in T of the vertex v that is being deleted. The time required is O(log n + log^2 degT(v)) with n^2/log n processors. In the worst case these are no better than completely reconstructing the MST [13], [21]. It is easy to generalize Tsin's algorithm to handle the k-vertex deletion problem. All edges of T that are incident on vertices in the delete-set D are the edges for the k-edge cost increase update. The number of connected components of T when the edges are removed is bounded by μ = min(1 + Σ(d in D) degT(d), n). Following the analysis of the previous subsection, the bounds for the k-vertex deletion update are O(log n + log^2 μ) time with n^2/log n processors.

4.3. Graph Insertion Update. We finally arrive at an interesting result of this section, which discusses an update of T when a weighted graph Gk = (Vk, Ek) of k vertices is inserted into G. Vertices of Gk also have edges incident on vertices of G, and there are at most nk such edges.
Graph insertion can be used to solve the communication network merging problem, whereby a smaller communication network (modeled as a graph) is linked with a larger communication network. The
larger network has had an MST computed to support broadcasts through the network efficiently, and now we need an MST for the new network. In this section we present two variations of an algorithm that solves this problem. Our algorithms are no more efficient than reconstruction unless the graph Gk being inserted is significantly smaller than G. Thus we assume Gk is smaller than G. One variation of our algorithm is more efficient if Gk is sparse; the other is more efficient if Gk is dense. In our approach, first the vertices in Vk are all inserted independently, followed by insertion of the edges in Ek. Since we have efficient algorithms for the multiple vertex insertion update, and for the k-edge insertion update, we have efficiently solved the graph insertion problem if |Ek| is sufficiently small. Otherwise, we can reduce the number of edges by first finding an MST Tk of Gk. Only edges in Tk need to be inserted, and there are fewer than k of these.

THEOREM 4.1. On a CREW PRAM, updating T after insertion of k adjacent vertices can be done in O(log(n + k) · log k) time with (nk + k^2)/(log(n + k) · log k) processors.

PROOF. First, we find an MST Tk for Gk. This requires O(log^2 k) time with k^2/log^2 k processors [2]. Next, we insert the members of Vk independently. From Theorem 3.1, this step requires O(log n · log k) time and nk/(log n · log k) processors. Finally, we insert the edges from Tk into the resulting (n + k)-vertex MST. There are at most k - 1 such edges, so by Lemma 4.1, this step can be done in O(log(n + k) · log k) time with (nk + k^2)/(log(n + k) · log k) processors. The time as well as processor requirements of the third step are dominant and hence determine the bounds claimed. Observe that the above argument is obvious in the case of the time bound. For the processor bound, note that the bounds of the second and third step share a common factor of k/log k. After removing this factor, the bounds are respectively n/log n and (n + k)/log(n + k).
Clearly, (n + k)/log(n + k) ≥ n/log n, since x/log x is an increasing function of x, for x > 1. For correctness, clearly the process of inserting the vertices as if they were nonadjacent, and then adding the edges between them, yields the same underlying graph G' as inserting all of Gk. By the uniqueness property, the new MST thus obtained is the correct one. We must now argue that it is correct to add only the edges of Tk. Any non-MST edge from Gk has maximum cost on some cycle in Gk. Then it has maximum cost on the corresponding cycle in G', and hence it is not in the MST for G'. []

COROLLARY. If Gk is sparse and |Ek| = q, the same problem can be solved in time O(log n · log k + log(n + q) · log q) with
max{nk/(log n · log k), (n + q)q/(log(n + q) · log q)}

processors.
PROOF. We omit the step of finding Tk and instead insert all edges. Note that if Gk is very sparse (maybe even fragmented), then q could be less than k, preventing further simplification of the bounds. []

Note that the time and processor bounds degenerate to those of the MST construction algorithm in [2] if k is Θ(n). Note also that our algorithm is optimal if Gk is dense and has Θ(k^2) edges, and the number of edges between Gk and G is Θ(nk). In this case there are Θ(nk + k^2) edges that must be examined, and the processor-time product of the above algorithm is also O(nk + k^2). We also note that the MST construction problem for a p-vertex graph is essentially a graph insertion update (independent of the techniques used to solve the graph insertion update), where we insert the given graph into an empty graph (n = 0 and k = p). Therefore the graph insertion update is at least as hard as MST construction. Thus any lower bound on the time (parallel or sequential) for MST construction is a lower bound on the time required for graph insertion update.

5. Conclusions. Incremental graph algorithms deal with recomputing properties of a graph after an incremental change has been made to it. In this paper we have described efficient parallel algorithms that update an MST when multiple changes are made to the underlying graph. The complexities of the algorithms discussed are summarized in Table 5.1. The major contribution of this paper is the formulation of multiple update problems, and establishing their importance in the context of parallel computation. Another major contribution is a novel transformation of the graph under consideration to a bipartite graph, which allowed us to obtain optimal bounds for several of the problems. We have argued, in some sense, that if rapid changes are being made to the graph, then it pays to batch these changes and consider one batch at a time.
Such an approach is correct because the order in which updates of a batch are performed does not affect the final MST.

Acknowledgments. We thank the anonymous referees for their valuable comments. In particular, the proof of Lemma 3.4 has been considerably improved due to suggestions by one of them.

Table 5.1. Summary of MST update problems on CREW PRAMs.

Problem | Time | Processors
Single vertex insertion | O(log n) | n/log n
k-Independent vertex insertion | O(log n · log k) | nk/(log n · log k)
k-Vertex graph insertion | O(log(n + k) · log k) | (nk + k^2)/(log(n + k) · log k)
Single edge cost decrease/insertion | O(log n) | n/log n
k-Edge cost decrease/insertion | O(log n · log k) | nk/(log n · log k)
Single edge cost increase/deletion | O(log n) | n^2/log n
k-Edge cost increase/deletion | O(log n + log^2 k) | n^2/log n
Single vertex deletion | O(log n + log^2 degT(v)) | n^2/log n
k-Vertex deletion | O(log n + log^2 μ) | n^2/log n
Appendix. In this appendix we first discuss a parallel algorithm due to Jung and Mehlhorn for single vertex insertion update. Next we describe our parallel algorithms for some tree problems. Then we discuss how to determine the least cost copy of an edge transformed to Gb. Finally we present a start-over algorithm for constructing the MST for a bipartite graph. In the next section we transform a tree of arbitrary degree to an equivalent binary tree. Such a binary tree has the property that there is a 1-1 mapping from old vertices to new, and the value that should be computed for an old vertex is simply the value computed for the corresponding new vertex. In addition, if there were n vertices in the original tree, there are O(n) vertices in the binarized tree. This is described in [3].
A.1. Single Vertex Insertion MST Update. In [7] Jung and Mehlhorn solve the single vertex insertion problem by evaluating an expression associated with the original MST, where an operator is assigned to every vertex of the tree. While the MST corresponds to the entire expression, the subtree rooted at every vertex corresponds to a subexpression. The operators of these expressions involve maximum and minimum (by cost comparisons) over a set of edges. They show that if the value of every subexpression is known, the updated MST can be found in O(log n) time with n/log n processors. Suitable optimal techniques for evaluating these expressions,⁴ on the CREW model, have been presented in [3] and [6]. These techniques require that the arity of operators (number of arguments) is bounded by a constant. Thus their first step is to change the MST T to a bounded degree tree. For simplicity, T is changed to an equivalent binary tree T1, where all dummy edges have cost -∞. We now proceed to describe the idea of Jung and Mehlhorn. Let T' denote the new MST that we wish to compute. Let v be some vertex of G, and let z be the new vertex that is being inserted. A descending z-path from vertex v to z starts at v and follows (zero or more) tree edges toward the leaves. Its last edge is from one of the descendants of v (to z). The minimum descending z-path (mdzp) from v is the descending z-path whose costliest edge has the least cost among the costliest edges, one for every descending z-path from v. Corresponding to T1, an expression tree can be constructed whose evaluation at every vertex yields the costliest edge on the mdzp from that vertex. After computing the costliest edge on the mdzp from each nondummy vertex in T1, the edges of T' are determined as follows: Mark as candidates all nondummy edges of T1 and all edges incident on z. Next, for each vertex v with k children, unmark k edges.
The edges we unmark are chosen from the maximum cost edges on k + 1 descending z-paths pi from v to z, where p0 is (v, z), and pi (i ≥ 1) is from v to its ith child ci, and thereafter it is the mdzp from ci. All of these edges except the costliest edge on the mdzp from v are unmarked. Since the costliest edge (say ei) on the mdzp from ci has been computed, the maximum cost edge on pi is thus the costlier of (v, ci) and ei. Each edge is unmarked at most once, so there are no write conflicts when various processors are unmarking edges simultaneously. The marked edges are those of the final MST. Using the Euler-tour [20] technique for correctly orienting edges, we obtain a directed tree. The actual functions chosen to compute the costliest edge on the mdzp from each vertex v are:

1. fv = (v, z) if v is a leaf.
2. fv(x) = min-cost-edge{(v, z), max-cost-edge{x, (v, v1)}} if v has one child v1.
3. fv(x, y) = min-cost-edge{(v, z), max-cost-edge{x, (v, v1)}, max-cost-edge{y, (v, v2)}} if v has two children v1 and v2.

These functions become the operators of the expression tree that is evaluated at every vertex. Note that the edges that will eventually be unmarked are the edges examined but not selected by the min-cost-edge operator in step 2 or 3. Finally, Jung and Mehlhorn note that the expressions over these three operators are suitable for evaluation by techniques such as those cited above. This leads to the following result.

THEOREM A.1. On a CREW PRAM, an MST can be updated in O(n/p) time with p ≤ n/log n processors if a new vertex is added to the underlying graph.

⁴ While these methods do not discuss the evaluation of subexpressions, it is not difficult to extend them to compute the value of the subexpression at every vertex. For instance, to use the method of [3], after normal processing the schedule can be run backward in a manner analogous to the expansion phase in [11]. To use the technique in [6], the key modification involves computing the prefix composition [16] of the chain vertices removed immediately after SuperRake. After all other processing is complete, these composed functions are applied to their arguments.

A.2. Related Tree Problems. In this section we discuss algorithms for several n-vertex tree (or forest) problems needed for the multiple vertex insertion update. The problems discussed in this section include component identification in a forest, and computing the maximum cost edge on the path from each vertex to the root of its component.
LEMMA A.1. On a CREW PRAM, for an n-vertex tree T, the maximum cost edge on the path from each vertex to the root r can be found in O(n/p) time with p ≤ n/log n processors.

PROOF. (Sketch) The general strategy follows the approach used by Gazit et al. [6] to solve the expression evaluation problem optimally. We show time and processor bounds of O(log n) and n/log n, respectively, and note that the more general bounds in the statement of the lemma follow by processor simulation. To solve this problem optimally, the tree is first reduced from n to n/log n vertices. The reduced problem is solved using a simple pointer-jumping algorithm that uses a processor per vertex. Finally, the original problem is solved, based on the solution to the reduced problem. For more details, see [8].
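The pointer-jumping step on the reduced tree can be simulated sequentially. In the sketch below (our names; the weight of each tree edge stands for the edge itself), every round is computed synchronously from the previous round's values, as on a PRAM, and after ⌈log n⌉ rounds m(v) holds the maximum cost on the path from v to the root.

```python
import math

def max_edge_to_root(parent, edge_cost):
    # parent[v]: parent of v in the tree (the root r has parent[r] == r)
    # edge_cost[v]: cost of the edge (v, parent[v]); edge_cost[root] is -inf
    n = len(parent)
    F, m = parent[:], edge_cost[:]
    for _ in range(max(1, math.ceil(math.log2(n)))):
        # all new values are computed from the old ones (synchronous step),
        # then the pointers are doubled: F(v) := F(F(v))
        m, F = ([max(m[v], m[F[v]]) for v in range(n)],
                [F[F[v]] for v in range(n)])
    return m  # m[v]: maximum edge cost on the path from v to the root
```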
The tree size is reduced in two substages. First the number of leaves is reduced. This involves computing the number of descendants of every vertex, using optimal Euler-tour techniques [3], [20]. Next we determine all maximal subtrees whose sizes are less than log n. Gazit discusses the assignment of subtrees to processors, such that each of the n/log n processors has O(log n) vertices in all its subtrees. Now, for every vertex in its subtrees, a processor computes the maximum cost edge on the path from that vertex to the root of its subtree. The processor simply applies the obvious sequential (linear-time) algorithm to its subtrees. Each processor thus requires O(log n) time. After each processor has completed its task, it deletes its subtrees from T. Let T' be the resultant tree, which has O(n/log n) leaves. However, it may still have Θ(n) vertices, due to chains [11] of vertices that have only one child. The second substage condenses every chain of T' to a single vertex. It is not difficult to show that T'', the resultant tree, has O(n/log n) vertices. The parent of a chain is a vertex that is not a chain vertex, but has a chain vertex as a child. We compute the maximum cost edge on the path from every chain vertex to the parent of its chain, as follows. First identify all chain vertices and the extreme ends of each chain. Number all chain vertices in the tree, in a manner consistent with their preorder numbering in T'. This can be accomplished optimally by listing all vertices of T' in preorder [3], [20], assigning a value of one to chain vertices and zero to others, and finding the prefix sum [3] along this list. Observe that the largest number in each chain belongs to the chain vertex farthest from the root, and that all vertices in a chain are numbered sequentially from top (closest to root) to bottom. Finding the maximum cost edge on the path from every chain vertex to the parent of its chain is an example of a parallel prefix computation on disjoint lists.
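The chain-numbering step just described can be sketched as follows (a sequential stand-in under our own naming: the running sum plays the role of the prefix sum, which a PRAM computes in O(log n) time [3]).

```python
def number_chain_vertices(preorder, is_chain):
    # preorder: the vertices of T' listed in preorder
    # is_chain[v]: True iff v is a chain vertex (exactly one child)
    # A chain vertex contributes 1 and all others 0; the prefix sum then
    # numbers the chain vertices consecutively from top to bottom in each
    # chain, consistent with their preorder numbering.
    number, running = {}, 0
    for v in preorder:
        if is_chain[v]:
            running += 1
            number[v] = running
    return number
```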
A deterministic algorithm for this can easily be derived with minor modifications to the optimal list-ranking algorithm in [4] for disjoint lists [3]. This requires O(log n) time and n/log n processors. Now we replace each chain by a chain of length one. The chain vertex farthest from the root (the one with the largest number) has had the maximum cost edge on that chain computed for it. Then, in the replacement chain, only this edge is kept. T'' is the resultant tree, which has O(n/log n) vertices [6]. Thus, the first stage is completed. The second stage computes the maximum cost edge on the path from each vertex in T'' to the root. Assign a processor to each vertex v in T''. Initially, m(v) stores the edge from v to F(v), the parent of v in T''. (If r is the root, F(r) = r and m(r) has cost -∞.) Each processor performs the following computation log n times:
m(v) := max-cost-edge{m(v), m(F(v))}
F(v) := F(F(v))   /* pointer jumping */

It can be seen that m(v) contains the desired information after this computation has been performed. The final stage computes the maximum cost edge on the path from v to r, for each vertex v removed in the first stage. First, process the chain vertices that were removed. We know the maximum cost edge on the path from vertex v to p, the parent of v's chain (from the first stage). We also know the maximum cost edge
on the path from p to r (from the second stage). Thus, the maximum cost edge from v to r is easily determined as the larger of them. Second, process the subtree vertices removed in the first substage of stage one. Suppose that v is such a vertex, and p is the parent of the (removed) subtree containing v. We know the maximum cost edge on the path from v to p (computed by the sequential algorithm in the first stage), and we know the maximum cost edge on the path from p to r. Thus, the larger of these is the maximum cost edge on the path from v to r. Since the third stage requires constant time with n processors, the overall time bound is O(log n) with n/log n processors. []

LEMMA A.2. On a CREW PRAM, for a forest of n vertices, the connected components can be determined in O(n/p) time with p ≤ n/log n processors.

PROOF. We prove this by reducing the problem to that of Lemma A.1. Computing connected components involves labeling a vertex by the root of its tree. (Two vertices are in the same component iff they have the same label.) First, make a single tree from the forest. It is easy to identify and number the roots of the trees in the forest. Each of the former roots becomes a child of a new super-root that we create. This is done (without write conflicts) by making the root of the first tree the first child of the super-root. Likewise, the second tree becomes the second child, and so forth. The cost of each new edge incident on the super-root is +∞. Now perform the maximum-cost-edge computation of Lemma A.1. Observe that if a vertex v is in a tree with root r, then the edge computed for v is (r, super-root). []

A.3. Computing the Least Cost Copy of an Edge. The transformation of Section 3 may create several copies of an edge of Gz. To compute the minimum cost copy of such an edge of Gz, we might consider sorting the O(nk) edges of Gb according to ≥b. This would group the various copies of any edge together, in sorted order.
Thus, we could trivially find the least cost copy of every edge. Unfortunately, we have no CREW sorting algorithm to sort nk objects, whose keys do not fall into a very limited range, in O(log n · log k) time with nk/(log n · log k) processors. Thus, this approach is not useful. Instead, we describe a two-stage algorithm to compute the minimum cost copy of each such edge. This algorithm also determines if a given edge is not transformed to G_b.

The first stage of the algorithm considers only edges of G_b that are incident on some new vertex z_i. These edges may contain multiple copies of one or more edges of G_z that are transformed to G_b. For every such edge of G_z, its minimum cost copy (incident on z_i) is determined. This computation is conducted in parallel for all new vertices, computing at most k copies of each edge. The second stage selects the overall minimum cost copy.

For the first stage we consider the computation for some new vertex z_i. Recall that, in Section 3.2, we copied the components of T', and made each component a subtree of a tree rooted at z_i. Let T_i be this entire tree. Also, recall that each e-vertex v in T_i contributed an edge (z_i, v) to G_b, which is a copy of the maximum cost edge on the i-path from z_i to v. Let m(v) store the maximum cost edge on the
i-path from v to z_i. The key observation here is that the vertices in T_i with a given value of m(·) form a connected subgraph of T_i. This can be considered a subtree of T_i that may not extend to the leaves. Let (u, w) be the value of m(v) for every vertex v in one such subtree T'_i, and let u be the vertex closer to z_i. Note that w is the root of T'_i. We exploit the subtree structure to find the smallest e-vertex v' in T'_i. Then the edge (z_i, v') is the minimum cost copy of (u, w) that is incident on z_i. Computation of v' (for all such subtrees of T_i) can be modeled as expression evaluation on disjoint trees [3], [6]. All vertices, except e-vertices, are assigned a value of +∞ for this computation. Thus, v' will always be an e-vertex (if one exists). If an e-vertex is not present in T'_i, then +∞ will be computed for T'_i. This computation requires O(log n · log k) time with n/(log n · log k) processors for every new vertex. Since it is done in parallel for each new vertex, the time and processor requirements are O(log n · log k) with nk/(log n · log k) processors.

For the second stage we consider the edges of G_z that are incident on new vertices separately from the rest. If (z_i, v_j) is an edge of G_z, it has at least one copy in G_b. (Note that v_j is an e-vertex.) Further, all copies of this edge are incident on z_i. Thus, the minimum cost copy is easily determined by examining the value computed for T_i in the first stage. On the other hand, suppose that e is an edge of G_z that is not incident on any new vertex. There are at most n - 1 such edges. Let M be a matrix with a row for each such edge, and with a column for each new vertex. Entries of M store edges of G_b, and are initialized to an edge whose cost is +∞. Copy the values computed in the first stage into M. This copying process is described for z_i, and is identical for other new vertices.
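Taken together, the two stages amount to the selection sketched below (a sequential sketch with hypothetical names; the paper realizes stage one with parallel tree expression evaluation and stage two with the matrix M):

```python
# Sequential sketch of the two-stage selection in Section A.3:
# stage one keeps, for each new vertex z_i, the cheapest copy of every
# transformed edge incident on z_i; stage two takes the minimum across
# all new vertices (the row minimum of the matrix M).

INF = float("inf")

def min_cost_copies(copies_per_new_vertex):
    """copies_per_new_vertex[i]: list of (edge, cost) pairs, the copies
    of edges of G_z generated for new vertex z_i. Returns, for each
    edge, the cost of its overall minimum cost copy; edges absent from
    the result have no copy in G_b."""
    # Stage one: cheapest copy incident on each z_i, per new vertex.
    stage_one = []
    for copies in copies_per_new_vertex:
        best = {}
        for edge, cost in copies:
            if cost < best.get(edge, INF):
                best[edge] = cost
        stage_one.append(best)
    # Stage two: minimum over the k per-vertex results (row minimum).
    overall = {}
    for best in stage_one:
        for edge, cost in best.items():
            if cost < overall.get(edge, INF):
                overall[edge] = cost
    return overall
```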
For every different edge e of G_z that has a copy in G_b with z_i as an endpoint, store the minimum cost such copy into M[e, i]. For every row of M, we find the minimum cost copy of that edge by comparing the entries in that row. We do this in parallel for every row. Note that if some row has only +∞ entries, that edge has no copies in G_b; this row will have +∞ computed for it. By applying Lemma 2 from [2], we find the minimum cost copy of every edge (that is not incident on a new vertex) in O(log n · log k) time with nk/(log n · log k) processors.

A.4. MST of a Bipartite Graph. In the solution of the multiple vertex insertion update, one step involves constructing the MST of a bipartite graph with partitions of size n and k. Algorithms that compute the MST of an arbitrary graph with O(nk) edges and n + k vertices have time and processor requirements that are too high. Instead, we are able to exploit the bipartite nature of our graph to achieve better resource bounds. The following lemma describes the required result.
LEMMA A.3. Let G = (V_1, V_2, E) be an undirected connected bipartite graph, where V_1 and V_2 are the two partitions of the set of vertices. Let |V_1| = n and |V_2| = k. On a CREW PRAM, an MST for G can be constructed in O(log n · log k) time using O(nk/(log n · log k)) processors.

PROOF. This proof assumes an understanding of the algorithm in [2]. In each iteration, that algorithm first forms supervertices and then the adjacency matrix is contracted. After every iteration the number of vertices that need to be processed
is reduced by a factor of two. Without loss of generality, suppose n > k. The first iteration is modified to process bipartite graphs. After that the resulting graph has k vertices, and then we need k²/log² k processors and O(log² k) time to complete the computation. The processor and time requirements stated in the lemma are necessary just for the first iteration. In the first iteration a minimum cost edge incident on each vertex in the larger partition, V_1, is determined. It is easy to see that this can be done in O(log k) time using nk/log k processors. These bounds can be further improved to O(log n · log k) time and nk/(log n · log k) processors (by processor simulation). Thus every vertex in V_1 is paired with some vertex of V_2, resulting in k trees. Each tree is in fact a supervertex, and is treated as a single vertex after contraction of the adjacency matrix. It is easy to see that the work involved in contracting the adjacency matrix is proportional to (nk + k²). This can be performed within the required time and processor bounds. Its details are straightforward and are omitted. []

Note Added in Proof. Since this paper was submitted, two other groups have examined the multiple vertex insertion update problem. Johnson and Metaxas also base their work on earlier work by Pawagi [12], arriving at the same time and processor complexities as this paper. Schäffer and Varman use a different approach, arriving at a slightly faster algorithm that is only optimal when k = O(log n). References to their research reports follow.

D. B. Johnson and P. Metaxas,
Optimal Parallel and Sequential Algorithms for the Vertex Updating Problem of a Minimum Spanning Tree, Technical Report PCS-TR91-159, Dartmouth College, Hanover, NH, 1991.
A. Schäffer and P. Varman, Parallel Batch Update of Minimum Spanning Trees, Technical Report COMP-TR90-140, Rice University, Houston, TX, November 1990.
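Returning to Lemma A.3, its first, bipartite iteration can be sketched as follows (a sequential sketch; the adjacency-matrix representation and names are assumptions, and the PRAM algorithm computes all row minima in parallel):

```python
# Sketch of the first iteration in Lemma A.3: every vertex of the larger
# partition V_1 picks its minimum cost incident edge, pairing it with a
# vertex of V_2; the at most k resulting trees are the supervertices
# contracted in the later iterations.

INF = float("inf")

def first_iteration(cost):
    """cost[v][u]: cost of edge (v, u) for v in V_1, u in V_2 (INF if
    the edge is absent). Returns match[v], the V_2-endpoint of the
    minimum cost edge chosen for each v in V_1."""
    match = []
    for row in cost:                                   # one row per vertex of V_1
        best = min(range(len(row)), key=lambda u: row[u])
        match.append(best)
    return match
```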
References
[1] F. Chin and D. Houck, Algorithms for Updating Minimum Spanning Trees, J. Comput. System Sci., 16 (1978), 333-344.
[2] F. Chin, J. Lam, and I. Chen, Efficient Parallel Algorithms for Some Graph Problems, Comm. ACM, 25 (1982), 170-175.
[3] R. Cole and U. Vishkin, The Accelerated Centroid Decomposition Technique for Optimal Parallel Tree Evaluation in Logarithmic Time, Algorithmica, 3 (1988), 329-346.
[4] R. Cole and U. Vishkin, Approximate Parallel Scheduling, Part I, SIAM J. Comput., 17 (1988), 128-142.
[5] G. Frederickson, Data Structures for On-line Updating of Minimum Spanning Trees, SIAM J. Comput., 14 (1985), 781-798.
[6] H. Gazit, G. L. Miller, and S. H. Teng, Optimal Tree Contraction in the EREW Model, Extended abstract, 1986.
[7] H. Jung and K. Mehlhorn, Parallel Algorithms for Computing Maximum Independent Set in Trees and for Updating Minimum Spanning Trees, Technical Report 01, Universität des Saarlandes, Saarbrücken, 1987.
[8] O. Kaser and S. Pawagi, Dynamic Tree Contraction and Its Application to Optimal Parallel Updates of Minimum Spanning Trees, Technical Report 25, Department of Computer Science, State University of New York, Stony Brook, 1990.
[9] J. B. Kruskal, On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem, Proc. Amer. Math. Soc., 7 (1956), 48-50.
[10] C. P. Kruskal, L. Rudolph, and M. Snir, Efficient Parallel Algorithms for Graph Problems, Algorithmica, 5 (1990), 43-64.
[11] G. L. Miller and J. H. Reif, Parallel Tree Contraction and Its Applications, Proc. Twenty-Fifth Annual Symposium on Foundations of Computer Science, 1985, pp. 478-489.
[12] S. Pawagi, A Parallel Algorithm for Multiple Updates of Minimum Spanning Trees, Proc. Eighteenth International Conference on Parallel Processing, 1989, pp. III.9-15.
[13] S. Pawagi and I. V. Ramakrishnan, Parallel Update of Graph Properties in Logarithmic Time, Proc. Fourteenth International Conference on Parallel Processing, 1985, pp. 186-193.
[14] S. Pawagi and I. V. Ramakrishnan, An O(log n) Algorithm for Parallel Update of Minimum Spanning Trees, Inform. Process. Lett., 22 (1986), 223-229.
[15] R. Prim, Shortest Connection Networks and Some Generalizations, Bell System Tech. J., 36 (1957), 1389-1401.
[16] S. Rajasekaran and J. H. Reif, Optimal and Sublogarithmic Time Randomized Parallel Sorting Algorithms, SIAM J. Comput., 18 (1989), 594-607.
[17] C. Savage and J. Ja'Ja', Fast Efficient Parallel Algorithms for Some Graph Problems, SIAM J. Comput., 10 (1981), 682-691.
[18] Y. Shiloach and U. Vishkin, An O(log n) Parallel Connectivity Algorithm, J. Algorithms, 3 (1982), 57-67.
[19] P. Spira and A. Pan, On Finding and Updating Spanning Trees and Shortest Paths, SIAM J. Comput., 4 (1975), 375-380.
[20] R. E. Tarjan and U. Vishkin, Finding Biconnected Components and Computing Tree Functions in Logarithmic Parallel Time, SIAM J. Comput., 14 (1985), 862-874.
[21] Y. H. Tsin, On Handling Vertex Deletion in Updating Minimum Spanning Trees, Inform. Process. Lett., 27 (1988), 167-168.
[22] Y. H. Tsin, Private communication, 1990.
[23] P. Varman and K. Doshi, A Parallel Vertex Insertion Algorithm for Minimum Spanning Trees, Theoret. Comput. Sci., 58 (1988), 379-397.