Annals of Operations Research 123, 265–284, 2003 2003 Kluwer Academic Publishers. Manufactured in The Netherlands.
Optimal Direct and Indirect Covering Trees JUSTIN C. WILLIAMS
[email protected]
Johns Hopkins University, Department of Geography and Environmental Engineering, 3400 North Charles Street, Baltimore, MD 21218, USA
Abstract. The maximal covering subtree problem has applications in transportation network design and extensive facility location. A subtree of a network is a tree that is not a full spanning tree. Finding an optimal subtree may involve the two objectives, minimizing the total arc cost or distance of the subtree, and maximizing the subtree’s coverage of population or demand at nodes. Coverage may be defined as direct or indirect: indirect coverage requires that a node be within a distance threshold s > 0 of the subtree, and direct coverage requires that a node be connected to the subtree (s = 0). Previous approaches to this problem have sought to identify optimal subtrees of a “parent” network that is itself a tree (e.g., a minimum spanning tree). In this paper four new bi-objective zero–one programming models are presented. The first two are models for the problem of finding optimal subtrees on a single spanning tree parent under conditions of (1) direct and (2) indirect coverage. These problems have been addressed previously in the literature. In the third and fourth models, the subtree can be selected from among multiple candidate parent spanning trees simultaneously. The latter models address a new generalization of the first problem and offer both increased flexibility and the potential for a more diverse array of solutions. The models have “integerfriendly” solution properties and are relatively small in terms of the number of decision variables and constraints. Computational experience is reported for a demonstration problem and results are compared to the results of previous models. Keywords: location, covering tree, transportation network, extensive facility
1.
Introduction
Many problems in transportation network design (Magnanti and Wong, 1984) and extensive facility location (Mesa and Boffey, 1996) involve identifying an optimal subnetwork within a larger “backbone” or “parent” network. Frequently the criteria for optimality are the total cost or distance of arcs in the subnetwork, which is to be minimized, and the total demand or population of nodes covered by the subnetwork, which is to be maximized (Current and Min, 1986). The particular type of subnetwork addressed in this paper is a tree (or subtree). A tree has the property of connecting a set of n nodes with n − 1 arcs, the minimum number of arcs needed to provide a connection between every pair of nodes. Trees contain no cycles and provide a unique path between every node pair. Hence, efficient tree structures are often sought when controlling the construction cost of a subnetwork is paramount. Magnanti and Wolsey (1995) review optimal tree problems in applied combinatorial optimization. Several types of related subtree problems have been addressed in the recent network design literature. Two principal types are the maximum direct covering subtree
266
WILLIAMS
problem (Hutson and ReVelle, 1989; Church and Current, 1993) and the maximum indirect covering subtree problem (Hutson and ReVelle, 1993; Church and Current, 1993). In the first, the task is to identify an optimal subtree under the objectives of minimizing the total cost of selected arcs and maximizing the total population that is connected or covered directly. Direct coverage of a node requires the node to be part of the subtree. In the second problem the objectives are the same, but a node is considered covered if it is either part of the subtree (direct coverage) or within a specified distance s of a node of the subtree, where s > 0 (indirect coverage). As Church and Current (1993) point out, direct coverage is a special case of indirect coverage, with the distance threshold s set equal to zero. A third problem has also been studied, the minimum-cost indirect covering subtree problem. Here, the task is to construct a subtree that covers every node of the parent network within a distance s (> 0) of the subtree (Kincaid, Lowe and Morgan, 1988; Kim et al., 1990). Finally, a fourth problem might be added: the minimum direct covering subtree. But this is, of course, the classic minimum spanning tree problem for which the well-known polynomial time algorithms of Kruskal (1956) and Prim (1957) have been developed. Hutson and ReVelle (1989, 1993) also consider related problems such as enlarging an existing subtree network by adding new arcs under direct and indirect coverage requirements. While these problems have been couched in terms of finding an optimal subtree within a general parent network, in the research to date the parent network or (network backbone) has been defined as a spanning tree, that is, a tree that connects all nodes. Requiring that the parent network be a spanning tree instead of a more general network has computational advantages in that the possibility of obtaining a cyclic subnetwork is precluded. However, this requirement is also disadvantageous in that the set of eligible arcs becomes highly restricted. If the requirement is relaxed, the values of the objectives can potentially be improved because the subtree can be selected from an arc set larger than that provided by a single spanning tree. Methods for finding exact solutions to optimal covering subtree problems include specialized algorithms (Kincaid, Lowe and Morgan, 1988; Kim et al., 1990; Church and Current, 1993) and integer programming (Hutson and ReVelle, 1989, 1993; Church and Current, 1993). Hutson and ReVelle (1989, 1993) formulated bi-objective zero–one programs that are used to derive the non-inferior tradeoffs between cost and coverage. These authors used linear programming with a branch and bound routine to find optimal (or non-inferior) integer solutions; some branching and bounding was needed to solve a 35-node demonstration problem. Church and Current (1993), however, report that their bi-criterion zero–one programming model required no branching and bounding when applied to the same problem. The mathematical programming approach is advantageous in that “off-the-shelf” optimization software can be used to find exact solutions, but has the drawback of potentially long computation times for the branching and bounding process. In contrast, special algorithms may be more computationally efficient but require custom coding and may not be amenable to additional constraints such as budget constraints (Church and Current, 1993).
OPTIMAL DIRECT AND INDIRECT COVERING TREES
267
The common limitation of these models to solving subtree problems is that they can be directly employed only when the parent network is a single spanning tree. In this paper, new bi-objective zero–one programs are developed for both the direct and indirect covering subtree problems described above. The restriction that the parent network must be a single spanning tree is relaxed, and the subtree is allowed to be selected from among multiple candidate parent spanning trees simultaneously. This approach provides greater flexibility and the possibility for improved solutions relative to a single spanning tree, as well as a potentially broader range of alternatives. Four new models are presented. The first two models (M1 and M2) address, respectively, direct and indirect coverage on a single spanning tree parent. These models are new formulations for the original problems posed by Hutson and ReVelle (1989, 1993). The third model (M3) extends M1 for application to direct coverage, given multiple candidate parent spanning trees. Similarly, the fourth model (M4) extends M2 for application to indirect coverage with multiple parents. All four models are demonstrated using the 35-node network of Hutson and ReVelle (1989, 1993). Models M1 and M3 are also demonstrated on a 300-node network. Computational results are discussed.
2.
Maximal covering subtrees given a single spanning tree parent
In this section new models are developed for the maximum direct and indirect covering subtree problems using a single spanning tree as the parent network, as posed by Hutson and ReVelle. As shown, the tree structure of the parent network can be exploited to derive relatively compact and “integer-friendly” (ReVelle, 1993) formulations. Consider a network G with node set N (i = 1, . . . , n). Also, consider a spanning tree t of G with node set N and arc set A (figure 1(a)). Node n is designated as the “terminal” node of the spanning tree. The choice of n as the spanning tree’s terminus is arbitrary; any of the nodes may be used. Each arc (i, j ) in A is specified as a directed arc and is directed toward (points to) the spanning tree’s terminus. That is, (i, j ) is directed from node i to node j , where j is closer than i to n along the spanning tree. We note that for any given terminus n, the direction of each arc in the spanning tree is uniquely determined. Directed arcs are needed to formulate the models, but neither limit the generality of the models nor have particular meaning in the solutions. Consequently, the following observation is made, which is central for developing the models. In any subtree of a spanning tree t, there exists a unique node m that can be specified as the subtree’s terminus (figure 1(b)). In particular, the subtree’s terminus is that node m of the subtree which is nearest to the spanning tree’s terminus (node n) as measured along the spanning tree (m and n will be the same node if n is in the subtree). A key aspect of the models is that they ensure subtree connectivity by incorporating constraints that force a sequence of adjacent arcs into the subtree until a subtree terminus is reached.
268
WILLIAMS
Figure 1. Sample spanning trees. (a) Single spanning tree with directed arcs which are directed toward the spanning tree’s terminus (node #7). (b) Spanning tree with a subtree (bold arcs); black nodes are covered directly and the subtree terminus is node #4. (c) Two spanning trees for the same set of nodes: (i) black arcs and (ii) dashed arcs. Both spanning trees have the same terminus (7). Arc (6, 7) is common to both trees, while arcs (3, 4) and (4, 3) are treated as separate arcs.
2.1. Notation for models M1, M2, M3, and M4 Indices and sets i, j , k, N indices and set of nodes (i, j , k = 1, . . . , n); the set of eligible terminal nodes for the subtree. N includes all the nodes in N N except “leaf” nodes of the parent spanning tree, that is, nodes incident to only one arc; however, the spanning tree’s terminus (node n) is eligible even if it is a leaf node; the set of nodes i that are within distance standard s of node k (Sk contains k); Sk (i, j ), A index and set of candidate directed arcs in the parent spanning tree; arc (i, j ) is directed from node i to node j ; the set of spanning tree arcs (i, j ) that are incident to and directed into Pj node j . Parameters cij the cost or length of arc (i, j ); ai the demand or population of node i if i is covered directly; ai the demand or population of node i if i is covered indirectly (ai ai ). Decision variables xij = 1, if directed arc (i, j ) and node i are selected for the subtree, and = 0, otherwise; yj = 1, if node j is selected as the terminus of the subtree, and = 0, otherwise; uk = 1, if node k is covered either directly or indirectly by the subtree, and = 0, otherwise.
OPTIMAL DIRECT AND INDIRECT COVERING TREES
269
2.2. Maximum direct covering tree model – single spanning tree parent (M1) Formulation M1
Minimize Z1 =
maximize Z2 =
ai xij +
xij − xj k − yj 0 yj = 1, j ∈N
yj −
aj yj ,
(1b)
for all arcs (i, j) ∈ A,
(1c)
j ∈N
(i, j )∈A
subject to
(1a)
cij xij ,
(i, j )∈A
xij 0
(1d) for all nodes j ∈ N ,
(1e)
for all decision variables.
(1f)
(i, j )∈Pj
xij , yj ∈ {0, 1}
Description of formulation M1 Objective (1a) is to minimize the total cost of the subtree, which is the arc cost summed over all selected arcs. Objective (1b) is to maximize the total node population covered directly by the subtree. Note that if arc (i, j ) is selected, then the population of node i is covered. The population of j would also be covered, but to avoid double counting only the population of an arc’s origin node is used in conjunction with the arc. In the second term of the equation, the population of the subtree terminus, which is covered but is not counted in conjunction with any arc, is added to the total. Constraints (1c) ensure that if an arc (i, j ) is selected for the subtree then either the unique adjacent arc (j , k) will also be selected or node j will be selected as the subtree’s terminus. Note that (j , k) is adjacent to and follows (i, j ) on the path from i to the parent spanning tree’s terminal node n. A separate constraint is written for each arc (i, j ) in the parent tree. For arcs (i, n) incident to the parent’s terminus, the constraint would be written as xin − yn 0. Constraint (1d) specifies that exactly one node will be selected as the subtree’s terminus. As stated above, only non-leaf nodes are included in the set of eligible subtree termini (with the exception of node n, which is included in N even if it is a leaf). Constraints (1c) and (1d) together ensure connectivity of the subtree. Constraints (1e) require that if node j is selected as the subtree’s terminus, then at least one arc on the parent spanning tree incident to j and directed into j must also be selected. These constraints ensure that at least one arc will be selected for the subtree. A separate constraint is written for each eligible terminus j . Constraints (1f) require each decision variable to take on a value of zero or one. Figure 1(b) shows a hypothetical subtree that satisfies the constraints in model M1. Depending on the structure of the parent spanning tree, this model may contain up to 2n − 1 binary decision variables and 2n structural constraints, where n is the number
270
WILLIAMS
of nodes in the parent spanning tree. The formulation is slightly larger than Church and Current’s (1993) model (2n − 2 variables and n − 1 constraints), but much smaller than Hutson and ReVelle’s (1989) model (2n − 1 variables, and between n2 − 1 and 1/6(n3 −n) constraints). The potential advantage of model M1 over the model of Church and Current is that less pre-processing is likely to be necessary. The model of Church and Current uses a “pruning” process to create the subtree and requires prior identification/enumeration of all arc sets that may be pruned from the parent spanning tree. 2.3. Maximum indirect covering tree model – single spanning tree parent (M2) In order to address indirect coverage, formulation M1 is modified by adding new decision variables uk (defined above) for nodes that are covered indirectly, and new constraints (2f) that enforce conditions of indirect coverage. Formulation M2
Minimize Z1 = maximize Z2 =
(2a)
cij xij ,
(i, j )∈A
ai ui +
i∈N
ai − ai xij + aj − aj yj , (i, j )∈A
subject to xij − xj k − yj 0 yj = 1 j ∈N
yj −
uk −
i∈Sk
for all arcs (i, j ) ∈ A,
xij −
(2c) (2d)
xij 0
(i, j )∈Pj
(2b)
j ∈N
yj 0
for all nodes j ∈ N ,
(2e)
for all nodes k ∈ N,
(2f)
for all decision variables.
(2g)
j ∈Sk
xij , yj , uk ∈ {0, 1}
Description of formulation M2 Objective (2a) and constraints (2c)–(2e), and (2g) have the same meaning as objective (1a) and constraints (1c)–(1f) in formulation M1, respectively. Objective (2b) has been restructured to maximize the combined direct plus indirect coverage of population. For accounting purposes, the decision variables are defined so that direct coverage of a node implies indirect coverage. That is, if node i is directly covered (is part of the subtree), the variables xij and ui will both equal 1; but if node i is just indirectly covered (not part of the subtree but within distance s of a node of the subtree), then only ui will equal 1. The coefficients are defined accordingly so as to avoid double counting of covered population. Constraints (2f) state that the population of node k can be covered indirectly only if either the origin node of a selected arc is within distance s of k or the subtree terminus is within s of k. As stated above, k is included in the set Sk ; hence, for any node k that
OPTIMAL DIRECT AND INDIRECT COVERING TREES
271
is selected for the subtree (xkj = 1 or yk = 1), objective (2b) will force k to be selected for indirect coverage as well (uk = 1). A separate constraint is written for each node k. Formulation M2 contains at most 3n−1 binary decision variables and 3n structural constraints. 2.4. Computational experience with models M1 and M2 Hutson and ReVelle applied their models to a 35-node network. Populations or demands were specified for each node. Using inter-node costs or distances from their distance matrix (Hutson and ReVelle, 1993, appendix A), they identified a minimum spanning tree (MST), which was used as the parent tree (figure 2). For indirect coverage, the distance standard s was 500, which was applied to actual inter-node costs or distances in the matrix. Direct and indirect population coverage were considered equivalent, that is, ai was equivalent to ai for all nodes i. They used the NISE method, an efficient form of the multiobjective weighting method (Cohon, 1978), to generate non-inferior solutions. A weighted combination of the two objectives was optimized: minimize w1 Z1 − w2 Z2 , where w1 was the weight on objective 1 (cost) and w2 was the weight on objective 2 (covered population). A negative sign was associated with Z2 because this objective was maximized within a combined objective that was minimized. Alternative non-inferior solutions were generated by systematically varying the weights. For the direct coverage problem, Hutson and ReVelle found seven non-inferior solutions that, together, approximated the tradeoff curve between cost and coverage: the two extreme solutions of no coverage and complete coverage, plus five other intermediate solutions. Under indirect coverage, five solutions were found, including the two extremes.
Figure 2. Candidate spanning tree parent #1 (minimum spanning tree) with arc directions shown (after (Hutson and ReVelle, 1989, 1993)). Node #35 is the spanning tree’s terminus.
272
WILLIAMS
Table 1 Formulation M1, direct coverage. Single spanning tree parent (spanning tree #1, minimum spanning tree). Solutiona
w1b
w2c
Total costd
Coste (%)
Total coverf
Coverg (%)
CPUh (sec)
D E F G H I J
1.0 8.9 3.1 2.5 2.3 1.7 0
0 1.0 1.0 1.0 1.0 1.0 1.0
0 239 1.855 3,057 5,743 6,455 9,686
0 2.47 19.15 31.56 59.29 66.64 100
0 10, 184.9 16, 587.7 20, 146.6 26, 668.8 28, 012.5 30, 054.0
0 33.89 55.19 67.03 88.74 93.21 100
– 0.02 0.01 < 0.01 < 0.01 0.01 –
# iteri
# nodesj
– 24 12 3 2 2 –
– 0 0 0 0 0 –
a Noninferior solutions are identified by letter, as in (Hutson and ReVelle, 1989, 1993). b Value of the weight used on objective 1 (cost objective). c Value of the weight used on objective 2 (coverage objective). d Total cost of the subtree. Bold value indicates the total cost or length of the minimum spanning tree. e Total cost of the subtree as a percentage of the total cost of the minimum spanning tree. f Total coverage of the subtree. g Total coverage of the subtree as a percentage of the total population of the network. h Computer run time in CPU seconds. i Number of simplex iterations required. j Number of branch nodes required in the branch and bound routine.
Table 2 Formulation M2, indirect coverage. Single spanning tree parent (spanning tree #1, minimum spanning tree). Solution
w1
w2
Total cost
Cost (%)
Total cover
Cover (%)
CPU (sec)
# iter
# nodes
A B C D E
1.0 14.8 4.4 0.6 0.01
0 1.0 1.0 1.0 1.0
0 37 1,441 5,301 6,787
0 0.38 14.88 54.73 70.01
0 11, 164.0 21, 333.9 29, 578.7 30, 054.0
0 37.15 70.79 98.42 100
– 0.31 0.33 0.16 0.07
– 708 646 243 129
– 47 64 38 8
On explanation of the columns see footnotes of table 1.
Formulations M1 and M2 were applied to the same 35-node network in an attempt to duplicate the results of Hutson and ReVelle. The same objective function weights were used, and the same solutions were obtained under both direct and indirect coverage requirements (tables 1 and 2; figures 3 and 4). The direct covering model (M1) was run as a relaxed linear program (i.e., without the integer requirements for decision variables) and yielded integer-optimal solutions in each case; no branching and bounding was necessary (table 1, column 10). The constraint matrix may therefore be unimodular (Nemhauser and Wolsey, 1988), although proof of unimodularity is beyond the scope of this paper. In the indirect covering model (M2), the integer requirement was enforced for the xij variables; the uk variables were allowed to remain continuous but were given an upper bound of 1; and the only restriction on the yj variables was nonnegativity. Finding
OPTIMAL DIRECT AND INDIRECT COVERING TREES
273
Figure 3. Tradeoff curve for direct coverage with multiple parent spanning trees. Black circles indicate solutions to model M1 on the minimum spanning tree (tree #1); these solutions were also found by Hutson and ReVelle (1989). Solutions for the other candidate spanning trees (trees #2– #5) are shown in boxes.
integer-optimal solutions to M2 required modest amounts of branching and bounding (table 2, column 10). In each case the solution time was less than one second (table 2, column 8). These models were run on a Silicon Graphics Indigo workstation using the CPLEX version 3.0 mixed-integer program solver. 3.
Maximum covering subtrees given multiple parent spanning trees
In this section models M1 and M2 are extended so that an optimal subtree can be selected from not one, but from multiple candidate parent spanning trees simultaneously. As in the case of the single spanning tree, multiple spanning trees are created exogenously, in advance of running the model. In finding an optimal solution, the model simultaneously selects both the parent spanning tree and the subtree of which it is part. Candidate parents should be generated or selected based on suitability for the problem at hand. Candidate parents might be generated in any number of ways, including “best guess”
274
WILLIAMS
Figure 4. Tradeoff curve for indirect coverage with multiple parent spanning trees. Black circles indicate solutions to model M2 on the minimum spanning tree (tree #1); these solutions were also found by Hutson and ReVelle (1993). Solutions for the other candidate spanning trees (trees #2– #5) are shown in boxes.
approaches based on arc costs and node populations, or multiple applications of a modified minimum spanning tree algorithm for finding good alternatives to the MST. The number of candidate parents to include is up to the user; a larger number of candidates results in a larger model but may yield better solutions. Each candidate parent should have unique aspects to its structure so that potential good linkages not found in other spanning trees may be obtained. Under multiple spanning trees, the network construct is essentially the same as that used for a single spanning tree, although some minor changes have been made. As before, each arc in each of the candidate parents is directed toward the terminal node n, where (without loss of generality) n is used as the common terminus for each of the parent spanning trees (figure 1(c)). Note that some arcs may appear in more than one tree, and that the direction of an arc may be different in different trees. That is, arc (i, j ) appears in one tree and arc (j , i) appears in another; a separate decision variable would be defined for each arc (xij and xj i ).
OPTIMAL DIRECT AND INDIRECT COVERING TREES
275
3.1. Additional notation for models M3 and M4 Additional indices and sets t, T index and set of candidate parent spanning trees; the set of spanning trees t that contain directed arc (i, j ); Tij for models M3 and M4 we revise the definition of N ; N is again the set of eligible N terminal nodes for the subtree, but it is now defined as the union of the sets of nonleaf nodes, taken over all candidate parent spanning trees t; A for models M3 and M4 we revise the definition of A; A is again the set of candidate directed arcs, but it is now defined as the union of directed arc sets, taken over all candidate parent spanning trees t; for models M3 and M4 we revise the definition of Pj ; Pj is now the set of directed Pj arcs (i, j ) that are incident to and directed into node j in any of the candidate parent spanning trees t; the set of arcs (j , k) that are incident to and directed out of node j in any of the Fj candidate parent spanning trees t. Additional decision variables vt = 1, if spanning tree t is selected as the parent of the subtree, = 0, otherwise.
and
3.2. Maximum direct covering tree model – multiple parent spanning trees (M3) Formulation M3
Minimize Z1 =
(i, j )∈A
maximize Z2 =
ai xij +
(3b)
aj yj ,
j ∈N
(i, j )∈A
subject to xij −
(3a)
cij xij ,
xj k − yj 0 for all arcs (i, j ) ∈ A,
(3c)
(j, k)∈Fj
yj = 1,
j ∈N
yj −
(3d) xij 0
for all nodes j ∈ N,
(3e)
for all arcs (i, j ) ∈ A,
(3f)
(i, j )∈Pj
xij −
vt 0
t ∈Tij
vt = 1,
(3g)
t ∈T
xij , yj , vt ∈ {0, 1}
for all decision variables.
(3h)
276
WILLIAMS
Description of formulation M3 Objectives (3a) and (3b) are the same as (1a) and (1b) in M1, respectively, except that costs and directly covered populations are now summed over the entire arc set A. Analogous to (1c), constraints (3c) stipulate that if arc (i, j ) is selected for the subtree, then either an adjacent arc (j , k) on one of the candidate spanning trees will also be selected, or node j will be selected as the subtree’s terminus. Note that each of the arcs (j , k) in the second term of the left-hand side follows (i, j ) on the path from i to terminal node n in at least one candidate spanning tree. A separate constraint is written for each arc (i, j ) in arc set A. As in (1d), constraint (3d) specifies that exactly one eligible node will be selected as the subtree’s terminus; (3c) and (3d) together ensure connectivity of the subtree. As in (1e), constraints (3e) ensure that at least one arc will appear in the subtree by requiring that if node j is selected as the subtree’s terminus, then at least one arc incident to j and directed into j must also be selected. This arc(s) may be selected from any of the candidate parent spanning trees in which j is an eligible terminus. A separate constraint is written for each j in N . Constraints (3f) stipulate that an arc (i, j ) can be selected only if one of the candidate parent spanning trees containing that arc is selected. Constraint (3g) requires that exactly one candidate spanning tree be selected as the subtree’s parent. Constraints (3h) specify that each decision variable must take on a value of zero or one. An upper bound estimate on the size of formulation M3 is |T |(n − 1) + n + |T | binary decision variables and 2|T |(n − 1) + n + 2 structural constraints, where |T | is the number of candidate spanning tree parents included in the model. 3.3. Maximum indirect covering tree model – multiple parent spanning trees (M4) To address indirect coverage under multiple spanning trees, formulation M3 is modified by adding the uj decision variables for indirect coverage of nodes and adding constraints that enforce conditions of indirect coverage. These modifications are analogous to those used in model M2. Formulation M4
Minimize Z1 = maximize Z2 =
i∈N
subject to xij − j ∈N
(4a)
cij xij ,
(i, j )∈A
ai ui +
ai − ai xij + aj − aj yj , (i, j )∈A
xj k − yj 0
(4b)
j ∈N
for all arcs (i, j ) ∈ A,
(4c)
(j, k)∈Fj
yj = 1,
(4d)
OPTIMAL DIRECT AND INDIRECT COVERING TREES
yj −
277
xij 0
for all nodes j ∈ N ,
(4e)
for all arcs (i, j ) ∈ A,
(4f)
(i, j )∈Pj
xij −
vt 0
t ∈Tij
vt = 1
t ∈T
uk −
(4g) xij −
i∈Sk (i, j )∈Fi
xij , yj , uk , vt ∈ {0, 1}
yj 0
for all nodes k ∈ N,
(4h)
for all decision variables.
(4i)
j ∈Sk
Description of formulation M4 Objectives (4a) and (4b) are the same as (2a) and (2b) in M2, respectively, except that costs and populations are summed over the entire arc set A, as in M3. Constraints (4c)–(4g) are identical to constraints (3c)–(3g), respectively, in formulation M3, and constraints (4i) are analogous to (3h). Constraints (4h), which enforce indirect coverage conditions, are analogous to (2f) in formulation M2. Note that each node i within distance s of node k may be the origin node of multiple arcs (potentially one arc in each candidate spanning tree). In the second term of the left-hand side the summation is over all such arcs. An upper-bound estimate on the size of formulation M4 is |T |(n − 1) + 2n + |T | binary decision variables and 2|T |(n − 1) + 2n + 2 structural constraints. 3.4. Computational experience with models M3 and M4 We extended Hutson and ReVelle’s demonstration problem so that it would be applicable to models M3 and M4. Using the same set of 35 nodes and their associated populations, five candidate parent spanning trees were developed. The MST (spanning tree #1) was retained, and four new spanning trees were constructed (spanning trees #2– #5) which seemed to be competitive with the MST. The arc costs/ distances of these new spanning trees were taken directly from the original distance matrix. The total arc cost of each tree was within 22 percent of the cost of the MST (table 5, column 8). Under direct coverage (model M3), the sets of objective function weights used in model M1 were used again. Of the five intermediate solutions (E, F , G, H , I ), four were repeats of solutions found in M1 because the parent selected for the subtree was the MST (table 3, column 11). In solution F a different parent was selected, spanning tree #4 (figure 5). The new optimal subtree identified in F is non-inferior, of course, but does not dominate the corresponding solution F of the MST (table 1, row 3). It performs well, though, with a cost that is 27% lower and coverage that is 9% lower than the cost and coverage of F in M1. Note that solution E, which contains just three arcs, was not unique to the MST, but could have used any of the five candidates as a parent. In solution J , which provided direct coverage for all nodes, the MST was selected as the parent, as expected. In solving this model, only the vt variables were declared in-
278
WILLIAMS
teger. Modest amounts of branching and bounding were needed to find integer-optimal solutions (table 3, column 10). Under indirect coverage (model M4), the same objective function weights were used as in M2 and the distance standard was again 500. Of the four non-trivial solutions (B, C, D, E), two (B, E) were repeats of solutions found in M2, and two (C, D) were new solutions (table 4). Solution B, which contains just a single arc, was common to all five candidate parents. Solution C, whose parent was spanning tree #5 (figure 6), did not dominate C in M2 but did provide a cost savings of 5.6% while reducing coverage by only 0.04%. However, solution D on spanning tree #2 (figure 7) did dominate D in M2, with a cost reduction of 1% and improved coverage of 0.2%. In solving M4, only the xij and vt variables were declared integer; the uk variables were allowed to Table 3 Formulation M3, direct coverage. Multiple parent spanning tree parents (spanning trees #1– #5). Solution
w1
w2
Total cost
Cost (%)
Total cover
Cover (%)
CPU (sec)
# iter
# nodes
Span treea
D E F G H I J
1.0 8.9 3.1 2.5 2.3 1.7 0
0 1.0 1.0 1.0 1.0 1.0 1.0
0 239 1,348 3,057 5,743 6,455 9,686
0 2.47 13.92 31.56 59.29 66.64 100
0 10, 184.9 15, 094.1 20, 146.6 26, 668.8 28, 012.5 30, 054.0
0 33.89 50.22 67.03 88.74 93.21 100
– 0.04 0.08 0.09 0.08 0.07 –
– 34 102 116 114 117 –
– 2 5 7 7 5 –
– 1, 2, 3, 4, 5 4 1 1 1 1
a Spanning tree selected as the parent for the given subtree. On explanation of the others columns see
footnotes of table 1.
Figure 5. Candidate spanning tree parent #4 shown with solution F (subtree with bold arcs) for model M3. Black nodes are covered directly.
OPTIMAL DIRECT AND INDIRECT COVERING TREES
279
Table 4 Formulation M4, indirect coverage. Multiple parent spanning tree parents (spanning trees #1– #5).a Soln
w1
w2
Total cost
Cost (%)
Total cover
Cover (%)
CPU (sec)
# iter
# nodes
Span treea
A B C D E
1.0 14.8 4.4 0.6 0.01
0 1.0 1.0 1.0 1.0
0 37 1,361 5,251 6,787
0 0.38 14.05 45.21 70.01
0 11, 164.0 21, 326.0 29, 640.3 30, 054.0
0 37.15 70.96 98.62 100
– 1.13 2.99 2.90 0.92
– 1,811 3,774 3,247 1,117
– 96 279 278 65
– 1, 2, 3, 4, 5 5 2 1
a On explanation of the columns see tables 1 and 3.
Figure 6. Candidate spanning tree parent #5 shown with solution C (subtree with bold arcs) for model M4. Black nodes are covered directly, ringed nodes are covered indirectly.
remain continuous but were given an upper bound of 1; and the only requirement for the yj variables was non-negativity. Finding integer-optimal solutions to M4 required moderate amounts of branching and bounding (table 4, column 10). 3.5. Additional computational experience We note that the results obtained by models M3 and M4 can also be obtained by applying models M1 and M2 (or analogous models developed elsewhere) to each spanning tree separately and then merging the results. In order to test the relative efficiency of these two alternative approaches, we applied models M1 and M3 to a new 300-node network. The Cartesian coordinates of each node were generated randomly, and inter-node Euclidean distances were used as arc lengths. Twenty candidate parent spanning trees were created – the MST plus 19 randomly generated spanning trees.
280
WILLIAMS
Figure 7. Candidate spanning tree parent #2 shown with solution D (subtree with bold arcs) for model M4. Black nodes are covered directly, ringed nodes are covered indirectly. Node #32 is the subtree’s terminus.
Model M1 was applied to the MST and 15 non-inferior solutions were identified using CPLEX. The total computing time needed was 0.88 seconds (0.06 seconds per solution, on average). Multiplying this value by 20 gives us 17.6 seconds, which is the approximate total CPU time that would be needed to find 15 non-inferior solutions for all 20 spanning trees (300 solutions in total). A comparison of solution times between the MST and one of the other spanning trees indicated that solution times did not vary significantly with the spanning tree used as a parent. No branching and bounding was required to find integer optimal solutions. M1, as applied to the MST, contained 535 variables and 563 constraints. In model M3, which incorporated all 20 candidate spanning trees, the total time needed to find 15 non-inferior solutions was 312.58 seconds (21 seconds per solution, on average). The amount of branching and bounding ranged from zero nodes to 38 nodes per solution. The total computing time for M3 represents an 18-fold increase relative to the time for M1. M3 contained 1,876 variables and 3,413 constraints. 4.
Discussion and conclusions
In this paper, four new zero–one programming models are presented for maximal covering subtree problems under both direct and indirect coverage requirements. These problems have applications in transportation network design and extensive facility location. The first two models (M1 and M2) address the problem of finding an optimal subtree on a single spanning tree parent; this problem has been treated previously in the literature. The third and fourth models (M3 and M4) represent new extensions of the
OPTIMAL DIRECT AND INDIRECT COVERING TREES
281
first two models and allow subtrees to be selected from among multiple spanning tree parents. The purpose of this paper is to demonstrate that the use of multiple spanning trees: • offers the potential for improvement over a single tree, even if that tree is the MST, and • provides more flexibility and the potential for identifying a broader range of alternatives. To these ends, the new models are compared to previous models by way of a common example problem. No attempt is made to quantify, in a statistical sense, the likely or expected amount of improvement in general applications of the models. Indeed, the amount of improvement realized would depend on the particulars of the network (node populations or demands, and arc costs or distances) and on the architectures of the candidate spanning trees. As noted above, the results of model M3 (or M4) can be obtained by applying M1 (or M2) to each candidate spanning tree separately. In fact, as the comparison in section 3.5 indicates, the total time needed to find non-inferior solutions is likely to be less for M1 than for M3. However, these results did not account for the additional time that would be needed in M1 to create and manage a separate problem file for each spanning tree. Additional time would also needed in M1 to check each non-inferior solution for “global” non-inferiority; that is, to “merge” the separate tradeoff curves into a single global tradeoff curve. Hence, models M3 and M4 may be more convenient to use than models M1 and M2, especially when many candidate spanning trees are involved, because this merging is done implicitly in M3 and M4. Computational experience with 35-node and 300-node networks indicates that, for direct covering problems, integer-optimal solutions can be found with little or no branching and bounding; that is, solving the problem as a relaxed linear program (without the integer requirements) frequently yields integer-optimal solutions. Indirect covering problems required low to moderate amounts of branching and bounding for the 35-node network. Based on this limited experience, these models compare favorably with previous formulations. The results of the 35-node problem suggest that the minimum spanning tree is a fairly robust parent in that it is likely to contain many non-inferior subtrees. This tends to be the case more so at higher levels of coverage than at lower levels. At lower levels of coverage, a cluster of large-population nodes may exist that can be covered by a relatively small subtree whose parent is not the MST. That is, subtrees of spanning trees other than the MST can potentially cover “hot spots” which do not appear in the MST. At the other extreme, when complete coverage is required, the MST is optimal. Hence, as the level of coverage approaches full coverage, the ratio of coverage to cost in an optimal solution approaches that of the MST, and the choice of the parent spanning tree is more likely to be the MST. This generalization may be less true for indirect coverage than for direct coverage, because under indirect coverage complete coverage of all nodes does not typically require the entire spanning tree.
282
WILLIAMS
Table 5 Direct coverage, individual candidate parent spanning trees. Span treea
Objb
Ec
Fc
Gc
Hc
Ic
Jc
2
Total cost % tot cost Total coverage % tot cover Total cost % tot cost Total coverage % tot cover Total cost % tot cost Total coverage % tot cover Total cost % tot cost Total coverage % tot cover
239 2.47 10, 184.9 33.89 239 2.47 10, 184.9 33.89 239 2.47 10, 184.9 33.89 239 2.47 10, 184.9 33.89
1, 508 15.57 15, 094.1 50.22 1, 678 17.32 15, 758.8 52.43 1, 348 13.92 15, 094.1 50.22 1, 358 14.02 1, 5094.1 50.22
3, 507 36.21 20, 395.2 67.86 4, 484 46.29 22, 870.5 76.10 1, 556 16.06 15, 657.1 52.10 44, 09 45.52 23, 495.6 78.18
5, 771 59.58 25, 795.9 85.83 6, 308 65.12 27, 168.8 90.40 3, 380 34.90 19, 955.4 66.40 4, 762 49.16 24, 371.1 81.09
6, 158 63.58 26, 474.9 88.11 6, 695 69.12 27, 847.8 92.66 6, 335 65.40 25, 758.9 85.71 6, 221 64.23 27, 041.0 89.97
10, 822 111.73 30, 054 100 10, 805 111.55 30, 054 100 11, 793 121.75 30, 054 100 10, 570 109.13 30, 054 100
3
4
5
a Spanning tree used as the parent. b Objective functions (cost and coverage), in absolute number and percentage of total. c Objective function values for solutions E–J . Total cost is shown as a percentage of the minimum spanning
tree cost. Bold values indicate total cost or length of the spanning tree.
In other runs of M3 and M4 the parent was restricted to one of the four new spanning trees (in effect, applying M1 and M2 to each new tree separately). Using the same objective function weights, non-inferior solutions were generated for each tree (tables 5 and 6). Many of these solutions lie slightly to the “southeast” of the minimum spanning tree’s tradeoff curves, that is, below the convex hulls as approximated by the solutions for M1 and M2 (figures 3 and 4). These solutions would not be identified (even if they were non-inferior) under any set of weights when all five spanning trees were eligible for selection. However, some of the solutions lie slightly to the “northwest” and either were or could possibly have been found when all five spanning trees were eligible. Solution F in tree #4 for direct coverage and solutions C in tree #5 and D in tree #2 for indirect coverage were found, as indicated in section 3.4. In addition, solutions G and H in tree #5 for direct coverage and solution C in tree #4 for indirect coverage could potentially have been found by using different weights. This is important because wide gaps may exist in the tradeoff curve between consecutive solutions for one spanning tree, and solutions from other spanning trees may be able to help fill such gaps. Hence, employing multiple candidate parents may result in a tradeoff curve with a finer gradation of alternatives. The results of Hutson and ReVelle indicate that non-inferior subtrees on a single parent tend to be “nested” in the sense that a larger subtree (a subtree giving greater coverage) can frequently be obtained from a smaller subtree by simply adding appropriate arcs. Hence, the qualitative differences between solutions may be small, as they may
OPTIMAL DIRECT AND INDIRECT COVERING TREES
283
Table 6 Indirect coverage, individual candidate parent spanning trees. Span treea
Objectiveb
Bc
Cc
Dc
Ec
2
Total cost % tot cost total coverage % tot cover Total cost % tot cost Total coverage % tot cover Total cost % tot cost Total coverage % tot cover Total cost % tot cost Total coverage % tot cover
37 0.38 11, 164.0 37.15 37 0.38 11, 164.0 37.15 37 0.38 11, 164.0 37.15 37 0.38 11, 164.0 37.15
809 8.35 17, 398.7 57.89 929 9.59 17, 398.7 57.89 601 6.20 17, 705.5 58.91 1, 361 14, 05 21, 326.0 70.96
5, 251 54.21 29, 640.3 98.62 4, 921 50.81 28, 514.6 94.88 6, 105 63.03 29, 640.3 98.62 5, 147 53.14 29, 570.8 98.39
7, 225 74.59 30, 054 100 7, 995 82.54 30, 054 100 7, 511 77.54 30, 054 100 7, 664 79.12 30, 054 100
3
4
5
a Spanning tree used as the parent. b Objective functions (cost and coverage), in absolute number and percentage of total. c Objective function values for solutions B–E. Total cost is shown as a percentage of the minimum spanning
tree cost.
have many arcs in common. This is not the case when there are many candidate parents; arcs that do not exist in one subtree may be ideal choices in another subtree (e.g., compare figures 6 and 7). Hence, under multiple candidate parents, there is greater potential for strong qualitative differences in the architecture of non-inferior subtrees, in addition to quantitative differences in objective function values. Hutson and ReVelle also formulated models that allow new arcs to be added to an existing subtree. This, too, can be accomplished in the models presented here by adding constraints that set the appropriate decision variables equal to one: set xij equal to one if arc (i, j ) is part of the existing subtree. If the goal of the analysis is to optimize the marginal increases in cost and covered population resulting from new arcs, then the decision variables associated with existing arcs can be removed from the objective functions. Like previous maximal covering tree models, the models in this paper are static in the sense that the possibility of growth (or shrinkage) of the subtree in the future does not influence the model’s choice of an optimal subtree. For example, consider a hypothetical proposed storm sewer extension. Some segments of the sewer extension might be built this year, while construction of the remaining segments would be deferred. Under such a scenario, there may be a need to trade off the costs and benefits realized now versus those realized in the future, and this tradeoff would have direct implications for the selection of a parent spanning tree. The minimum spanning tree has the advantage of providing the
284
WILLIAMS
lowest (undiscounted) cost, over the life of a project, given that all nodes will eventually be linked. But it may entail higher costs early on, relative to other spanning trees, in order to connect a given subset of the population. In contrast, other spanning trees might have lower costs now but would have higher total costs later for connecting the entire population. The choice of parent spanning tree, which serves as a blueprint or backbone for future construction, is important in the dynamic context because it is a decision that may be difficult, costly, or impossible to change once a project has started. In some cases, however, a small non-inferior subtree may be common to multiple (or even all) candidate parents (e.g., solution E in table 3 and solution B in table 4). This type of result is fortuitous for long-range planning because in constructing such a subtree, the selection of a parent tree can be deferred until the subtree needs to be expanded. Hence, options are preserved. Acknowledgments The author would like to thank Charles ReVelle and two anonymous referees for their helpful comments and suggestions. References Church, R. and J. Current. (1993). “Maximal Covering Tree Problems.” Naval Research Logistics 40, 129– 142. Cohon, J.L. (1978). Multiobjective Programming and Planning. New York: Academic Press. Current, J. and H.K. Min. (1986). “Multiobjective Design of Transportation Networks: Taxonomy and Annotation.” European Journal of Operational Research 26, 187–201. Hutson, V.A. and C.S. ReVelle. (1989). “Maximal Direct Covering Tree Problems.” Transportation Science 23(4), 288–299. Hutson, V.A. and C.S. ReVelle. (1993) “Indirect Covering Tree Problems on Spanning Tree Networks.” European Journal of Operational Research 65, 20–32. Kim, T.U., J.T. Lowe, J.E. Ward, and R.L. Francis. (1990). “A Minimum Length Covering Subtree of a Tree.” Naval Research Logistics 18, 309–326. Kincaid, R.K., T.J. Lowe, and T.L. Morgan. (1988). “The Location of Central Structures in Trees.” Computers and Operations Research 15(2), 103–113. Kruskal, J.B. (1956). “On the Shortest Spanning Tree of a Graph and the Traveling Salesman Problem.” Proceedings of the American Mathematical Society 7, 48–50. Nemhauser, G.L. and L.A. Wolsey. (1988). Integer and Combinatorial Optimization. New York: Wiley. Magnanti, T.L. and R.T. Wong. (1984). “Network Design and Transportation Planning: Models and Algorithms.” Transportation Science 18, 1–55. Magnanti, T.L. and L.A. Wolsey. (1995). “Optimal Trees.” In: M.O. Ball, T.L. Magnanti, C.L. Monma, and G.L. Nemhauser (eds.), Network Models. Amsterdam: Elsevier. Mesa, J.A. and T.B. Boffey. (1996). “A Review of Extensive Facility Location in Networks.” European Journal of Operational Research 95, 592–603. Prim, R.C. (1957). “Shortest Connection Networks and Some Generalizations.” The Bell System Technical Journal 36, 1389–1401. ReVelle, C. (1993). “Facility Siting and Integer-Friendly Programming.” European Journal of Operational Research 65, 147–158.