J Intell Robot Syst (2007) 50:85–118 DOI 10.1007/s10846-007-9150-0
Coalition Formation: From Software Agents to Robots Lovekesh Vig · Julie A. Adams
Received: 24 September 2006 / Accepted: 1 February 2007 / Published online: 16 March 2007 © Springer Science + Business Media B.V. 2007
Abstract A problem that has recently attracted the attention of the research community is the autonomous formation of robot teams to perform complex multirobot tasks. The corresponding problem for software agents is also known in the multi-agent community as the coalition formation problem. Numerous algorithms for software agent coalition formation have been provided that allow for efficient cooperation in both competitive and cooperative environments. However, despite the plethora of relevant literature on the software agent coalition formation problem, and the existence of similar problems in theoretical computer science, the multirobot coalition formation problem has not been sufficiently grounded for different tasks and task environments. In this paper, comparisons are drawn to highlight the differences between software agents and robotics, and parallel problems from theoretical computer science are identified. This paper further explores robot coalition formation in different practical robotic environments. A heuristic-based coalition formation algorithm from our previous work was extended to operate in precedence ordered cooperative environments. In order to explore coalition formation in competitive environments, the paper also studies the RACHNA system, a market based coalition formation system. Finally, the paper investigates the notion of task preemption for complex multi-robot tasks in random allocation environments. Keywords Multi-robot · Coalition formation · Task allocation
Manuscript submitted on the 1st of September 2006, Lovekesh Vig is a graduate student in the EECS Department at Vanderbilt University, Nashville TN 37212. Julie A. Adams is a Assistant Professor in the EECS Department at Vanderbilt University, Nashville TN 37212. L. Vig (B) · J. A. Adams Electrical Engineering and Computer Science Department, Vanderbilt University, Nashville, TN 37212, USA e-mail:
[email protected]
86
J Intell Robot Syst (2007) 50:85–118
1 Introduction A fundamental problem that any practical multi-robot system must address is task allocation. Previous task allocation schemes make a number of simplifying assumptions about the nature of the tasks to be allocated. One such assumption is that the tasks are indivisible and may be performed by a single robot. However, as the field evolved it became clear that very often situations arise that require multiple robots to cooperate in order to achieve a given task. Consequently, currently there is a need for systems that allow robots to autonomously form teams and cooperatively complete assigned missions. So far researchers have focused primarily on the singletask single-robot (ST-SR) problem [26]. Given the ever increasing complexity of multi-robot tasks, solutions to the single-task multiple-robot (ST-MR) problems will assume greater significance. Formally the multi-robot coalition formation problem may be defined as follows: Definition (Multi-robot Coalition Formation (MRCF)) Given a collection of n robots R and m tasks T. Each robot is equipped with certain sensor and actuator capabilities. A coalition is defined as a collection of multiple robots that combine to form a team. Also given is a characteristic function fc : C, T → that maps coalition-task pairs to numerical values or efficiency ratings. The goal is to find the optimal partitioning of the set of robots into teams such that the subsequent assignment of the teams to tasks results in maximization of the overall performance or utility. This definition of MRCF is somewhat constrained. For example, most task allocation problems are not static, they are dynamic decision problems that vary in time with environmental changes and robot failures. Also a task may be performed in many different ways, i.e. a task may have multiple possible decompositions and hence multiple potential allocations. These questions are addressed with the RACHNA system described in Section 6. The corresponding problem for software agents has received considerable attention from the multi-agent community and is also known as the coalition formation problem. Due to inherent differences between the software agent and robotic domains, mapping algorithms from one domain to the other is not entirely straightforward. The issues involved in multi-robot coalition formation were explored in detail in our previous work [61] and a brief description of these issues is provided in Section 4. This paper extends our previous heuristic-based coalition formation algorithm to allow it to allocate tasks in environments where the tasks have a partial ordering between them. The algorithm is subsequently validated with a set of simulated and real world robotic experiments. Limitations of the heuristic-based approaches led to the conception of the RACHNA system for competitive environments. RACHNA is a market-based coalition formation system [60] that leverages the inherent redundancy in robotic sensory capabilities to allow for a more tractable formulation of the coalition formation problem. RACHNA allows for the valuation of the robot capabilities based on supply and demand. This paper is organized as follows. Section 2 provides related work, Section 4 provides a brief description of the multi-robot coalition formation algorithm for
J Intell Robot Syst (2007) 50:85–118
87
instantaneous assignment. The extension of the algorithm to precedence ordered environments and subsequent experiments with real robots is provided in Section 5. Section 6 describes the RACHNA system for market based multi-robot coalition formation and provides the simulation experiments conducted with the system. Section 7 outlines avenues for future work.
2 Related Work This section provides work relevant to the MRCF from the fields of Distributed Artificial Intelligence (DAI), theoretical computer science and multi-robot systems. Multi-robot researchers often borrow ideas from the multi-agent literature in order to attain effective solutions. However there are irreconcilable differences between the two domains that are important to understand. The similarity of the MRCF problem to other well studied problems in theoretical computer science suggests that existing solutions to these problems may potentially be exploited for coalition formation. 2.1 Multi-agent Coalition Formation Software agents are often required to form teams in order to perform tasks. DAI researchers have focused on constructing agents that are able to cooperate during task execution, thus increasing either individual agent benefits or the systems benefits. Smith [51] discusses the Contract Net Protocol (CNP) in which the agents may attempt to satisfy a task by dividing it into sub-tasks and sub-contract each sub-task to another agent via a bidding mechanism. CNP allocates tasks to single agents and a procedure for task partitioning is necessary. Many CNP inspired Multi-robot Tasks Allocation (MRTA) systems have been proposed in the literature [7, 27, 65]. Gasser [22] focuses on the social aspects of agent knowledge and action in multiagent systems. As in human societies, social mechanisms can dynamically emerge. This approach is effective when agents are interacting in environments where there is no agreed upon, well defined interaction mechanism, or in continuously evolving domains. This approach also bears a strong likeness to swarm based approaches to robotic task allocation in that it allows a system to converge to appropriate robot/task ratios based on local robot interactions [36]. Coalition formation has also been discussed in relation to game theory [55, 64]. In fact, coalition theory is considered a field in its own right. However, game theory does not address the most relevant issues for algorithm design, i.e. the protocols and strategies to be followed by the agents in order to arrive at a beneficial coalition configuration. Instead work in game theory tends to focus on the various notions of stability in N-person cooperative games such as the kernel, core and the bargaining set. DAI researchers have built upon the results provided by game theory in order to design coalition formation algorithms. Sandholm and Lesser [45] developed a coalition formation model for bounded rational agents and presented a general classification of coalition games. The value of a coalition in their model depends on the computation time; however all configurations and possible coalitions are
88
J Intell Robot Syst (2007) 50:85–118
considered when computing a stable solution. Shehory and Kraus [50] proposed algorithms that permit autonomous agents to form coalitions and distribute the joint payoffs to members in Non-super-additive environments. The suggested protocol guarantees that if the agents follow it, a certain stability (kernel-stability) is met. The same paper presented an alternative protocol that offers a weaker form of stability with polynomial running time. However, in both cases, no bound from the optimal is guaranteed. Sandholm et al. [44] formally prove that finding the optimal coalition structure is an NP-Complete problem. They view the coalition structure generation process as a search in a coalition structure graph. The problem is formulated as a search through a subset of the coalition structures and selection of the best coalition structure encountered. Sandholm et al. [44] go on to prove that in order to establish any sort of bound on the quality of the generated coalition structure, it is necessary and sufficient that the search algorithm must search the bottom two levels (or 2a−1 nodes) of the coalition structure graph. The suggested algorithm involves an initial search through these two levels and then is followed by a time bounded breadth first search from the top of the graph. An algorithm for task allocation via coalition formation in Cooperative Distributed Problem Solving (CDPS) environments was proposed by [48]. This algorithm yields tractable solutions by limiting the size of the coalitions and uses a greedy heuristic to yield a coalition structure that is provably within a bound of the best possible solution given the limit on the number of agents. This algorithm is especially relevant to this paper and further modifications to this algorithm are described in [61]. Shehory and Kraus [49] extended this algorithm to enable formation of overlapping coalitions for precedence-ordered task-execution. Coalition formation has begun to receive a great deal of attention from the DAI research community [1, 19, 34, 52]. Despite this plethora of agent coalition formation algorithms, none of these algorithms have been demonstrated to form coalitions in the multi-robot domain. There are numerous reasons for this, some prominent ones being the added communication costs in the multi-robot domain, the unreliability of both communication and robots in real world domains, and the non-transferability of resources between robots. This paper investigates the reasons behind this divide between the agent and robotic domains and offers solutions to some of the problems that arise while tailoring multi-agent coalition formation algorithms to the multirobot domain. Transferability of these algorithms are further discussed in Section 4. 2.2 Parallel Problems The MRCF is a very difficult problem that belongs to the complexity class of strongly NP-hard1 problems. However, coalition formation shares a similar structure with a number of commonly studied problems in theoretical computer science. This section identifies these problems and examines them from a coalition formation perspective.
1 The
complexity class of decision problems that are still NP-hard even when all numbers in the input are bounded by some polynomial in the length of the input.
J Intell Robot Syst (2007) 50:85–118
89
2.2.1 Winner Determination in Combinatorial Auctions Combinatorial auctions are auctions in which bidders can place bids on combinations of items, called packages rather than individual items. Formally, the problem is defined as follows [15]: Definition Let N be a set of bidders and M the set of m distinct items. For every subset S of M let b j(S) be the bid that agent j ∈ N has announced it is willing to pay for S. Let b (S) = max j∈N b j(S). Then the winner determination problem can be formulated as: max
b (S)x S
(1)
S⊂M
s.t.
x S ≤ 1∀i ∈ M.
(2)
Si
x S = 0, 1 ∀i ∈ M.
(3)
The MRCF problem can be cast as a combinatorial auction with the bidders being represented by the tasks, the items by the different robots, and the bids by the utility that each task has to offer for a particular subset of the set of robots (items). Unfortunately, the problem is inapproximable [42], however some empirically strong algorithms exist. Leyton-Brown et al. [33] present a heuristic-based algorithm to efficiently search the space of bids by utilizing a demand based ordering of the different items. Sandholm [42] provides an algorithm that allows auctions to scale up to significantly larger numbers of items and bids by leveraging the fact that the space of bids is sparsely populated in practice. It remains to be seen if such algorithms can be sufficiently decentralized in order to beneficially apply them to a multi-robot setting. A generalized version of the winner determination problem in combinatorial auctions was utilized in the conception of the RACHNA system described in Section 6. 2.2.2 Optimization: Linear Programming The coalition formation problem can also be cast as a 0-1 integer programming problem. Given a set of n agents and m candidate coalition-task pairs, the integer programming problem is cast as follows [47]: Given matrices A and U where: U j = The utility gained when the jth coalition-task pair is selected. aij =
1 if agent i is a part of jth coalition-task pair. 0 otherwise. Maximize
n j=1
Uj x j
(4)
(5)
90
J Intell Robot Syst (2007) 50:85–118
Subject to:
aij x j = 1, i = 1, ..., n
(6)
j=1
where x j ∈ {0, 1} j = 1, ..., m.
(7)
Dantzig [14] introduced the simplex method for solving linear programming problems, which has since become the algorithm of choice for solving linear programs. Although the worst case complexity is exponential in the size of the input [29], the average case complexity for certain classes of problems is polynomial [6], and the method is known to work very well in practice [53]. However, variants of the simplex method appear to be heavily centralized. Consequently, these matrix based approaches appear to have limited potential for distributed applications and to the best of our knowledge, none have been successfully demonstrated in robotic domains. 2.2.3 Job Shop Scheduling Problems Job Shop Scheduling (JSS) problems are characterized as [21]: – – – – –
A Job Set J = { j1 , j2 , . . . , jn }. Machine set M = {m1 , m2 , . . . , mm }. Operations O = {o1 , o2 , . . . , on }, Oi = {oi1 , oi2 , . . . , oimi }. Each operation has a processing time {τi1 , τi2 , . . . , τimi } on a particular processor. On O define A, a binary relation representing a precedence between operations. If v has to be performed before w then (v, w) ∈ A.
The objective of Job Shop Scheduling is to find an optimal schedule such that the net processor time is minimized. The MRCF problem can be cast as a relaxed instance of the JSS problem, with no constraints between jobs (independent job assumption or A = φ). In other words, the order in which the jobs are performed is immaterial, all that matters is the mapping of operations to machines. The incorporation of precedence constraints gives rise to a more difficult problem by introducing a scheduling component to the coalition formation problem (this problem belongs to the class of single-task multiple-robot extended-assignment problems). Coalition formation in task environments involving precedence ordered tasks is explored in Section 5 of this paper. Some scheduling problems allow for preemption, i.e. a job can be interrupted during execution, moved to a different machine, and then resumed. Since robots operate on real world tasks, robotic tasks cannot easily be traded amongst different robots. Therefore, the research community thus far has not given preemption much consideration for MRTA problems. It is our contention that preemption may be useful, if not necessary in certain dire circumstances such as a fire outbreak, chemical leakages etc. The RACHNA system described in Section 6 accommodates task preemption for urgent tasks. Distributed scheduling problems also bear a striking resemblance to the coalition formation problem. However, there does not exist a direct mapping from this problem to the MRCF problem because the problem environment cannot accommodate the additional locational constraints imposed on the resources by a robotic
J Intell Robot Syst (2007) 50:85–118
91
environment (Section 3 provides a detailed account of these constraints). It may be applicable in environments where the tasks are introduced sporadically. A popular algorithm for this type of problem is Dijkstra’s distributed bankers algorithm [17], which works on the principle of incrementally allocating resources to ensure safe states and avoid deadlock. However, while Dijkstra’s algorithm guarantees a finite response time and deadlock avoidance, it does not guarantee a good response time. Extensions to the bankers algorithm have been provided to allow for multiple resource types and to allow for unsafe states [32], however the algorithms applicability to task allocation problems has not been fully explored. Many solutions to the Job Shop Scheduling problem have been proposed. Some solutions view the problem as a constraint satisfaction search [39, 40]. More traditional approaches formulate the problem using integer programming [62], mixed integer programming [18] and dynamic programming [54]. It may be worthwhile to explore the possibility of decentralizing these algorithms without making significant compromises on solution quality. 2.2.4 Set Partitioning and Set Covering Perhaps the closest problems in theoretical computer science to the coalition formation problem are the set partitioning and set covering problems. Determining which of the two problems is more apt for a particular domain depends on whether the task environment allows for overlapping coalitions. Overlapping coalitions are discussed further in Section 5. Balas and Padberg [2] define the Set Partitioning problem as follows:
Definition (Set Partitioning Problem (SPP)) Given a finite set E, a family F of acceptable subsets of E, and a utility function u : F → + , find a maximum-utility family X of elements in F such that X is a partition of E. The coalition formation problem can be cast as an instance of the SPP, with E as the set of robots, F as the set of all feasible coalition task pairs, and u as the utility estimate for each such pair [23]. The SPP problem has been studied in depth and is provably NP-hard. However, numerous heuristic-based SPP algorithms have been proposed in the literature [2, 10, 20], although it remains to be seen whether such heuristic algorithms are applicable to MRTA problems. To this end, a potentially important question is whether and how these algorithms can be parallelized. For environments that allow overlapping coalitions, the set covering problem more closely approximates coalition formation. Balas and Padberg [2] define the set covering problem as follows: Definition (Set Covering Problem (SCP)) Given a finite set E, a family of acceptable subsets F of E and a cost function c : F → + , find a maximum-utility family X of elements in F such that X is a cover of E. The ST-MR problem for time extended assignment can be cast as an instance of the SCP, with E as the set of robots, F as the set of all feasible (and possibly overlapping) coalition-task pairs, and u as the utility estimate for each such pair [23]. Like the SPP, the SCP is also NP hard [30]; however the coalition space of the SCP is
92
J Intell Robot Syst (2007) 50:85–118
far less constrained. Researchers have developed greedy approximation algorithms for the SCP that yield provably good sub-optimal solutions [4, 11]. It is important to note that these heuristic algorithms perform well when the space of feasible subsets is limited, and that they perform poorly in the most general case of the SCP, with all subsets allowed. For purposes of multi-robot task allocation, this result suggests that such algorithms would be best applied in environments in which the space of possible coalitions is naturally limited, as is the case with heterogeneous robots. To the best of our knowledge set covering or set partitioning algorithms have not been applied to MRTA problems, and the fact that [50] successfully adapted and distributed Chvatal’s approximation algorithm [11] suggests that SCP may be viable for MRTA problems. Our previous work [59, 61] explores the applicability of Shehory and Kraus’ algorithm to the MRCF problem.
2.2.5 Dynamic Programming The complexity of the coalition formation problem generally decreases when the algorithm can observe individual coalition values instead of merely coalition structures. Sandholm et al. [44] provide a dynamic programming algorithm for the coalition formation problem when individual coalitions can be evaluated. The algorithm runs in O(3a ) time (significantly less than exhaustive enumeration of all coalitions O(aa )). However, this algorithm requires O(3a ) time even when the space of coalitions is restricted. This condition may not be feasible for task allocation, especially given that a coalition’s value can only be evaluated when it is paired with a task. The above list of parallel problems is by no means exhaustive. Many other problems share a similar problem structure to coalition formation. The objective of reviewing the above problems was to direct the readers attention to the large number of potentially untapped solutions in theoretical computer science that may lead to fruitful algorithms for the coalition formation problem.
2.3 Multi-robot Task Allocation Task allocation is proving to be a challenging problem due to the unpredictable nature of robot environments, sensor failure, robot failure, and dynamically changing task requirements. A number of elegant solutions to the task allocation problem have been proposed. The ALLIANCE [38] architecture uses motivational behaviors to monitor task progress and dynamically reallocate tasks. ALLIANCE employs a variant of the subsumption architecture [8] and makes use of behavior sets to enable a robot to perform versatile tasks. Recently [36] proposed a swarm based approach for the cooperative observation of multiple moving targets (CMOMMT). This scheme mimics ant-behavior to regulate the distribution of sensors in proportion to that of the mobile targets. Dahl et al. [13] present a task allocation scheme based on Vacancy Chains, a social structure modeled on the creation and filling up of vacancies in an organization. The Broadcast of Local Eligibility system (BLE) [63] system uses a Publish/Subscribe method to allocate tasks that are hierarchically distributed. Market based task allocation systems have traditionally found favor with the software-agent research community. The inspiration for these systems stems from the Contract Net Protocol [51]. Variations of the Contract Net Protocol have found
J Intell Robot Syst (2007) 50:85–118
93
applications in numerous software-agent negotiation scenarios [12, 41, 43, 57]. Stentz and Dias [56] were the first to utilize a market-based scheme to coordinate multiple robots for cooperative task completion. This work introduced the methodology of applying market mechanisms to intra-team robot coordination as opposed to competitive inter-agent interactions in domains such as E-commerce. Laengle et al. [31] implemented the KAMARA system that uses a negotiation based task allocation scheme for controlling the different components of a complex robot. Caloud et al. [9] developed the GOPHER architecture that utilizes a centralized auction protocol to allocate tasks with a high level of commitment. Gerkey and Matari´c [25] developed MURDOCH; a completely distributed auction-based task allocation scheme that utilized a Publish/Subscribe communication model. Tasks in MURDOCH are allocated via a single round, first price auction in a greedy fashion. M+, another auction based task allocation protocol was developed by [7]. The novelty of the M+ system lies in that it allows for dynamic task reallocation of subcomponents of complex tasks. Dias [16] designed the Traderbots architecture for multirobot control. Traderbots agents, called traders, are responsible for trading tasks via auctions. When an auction is announced agents compute bids based on their expected profit for the tasks, and the robots that can perform the tasks for the lowest price are awarded contracts. The common underlying factor in all of the above systems is the single robotsingle task assumption that assumes the indivisibility of tasks and that each task may be performed by a single robot. As research in the field matures and multirobot tasks become more and more complex, this assumption is proving to be an oversimplification. Many task domains require that a team of robots work on a task simultaneously, making the task allocation problem far more difficult. A relatively unexplored problem in multi-robot systems is the allocation of multirobot teams to different tasks (the ST-MR problem) commonly known as the MRCF problem. Recently researchers have offered a variety of market based solutions to the (ST-MR) single-task multiple-robot task allocation problem [35, 46, 58, 65]. However, none of these task allocation schemes utilize the inherent redundancy in robot sensory capabilities. The RACHNA system described in Section 6 leverages this redundancy to enable a more tractable formulation of the MRCF. Thus, one of the primary objectives of this paper is to develop generic task allocation schemes for allocating complex multi-robot tasks.
3 Software Agents vs. Robots Largely due to inherent differences between the respective systems under study, the multi-agent and multi-robot research communities have each developed their own methods for perception, reasoning, and action in individual agents/robots. In particular, the multi-robot community has historically studied both explicit and implicit coordination techniques [3, 37]. Implicit coordination techniques employ dynamics of interaction among the robots and the environment in order to achieve the desired collective performance, often in the form of designed emergent behavior. Explicit coordination techniques, in comparison, deal with comparatively more sophisticated agents/robots, and employ intentional communication and collaboration methods much like those employed in multi-agent systems. Therefore at the level
94
J Intell Robot Syst (2007) 50:85–118
Table 1 Software agents vs. robots Software agents
Robots
Are simply fragments of code. Capabilities correspond to code, data. Capabilities vary dynamically. Each different agent is unique. Cooperate by exchanging information. Not concerned with real world issues. Tasks generally involve information gathering.
Are tangible entities that occupy space. Capabilities correspond to sensors, actuators. Capabilities are static. Are manufactured, greater redundancy. Cooperate implicitly and explicitly. Must deal with real world constraints. Tasks may involve manipulation of real objects.
of explicit coordination among multiple individuals, the differences in techniques used by multi-agent and multi-robot systems are in fact very few [24]. Although robotics researchers employ sophisticated techniques while designing single robot control systems, they have tended to use techniques that are already well known in the agent community when designing explicitly coordinated multi-robot systems. Having said that, there are irreconcilable differences between the two domains that are important to understand. The previous sections highlight how the tasks assigned to robotic agents can vary significantly from tasks in software domains. Robotic agents operate in the real world and have to account for issues of interference between robots, obstacle avoidance, etc. Robotic agents also often deal with more restricted resource constraints and are unable to extend their capabilities on the fly. A software agent does not have to be concerned with losing communication capabilities but Multi-robot Systems (MRS) have to deal with more restricted and delayed communication. Failures occur with higher frequency and in a wider variety in robotic systems. Additionally, robotic capabilities do not vary dynamically whereas mobile software agents often extend their capabilities by importing relevant code from another agent. Furthermore, robotic systems have to be able to accommodate larger error bounds in performance since they often deal with faulty sensors and interact with real world environments. Finally, robotic systems often require more creative solutions to recover from faults. Thus, controlling multiple robots can be a significantly different problem when compared to controlling multiple agents. Table 1 lists important differences between the software agent and robotic environments. It should be noted that the variability of robotic capabilities is static relative to agent capabilities. A robot may augment its capabilities on the fly by downloading schemas, or by learning to perform a task more efficiently. Robots may also encounter partial robot failures that would alter their capabilities. However, it is still safe to assume that robotic capability updates will be relatively infrequent as compared to the potential corresponding updates for software agents.
4 Heuristic-based Multi-robot Coalition Formation A multi-robot coalition formation algorithm that leverages a heuristic from the multiagent coalition formation literature was developed [61]. The notion of coalition
J Intell Robot Syst (2007) 50:85–118
95
imbalance and its implications with respect to fault tolerance in multi-robot coalitions is an integral component of the algorithm which is briefly reviewed in this section. Shehory and Kraus [50] designed a heuristic-based algorithm for task allocation via agent coalition formation in Distributed Problem Solving (DPS) environments. The algorithm restricts the coalition structure space by restricting the size of individual coalitions to a fixed constant k. This heuristic fits well with the multi-robot task domain because very often task execution requires less than a fixed number of robots. Due to the differences between agents and robots outlined in Section 3, the current algorithm cannot be directly applied for multiple-robot coalition formation.
4.1 Computation vs. Communication The algorithm by [50] in its original form requires extensive communication and synchronization during the computation of coalition values. While this may be inexpensive for disembodied agents, it is often desirable to minimize communication in multiple-robot domains even at the expense of extra computation. The purpose of communication in Shehory and Kraus’ algorithm is to allow the agents to continuously update each others capability values and to avoid redundant computations. Since robot capabilities typically remain static, the need for continuous updates is eliminated. Therefore, we altered the algorithm to allow each robot to evaluate all coalitions in which it was a member and eliminated the need for communication in the coalition evaluation stage. Generally speaking, communication is more expensive and unreliable for robots than for agents. The impact of extensive communication for Shehory and Kraus’ algorithm was shown to be severe enough to endorse relinquishing communication in favor of additional computation when possible.
4.2 Task Format The multi-agent coalition formation algorithm [50] assumes that the agents have a capability vector, < b i1 , ..., b ri >. Agents are allowed to exchange resources; therefore an agent coalition is deemed to have sufficient resources to perform a task if each element of the vector sum of the agent capabilities is greater than the vector of capability requirements for the task. Shehory and Kraus’ algorithm assumes that the individual agents’ resources are collectively available upon coalition formation. The formed coalition freely redistributes resources amongst the members. However, this is not possible in a multiple-robot domain. Robot capabilities correspond to sensors (camera, laser, sonar, or bumper) and actuators (wheels or gripper) that cannot be autonomously exchanged. This implies that simply possessing adequate resources does not necessarily create a multiple-robot coalition that can perform a task, other locational constraints have to be represented and met. These constraints were represented using a task constraint graph and the coalitions were validated using arc-consistency to check for constraint violations. The Constraint Satisfaction Problem (CSP) formulation checks each candidate coalition to verify if its coalition is feasible. After constraint checking, fewer coalitions remain for further evaluation. While additional overhead is incurred during constraint checking, this overhead is somewhat compensated for by the reduced number of coalitions.
96
J Intell Robot Syst (2007) 50:85–118
4.3 Coalition Imbalance Coalition imbalance or lopsidedness is defined as the degree of unevenness of resource contributions made by individual members to the coalition. This characteristic is not considered in other coalition formation algorithms. A coalition where one or more agents have a predominant share of the capabilities may have the same utility as a coalition with evenly distributed capabilities. Recall from Section 3 that robots are typically unable to redistribute their resources. Therefore coalitions with one or more dominating members (resource contributors) tend to be heavily dependent on those members for task execution. These dominating members then become indispensable. Such coalitions should be avoided in order to improve fault tolerance. Over reliance on dominating members can cause task execution to fail or considerably degrade. If a robot is not a dominating member (i.e. does not possess many sensors) then it is more likely that another robot with similar capabilities can replace this robot. The Balance Coefficient metric [59] quantifies the coalition imbalance level. To accommodate coalition of different sizes, a new metric called the FTC was devised that subsumed elements of both balance and size [61]. 4.4 The Algorithm Our multi-robot coalition formation algorithm leverages the heuristic by [50] to obtain a more tractable space of coalition structures. The multi-robot coalition formation algorithm is iterative and a task is allocated at each iteration. Within each iteration, the algorithm proceeds in two stages: 1. All possible coalitions are distributively calculated and the initial coalition values are computed. 2. Agents agree on the preferred coalitions and form them. Stage 1-Preliminary coalition evaluation: Initially each agent Ai has a list of agents A and a list of coalitions Clist in which Ai is a member. Ai performs the following steps for each coalition C in Clist : 1. Calculate the coalitional capabilities vector (Bc ) by summing the capabilities of the coalition members. Formally, BC = Ai ∈C BAi . 2. Form a list (Ec ) of the expected outcomes of the tasks in set T S when coalition C performs those tasks. For each task t j ∈ T S: (a) Determine the necessary capabilities for task t j. (b) Compare t j’s capability vector Bj to the sum of the coalition capabilities Bc . t (c) If ∀ i, b i j ≤ b iC then utilize the CSP formulation (Section 4.2) to verify the locational sensor constraints for the coalition members. (d) If the constraints are met, then calculate t j’s expected net outcome (e j) with respect to C by subtracting the cost of unutilized resources from the net task-value. This is the expected net outcome (e j) of the coalition-task pair < C, t j >. Place < e j, C, t j > into Ec . (e) Choose the highest valued coalition-task pair from Ec and place it in a set HCT of highest valued coalition-task pairs. At the end of Stage 1, each agent has a list of coalition-task pairs and coalition values.
J Intell Robot Syst (2007) 50:85–118
97
Stage 2-Final coalition formation: Each agent (Ai ) iteratively performs the following: 1. 2. 3. 4. 5. 6. 7. 8.
Locate in HCT the coalition-task pair < Cmax , tmax > with the highest value emax . Retain in HCT all coalition-task pairs with values within a bound (5%) of emax . Calculate the FTC for all coalition-task pairs in HCT . Broadcast the coalition-task pair with the highest FTC, < C FTC , t FTC > along with the coalition value e FTC . Choose the coalition, task pair < Chigh , thigh > with the highest value ehigh from all broadcasted coalition pairs. If Ai is a member of coalition Chigh , join Chigh , return. Delete from Clist coalitions containing members of Chigh . Delete the chosen task thigh from T S.
The above steps are repeated until all the agents are deleted, until there are no more tasks to allocate, or none of the possible coalitions is beneficial. Simulation experiments were performed in the Player/Stage [28] simulation environment to demonstrate the effect that the balance coefficient would have on the fault tolerance of robot teams. The algorithm was then transferred to a set of pioneer robots on a set of real world tasks. Experiments were conducted with foraging, boxpushing and sentry duty tasks [61] in order to show that the algorithm operates independent of task methodology and task diversity. The translation of the simulation experiments to real world robot experiments required a few adjustments to account for problems of slippage and mapping of capabilities to real numbers. However, the experiments successfully establish that the algorithm performs satisfactorily with real robots. The algorithm was successfully deployed to allocate a set of diverse multirobot tasks.
5 Overlapping Coalitions for Precedence Ordered Tasks Our initial algorithm provides a heuristic-based multi-robot coalition formation algorithm. The described algorithm iteratively assigns each coalition to an independent task that the coalition is responsible for executing. This section examines the multirobot coalition formation problem for the special case of precedence ordered tasks or tasks that have a temporal partial ordering between them. The domain introduces new complexities to the coalition formation problem due to the existence of interdependencies between the various tasks. An example where precedence ordered tasks would be of practical interest is the blocks world. Consider the task of arranging a set of blocks starting from an initial configuration into a final configuration as shown in Fig. 1. In this example, blocks A1 and B1 have to be arranged into their final position before block D1 can be placed on top of them. Similarly D1 and D2 must be arranged prior to the arrangement of D3 . Thus the order of task execution must be consistent with: t D2 , t D1 t D3 . Assume that coalitions of robots have been assigned the tasks of arranging the different blocks. Thus a robot X may be a member of all three coalitions responsible for the arrangement of D1 , D2 and D3 in Fig. 1 since these tasks have to be executed in order. In other words, there may be coalitions whose members overlap. However, closer examination reveals that the twin tasks of arranging blocks D1 and D2 are independent of each other, hence they may be executed in parallel or in any random
98
J Intell Robot Syst (2007) 50:85–118
Fig. 1 A blocks world (Shehory and Kraus 1998) precedence ordered task
order. Therefore, it would be more efficient to assign disjoint coalitions to these tasks to allow for parallel arrangement of D1 and D2 . This idea can be generalized into a sequence of precedence ordered tasks where some intermittent sub-sequences may consist of independent tasks. Efficient solutions in this case would execute the tasks in such a subsequence in parallel by assigning disjoint coalitions to as many tasks in the subsequence as possible. Subsequences of tasks for which interdependencies exist may be performed using overlapping coalitions. 5.1 Structure of Tasks A high level planner, such as GraphPlan [5], may be used to develop the overall partial ordering of the tasks. From this plan, a precedence order graph can be constructed, with each node of the graph representing a task and edges representing dependencies between tasks. Figure 2 provides a precedence graph for a set of tasks where tasks t1 and t2 have no outstanding dependencies, whereas task t3 may be performed only after t1 and t2 are completed. Each task has a utility associated with the task. The utility of a task ti will depend on: 1. The number of tasks dependent on the completion of ti . 2. The resource requirements for ti .
J Intell Robot Syst (2007) 50:85–118
99
Fig. 2 Precedence order graph for a set of tasks. Each task independently has identical utility (u) and the utilities are propagated backwards from the leaves with a discount factor α
Formally, the utility of each task is evaluated by calculating the resources it consumes in addition to the utility of dependent tasks, propagated backwards from the leaf nodes and weighed by a discount factor α (0 < α < 1). The utilities from two branches are summed at the intersecting node. Thus, tasks that have a greater number of immediately dependent tasks are assigned a higher utility. For example, in the precedence graph in Fig. 2, assuming that all independent tasks have an identical task utility (u), then in the precedence ordered graph task T1 will have a higher utility (u + 3uα) than task T2 (u + 2uα + α 2 u) even though there are three tasks dependent on both T2 and T1 . 5.2 The Precedence Ordered Coalition Formation Algorithm The idea is to find the largest subset of tasks consistent with the task ordering that can be executed with the currently available set of robots. This is achieved by continuously updating the set of executable tasks and free agents to perform those tasks, to extend the coalition formation algorithm outlined in Section 4.4. Formally the precedence ordered algorithm is: Initialization: Each agent stores in memory the precedence order graph generated by a high level planner. The agents extract all tasks that have no unfulfilled prerequisite tasks. Call this list of candidate tasks the candidate list, Tcand . Each agent also maintains a list of free agents, Fa , representing agents not currently engaged in performing a task. 1. Coalitions are formed from the agents in Fa for performing the tasks in Tcand using the coalition formation algorithm from Section 4. Agents that are assigned to a task are removed from Fa . Similarly, allocated tasks are removed from Tcand . 2. Upon completion of a task t j by a coalition C, the lowest numbered member Ai of coalition C broadcasts the coalition task pair < t j, C >. 3. Upon receipt of a task completion message, < t j, C >, each agent performs the following: (a) Add to the list of free agents, Fa , the members of the coalition C that completed task t j.
100
J Intell Robot Syst (2007) 50:85–118
(b) Check the precedence graph and include in Tcand any new tasks that no longer have outstanding dependencies after the execution of t j. The above steps are repeated until there are no more tasks remaining to be executed in the precedence order graph. 5.3 Precedence Ordered Tasks Experiments Section 5.2 presented an algorithm to extend the application of the multi-robot coalition formation algorithm to domains where tasks have a partial ordering between them. This section presents experiments, both in simulation and the real world, demonstrating the successful application of the algorithm. The tasks employed for these experiments are the box-pushing and patrol tasks. The box-pushing task involves a coalition of two robots navigating through the environment and pushing a box in tandem. The participating robots must be equipped with laser range finders, be mobile, and have bumpers. The patrol task involves a pair of patrol robots navigating to and exploring a room. In order to accomplish the task, the robots must be equipped with laser range finders, a map of the environment, and sensors to detect contaminants.
Fig. 3 A simulated urban indoor task environment
J Intell Robot Syst (2007) 50:85–118
101
5.3.1 Simulation Experiment The environment is a mapped indoor urban building with rooms and corridors. Boxes are placed at specific locations in the building to block access to rooms or other corridors, as shown in the simulated environment in Fig. 3. Thus there are two types of tasks to be performed by the robots in this environment, namely box-pushing and patrol. Eight robots were simulated for this experiment. The four pusher robots (robots R1 –R4 in Fig. 3) were equipped with bumpers, laser range finders, and could communicate with each other. The four patrol robots (robots R5 –R8 in Fig. 3) did not have bumpers but had simulated sensors that could be used to detect contaminants. All robots had a map of the environment. The utility for all independent box-pushing and patrol tasks was identical. Thus, the overall utility of a task T in the precedence order graph depended largely on how many tasks were dependent on T. The precedence order graph for the given set of tasks depicted in Fig. 3 is provided in Fig. 4. The precedence ordered MRCF algorithm was utilized to allocate and perform tasks dynamically. The robots formed coalitions on the fly to perform tasks as they became eligible for execution. Figure 5 demonstrates the various box-pushing and patrol tasks being performed by the different coalitions according to the precedence order graph in Fig. 4. Figure 5a shows Robots R1 and R2 performing task T1 . Completion of task T1 unblocks tasks T2 , T3 and T4 ; however since only two box-pushing tasks can be performed at a time, tasks T2 and T4 are allocated coalitions {R3 , R4 } and {R1 , R2 } as shown in Fig. 5b. Once, T2 and T4 are executed, tasks T5 , T6 , T7 , T8 , and T10 become unblocked. Figure 5c shows the robot coalitions navigating to perform tasks T3 , T5 , and T10 . Task T3 is being performed in Fig. 5d by coalition {R1 , R2 }, thereby unblocking task T12 . Figure 5e shows T12 being performed by a coalition {R5 , R8 } of patrol robots and T5 being performed by a coalition of pushers {R3 , R4 }. Figure 5f shows tasks T6 and T10 being performed by coalitions of box-pushers {R1 , R2 } and patrol robots {R6 , R7 } respectively. Tasks T7 , T8 and T11 are shown performed in Fig. 5g by coalitions {R1 , R2 }, {R3 , R4 } and {R6 , R7 } respectively. Finally, T13 and T14 are performed by coalitions {R6 , R7 } and {R5 , R8 } in Fig. 5h.
Fig. 4 The precedence order graph for the tasks in Fig. 3
102
J Intell Robot Syst (2007) 50:85–118
(a)
(b)
(c)
(d)
(e)
(f)
(g)
(h)
Fig. 5 Application of the precedence ordered MRCF algorithm to search the indoor urban environment
J Intell Robot Syst (2007) 50:85–118
103
5.3.2 Real Robot Experiments Two real robot experiments were performed, in the first experiment a building corridor was mapped and boxes were placed, as in the simulation environment as shown in Fig. 6a. The environment contained two rooms that were to be patrolled. The rooms were connected by a corridor and two boxes blocking each room. The task precedence graph is provided in Fig. 6b. Four robots, two patrol robots, and two pusher robots were employed. The coalitions were formed to perform each task as a task became eligible for execution, as demonstrated in Figs. 7a–7f. Figure 7a shows the robots at their starting positions. Initially two pusher robots coalesce to push a box and unblock a room (task T3 performed in Fig. 7b), which prompts the patrol robots to coalesce and explore the unblocked room (task T4 performed in Figs. 7c and 7d). The pushers meanwhile unblock a second room (task T2 performed in Fig. 7e) and the patrol robots then visit and cover the second unblocked room (task T1 performed in Fig. 7f). All robots had a map of the environment and utilized a Monte Carlo localization algorithm to determine their individual positions in the environment. The second experiment involved forming coalitions to perform three box-pushing tasks. The objective was to manipulate the boxes so that they form a T-shape. Figures 8a–8b show the initial and desired final configuration of the boxes. The third task was dependent on the successful execution of the first two independent tasks as shown in Fig. 8c. Figure 9a shows the robots at their starting positions. Figure 9b shows the two tasks T1 and T2 being performed by two coalitions of pushers. Figures 9c shows a coalition navigating to perform T3 . Finally, T3 is performed by the same coalition that performed task T1 in Fig. 9d. Robotic domains frequently involve temporal ordering between tasks, the above experiments demonstrate that the extended algorithm can be applied to form overlapping coalitions to perform these tasks on the fly, as they become eligible for execution.
Fig. 6 a The urban indoor task environment (Real Robot Task). b Precedence order graph for the task environment
(a)
(b)
104
J Intell Robot Syst (2007) 50:85–118
(a)
(b)
(c)
(d)
(e)
(f)
Fig. 7 a Robots at the staring position. b Pushers coalesce to push a box and unblock a room. c Patroller robots coalesce to explore first unblocked room. d Pushers push a box to unblock the second room. e Coalition of robots patrolling the first room. f Patrollers visit second unblocked room
J Intell Robot Syst (2007) 50:85–118
(a)
105
(b)
(c)
Fig. 8 Initial a and final b configuration of boxes to form a T-shape (Real Robot Task). c Precedence order graph for the task environment
5.4 Limitations The heuristic-based coalition formation techniques described thus far suffer from some limitations. The robot sensors and actuators are assigned a static initial value instead of being valued dynamically based upon demand and supply. Thus, a sensor may be have a high cost despite having very few tasks (or none) competing for it. Consequently, the overall solution cost would increase resulting in relatively inefficient allocations. Heuristic based algorithms work well when the size of coalitions is limited; however when the tasks require coalitions of a larger size the solution can be arbitrarily bad. Particularly in the case of the algorithm described in Section 4.4, the solution quality would significantly decrease if the tasks frequently require coalitions whose size exceeds the maximum allowed coalition size. Also, the algorithm does not take into account the redundancy in robotic sensory capabilities that can make the problem more tractable. The RACHNA system was designed for competitive and random allocation environments in light of these limitations.
6 The RACHNA System: Market-based Coalition Formation This Section describes RACHNA,2 a market based system for coalition formation in competitive environments. RACHNA alleviates some of the problem associated with heuristic-based coalition formation by allowing tasks to compete for robot resources and autonomously adjusting sensor values based on supply and demand. Also, unlike the heuristic-based algorithms, RACHNA does not impose any restrictions on coalition size. 6.1 Conception As mentioned in Section 2 market-based task allocation techniques have gained a lot of popularity in the multi-robot community due to their inherently distributed
2 Robot
Allocation through Coalitions using Heterogeneous Negotiating Agents.
106
J Intell Robot Syst (2007) 50:85–118
(a)
(b)
(c)
(d)
Fig. 9 a Robots at the starting positions. b Robots coalesce to perform the two independent boxpushing tasks. c Two robots then form a coalition to perform the dependent box-pushing task. d Dependent task performed
protocols and the overall efficiency generated due to competition between robots. A common feature of the market based systems, discussed in Section 2, is that all these systems require the robots to bid on the tasks. The bidding process is central to determining the auction outcome. Therefore when dealing with complex tasks, the bidder should have a global view of the available resources. We propose a system in which the bidding is reversed. The auction is performed by the tasks for the individual robot services. This allows for the bidding to be performed with the global information necessary for coalition formation. One of the most prominent differences between the multi-agent and multi-robot domains (Section 3) is the level of redundancy in multi-robot and software-agent capabilities. Whereas software-agents are simply code fragments programmed by individuals, robots are manufactured on a large scale. Therefore, robots are more likely to have greater redundancy in their sensor/actuator capabilities. Indeed, almost any modern robotics facility will have a number of robots with identical
J Intell Robot Syst (2007) 50:85–118
107
capabilities. To the best of our knowledge, RACHNA is the first system to leverage this redundancy in order to enable a more tractable formulation of the coalition formation problem. RACHNA achieves this through the formulation of the coalition formation as a multi-unit combinatorial auction.3 While the use of single good auctions allows the bidders to bid on only one good, combinatorial auctions permit bidding on combinations of goods. This work focuses on a particular type of combinatorial auction called multi-unit combinatorial auction. Definition The auctioneer has a set of items, M = 1, 2,..., m to sell. The auctioneer has some number of each item available: U = {u1 , u2 , ..., um }, ui ∈ Z +. The buyers submit a set of bids, B = {B1 , B2 , ..., Bn }. A bid is a tuple Bj = (γ j1 , ..., γ jm ), pj , where γ jk ≥ 0 is the number of units of item k that the bid requests, and pj is the price. The Binary Multi-unit Combinatorial Auction Winner Determination Problem (BMUCAWDP) is the problem of labeling the bids as winning or losing so as to maximize the auctioneers revenue under the constraint that each unit of an item can be allocated to at most one bidder: max
pj x j s.t.
n
γ ji x j ≤ ui , i = 1, 2, . . . , m.
(8)
j=1
The MRCF problem can be cast as a combinatorial auction with the bidders being represented by the tasks, the items as the different types of robots, and the price as the utility that each task has to offer. Unfortunately, the BMUCAWDP problem is inapproximable [42]; however some empirically strong algorithms do exist [33, 42]. It remains to be seen if such algorithms can be decentralized sufficiently to apply them beneficially to a multi-robot setting. 6.2 The Architecture The RACHNA system, reverses the bidding. The auction is performed by the tasks for the individual robot services. This allows for the bidding to be performed with the global information necessary for coalition formation. There are two types of software agents that are involved in the task allocation: 1. Service Agents: The Service Agents are the mediator agents through which the tasks must bid for a service. RACHNA requires that each robot has a set of services or roles that it is capable of performing. The roles are determined by the individual sensor and behavioral capabilities resident on each robot. There is one service agent for each service type that a robot can provide. A service agent may communicate with any of the robots that provide the particular service to which the agent corresponds. For example, the foraging service agent may communicate with all robots that currently have sensor capabilities (i.e. camera and gripper) to perform the foraging service. Service agents reside on any one of the robots that are capable of providing the service. Thus, the global information
3 There
may be subtle unavoidable differences in robots with seemingly identical sensory capabilities due to wear and tear, loose wiring, sensor accuracy etc. These are ignored for the time being.
108
J Intell Robot Syst (2007) 50:85–118
concerning the task is acquired in a decentralized manner through the use of service agents. 2. Task Agents: Task Agents place offers on behalf of the tasks so as to acquire the necessary services. The task agents communicate only with the service agents during negotiations. Once the task has been allocated, the task agent may communicate directly with the robots that have been allocated to the task. Task agents reside on any remote machine with connectivity to the service agents. Figure 10 provides an overview of an example RACHNA architecture implementation with four service agents (Foraging, Pusher, Watcher, Mapper), N robots (R1 , R2 , ..., R N ) and a Task Agent bidding on the services. Each service has certain sensor requirements and only communicates with the relevant robots for purposes of allocation. An economy is proposed where the tasks are represented by task-agents that are bidding for the services of the individual robots. The economy has a set of robots R1 , R2 ..., R N , where each robot is equipped with sensor capabilities that enable it to perform various services such as pushing, watching, foraging, etc. The tasks are assumed to be decomposable into the sub-task behaviors (roles) that they require. For example, in the box-pushing task as defined by [26], two pusher sub-task roles are required and one watcher sub-task role is required. Each role is represented by a service agent that is responsible for negotiating with the robots that have the desired capability. The roles may be implemented through the use of behavior sets [38]. The bids are relatively sparse compared to the overall space of coalitions and will yield a more tractable formulation of the MRCF. Also, unlike the heuristic-based algorithms for coalition formation (Sections 4 and 5), no restriction is placed on the size of the desired coalitions.
Fig. 10 An example RACHNA Implementation
J Intell Robot Syst (2007) 50:85–118
109
6.3 Robot Failure If a robot loses control of a sensor or actuator in RACHNA, the system allows for graceful performance degradation. Since there is a mapping from sensor capabilities to behavioral capabilities, if a sensor failure occurs a robot may still be capable of performing an alternative behavior. Consider a robot that is capable of performing the foraging and watcher behaviors. If the robot’s gripper is damaged, it will be unable to execute the foraging behavior but it may still be able to perform the watcher behavior. The foraging service agent system simply deletes the robot from the list of foragers and in future auctions this robot will not receive offers relating to the foraging behavior. The robot will remain in the list of watchers because the relevant sensors for watching (camera) are still intact. Thus, the system allows for graceful performance degradation. It should be noted that the current system makes no attempt at fault detection. 6.4 The Allocation Environments This work has incorporated three different types of tasks: 1. Urgent: Urgent tasks are allowed to pre-empt an ongoing standard task and generally have a higher average reward per robot. These tasks require immediate attention such as emergency tasks that include fire extinguishing, rescue tasks, etc. 2. Standard: Standard tasks are allocated only when there are sufficient free resources and when the utility of these tasks is sufficient to merit allocation. These tasks may be pre-empted by urgent tasks. Loosely coupled tasks like foraging or tasks that may easily be resumed may fall into this category. 3. Non pre-emtible: Non-preepmptible tasks are allocated similarly to standard tasks but they cannot be pre-empted. Tightly coupled tasks fall into this category since once a tightly coupled task has been initiated, pre-emption would completely debilitate task performance. We consider two different types of allocations in our system: 1. Instantaneous Allocation: This scenario involves the introduction of a number of tasks into the environment and the algorithm must allocate resources to the optimal set of tasks. 2. Random Allocation: This scenario involves introduction of a single urgent task that requires immediate attention. The urgent task is allowed to tempt the robots into working for itself by offering a higher rewards. 6.4.1 Instantaneous Allocation Instantaneous assignment utilizes multiple round auctions in order to allocate multiple tasks. The system is responsible for allocating resources to the tasks such that the overall utility is maximized. Recall that services correspond to goods, robots correspond to units of a particular good (service), and task offers correspond to bids. RACHNA adapts the MRCF problem to a distributed environment to allow the system to leverage the inherent redundancy in robotic sensory capabilities, thereby creating a more tractable formulation of the MRCF. The algorithm proceeds as
110
J Intell Robot Syst (2007) 50:85–118
follows, the auction begins with each task agent sending request messages to the individual service agents. The service agents attempt to obtain the minimum possible price for the requested services by evaluating the minimum salaries that the robots are currently earning. The agents then attempt to lure the robots to the new task by and adding a minimum increment value to their current salaries. The service agents then send this information to the task agents. The task agents determine if they have sufficient utility to purchase the required services. If this is the case, then the services (robots) are temporarily assigned to the task. This kind of offer-counteroffer proceeds in a round robin fashion with the salaries of the robots increasing at every step until no service (robot) changes hands during the round. At this point the final stable solution is reached. The formal algorithm is provided in Section 6.5. 6.4.2 Random Allocation Random Allocation involves the random introduction of urgent tasks into the environment and the tasks are allocated robot services according to a negotiation or bargaining process between tasks. The negotiation proceeds as follows, the new task submits a request to the required service agents for a certain number of services of that type. The service agents take into account the current salaries of the robots and allow for a bargaining process to ensue between tasks. Tasks compete with each other by increasing robot salaries until either the new task successfully acquires robots from other tasks, or waits until more resources are freed. 6.5 The Algorithm Initially, each Service Agent SAi maintains a set of all possible robots, Rob otsi that are capable of performing that particular service. Additionally, SAi is aware of all possible services a robot in Rob otsi can perform and keeps track of which service it is currently performing. All robot salaries are initialized to zero. In order to submit a successful bid, a task agent must place an offer so as to increase the current salaries of the robots by a minimum increment, εc . Each task agent has a fixed utility, U c that is used for bidding on the services. Whenever a task discovers that it can no longer place a successful bid, it relinquishes its robots and decreases their salary by an amount εc . 6.5.1 Preprocessing – –
Initially all task agents submit bids to all relevant service agents. Each service agent SAi then evaluates the following heuristic: scorei =
numbidsi .q(i) avgunitsi
(9)
where, numbidsi is the total number of task agents bidding for service provided by SAi , qi is the number of robots capable of performing service SAi , avgunitsi is the average number of total units requested by these task agents. Once, scorei is evaluated for each service agent SAi , this score is broadcast to all the service agents and the service agents order themselves according to this score.
J Intell Robot Syst (2007) 50:85–118
111
Awarding of Services Upon receipt of a bid bj from Task Agent TAk requesting m robots capable of performing SAi ’s service, SAi performs the following steps: – –
–
If m > |Robotsi | ignore the bid (i.e. not enough robots). Evaluate Sm , the sum of the current m lowest salaries of robots in |Robotsi | that are not already awarded to TAk by a higher ranked service agent (according to the last received broadcast). If Sm + mc ≤ bj 1. 2.
– –
Temporarily award the selected robots to the task. Send a message to the purchased robots to increment their salaries by εc and to change their current task.
Else continue. Upon receipt of a disown signal from a task agent TAk , a Service Agent SAi does the following: 1. 2.
Search Robotsi for any robots that are currently assigned to TAk . Decrement the salaries of the robots by εc and send a message to those robots.
Whenever a robot receives a message for a change in salary from a service agent, the robot sends a message to all connected service agents informing them of its new payoff, its new task, and the service it is currently required to provide. Bidding by Task Agents 1. Each task agent TAk receives periodic broadcasts from each service agent SAi informing it of the current salaries and tasks of the robots performing SAi ’s service. 2. If TAk has already been awarded all the required services, continue. 3. Compute the net utility, UReq , required to place an acceptable offer to each service agent in order to purchase the required services (by increasing the salaries of the currently lowest paid robots by εc ). 4. If UReq < U Curr –
Place the offers to the service agents.
5. Else –
Send disown message to all service agents that have robots currently assigned to task TAk .
6. Upon receipt of the terminate message from all service agents, the task has been awarded the requested robots. If no message is broadcast by a service agent for a fixed period of time, Tmax , the auction is deemed closed and the allocation process is stopped. The messaging protocol for the algorithm is depicted in Table 2. 6.6 Experiments Preliminary experiments were conducted by simulating the RACHNA system on a single computer. Experiment 1 recorded a comparative analysis of the RACHNA
112
J Intell Robot Syst (2007) 50:85–118
Table 2 RACHNA message types No.
Sender
Receiver
Type
Content
1 2 3 4 5
SA TA TA SA Robot
All SA SA Robot SA
Broadcast Unicast Unicast Multicast Multicast
Current salaries, tasks for all service robots Bid for service Disown Increment, decrement Update salary, task, service
system to simple task allocation techniques. A set of real world tasks were simulated in the Player/Stage environment to demonstrate task preemption in Experiment 2. 6.6.1 Utility Comparison The solution quality produced by the RACHNA algorithm was compared to that obtained by simple task allocation schemes. Figure 11 provides the comparison between the solution quality produced by the RACHNA system to those produced by the global greedy and random allocation algorithms. Twenty trials were run for each experiment. The allocation produced by an algorithm by [33] that utilizes a variant of A* search is also provided. This algorithm yields solutions that are either optimal or very close to optimal and therefore provides a reasonable upper bound on solution quality. The graph demonstrates that RACHNA’s solution quality is still sub-optimal, but in a distributed approach such as the one suggested, this is inevitable. As is evident from the figure, RACHNA easily outperforms both the
Fig. 11 Comparison of greedy, random, and A* allocations to RACHNA
J Intell Robot Syst (2007) 50:85–118
113
Table 3 Different services and corresponding sensor requirements Services
Capabilities
Foraging Pushing Object tracking Sentry-duty
Robots
LRF
Camera
Bumper
Gripper
Sonar
0 1 0 1
1 0 1 0
0 1 0 0
1 0 0 0
1 0 1 0
R1 , R2 R3 , R4 , R6 , R7 R1 , R2 , R5 , R8 R3 , R4 , R6 , R7 , R9 , R10
greedy and random allocation algorithms. The reason is that unlike greedy or random search, RACHNA refines the solution in each auction round to include better tasks (bids). 6.6.2 Preemption Experiments The Player/Stage environment was utilized for the set of experiments focused on task preemption in order to formulate a set of real world tasks. The experiments involved a set of four services and ten heterogeneous robots, as shown in Table 3. The following services were employed: 1. Foraging: This service involved robots scouting the environment for objects of interest and retrieving them to a goal destination. 2. Pushing: This service involved robots moving into position and pushing a box. 3. Object-tracking: This service involved robots keeping mobile items of interest under surveillance. 4. Sentry-duty: This service required the robots to monitor a particular area for motion. The sensor capabilities of each robot determines which services it can perform. For example Robots 1 and 2 have Camera, Gripper, and Sonar sensors and can perform Foraging and Object Tracking tasks. The experiment involved four heterogeneous tasks as outlined in Table 4. For example, Task 1 required two foraging robots and one sentry-duty robot. While Tasks 1 and 3 were Standard tasks (Section 6.4), Task 2 was Non-preemptible, and Task 4 was Urgent. Initially the first three tasks were introduced using the algorithm described in Section 6.5. Figure 12a shows the simulation of the first three tasks. Subsequently, Task 4 was introduced into the environment and the bargaining process resulted in Task 3 being preempted, as shown in Fig. 12b. Table 4 Different tasks and corresponding service requirements Tasks
1 2 3 4
Services
Priority
Foraging
Pushing
Object-tracking
Sentry-duty
2 0 0 0
0 2 1 2
0 0 2 1
1 1 0 0
Standard Non-premptible Standard Urgent
114
J Intell Robot Syst (2007) 50:85–118
(a)
(b)
Fig. 12 a Allocation of Tasks 1, 2 and 3 prior to introduction of Task 4. b Allocation of Task 4 and pre-emption of Task 3
The salary increments and the associated task for each robot after each auction round is shown in Table 5. An auction round involves bidding by all tasks at least once. Prior to introducing Task 4 in round 1, the robots are allocated to Tasks 1, 2, and 3 for minimum salaries. After round 2, Task 4 has temporarily acquired R1 , R3 , and R7 by offering a higher salary to each. As the competition increases for the pusher and object tracker services, all robots possessing these resources draw higher salaries with each successive auction round. Auction round 8 involves Task 3 exhausting its utility and being forced to relinquish R7 and R8 . The salaries of these robots are lowered by the minimum increment (εc = 5). Eventually, Task 4 acquires robots R6 and R8 from Task 3 at the end of round 8.
Table 5 Salary increments for different robots during the task allocation Robot
R1 (F, OT) R2 (F, OT) R3 (P, SD) R4 (P, SD) R5 (OT) R6 (P, SD) R7 (P, SD) R8 (OT) R9 (SD) R10 (SD)
Auction round 1
2
3
4
5
6
7
8
5, T1 5, T1 5, T2 5, T2 5, T3 5, T3 – 5, T2 5, T1 5, T2
10, T4 5, T1 10, T4 5, T2 5, T3 5, T3 5, T4 5, T2 5, T1 5, T2
15, T1 10, T4 10, T4 10, T3 5, T3 10, T2 5, T4 5, T3 5, T1 5, T2
15, T1 15, T1 10, T4 15, T4 10, T4 10, T2 10, T2 10, T4 5, T1 5, T1
15, T1 15, T1 15, T3 15, T4 15, T3 15, T4 10, T2 10, T4 5, T1 5, T1
15, T1 15, T1 20, T2 15, T4 15, T3 15, T4 10, T2 15, T3 5, T1 5, T1
15, T1 15, T1 20, T2 20, T2 20, T4 15, T4 15, T3 15, T3 5, T1 5, T1
15, T1 15, T1 20, T2 20, T2 20, T4 15, T4 15, T4 10, – 5, T1 5, T1
J Intell Robot Syst (2007) 50:85–118
115
6.7 Discussion The RACHNA system was designed to overcome some of the limitations of the heuristic-based approaches to coalition formation. Unlike the heuristic-based approaches RACHNA does not assign a predetermined value to the robot sensors and allows sensor valuation based on supply and demand. Also, while most heuristicbased approaches operate by imposing restrictions on the coalition space (for example by limiting coalition size), RACHNA does not impose any restrictions on coalition size. However the RACHNA system does suffer from some limitations, foremost of which is the additional communication overhead as compared to the heuristic-based approach. Currently, the RACHNA system has been implemented in simulation, and pending implementation with real world robots, the system cannot be completely validated. The scalability of the messaging protocol in a distributed environment has to be evaluated and an empirical study of communication load with an increasing number of tasks needs to be conducted. Message synchronization is another issue that may require additional attention in a real world environment to ensure that robots send each other meaningful messages and are able to detect communication failure. RACHNA is also designed to handle partial robot failure in an efficient manner by simply allowing robots to disconnect themselves from the lost services. However, experiments need to be conducted to demonstrate RACHNA’s ability to handle partial and complete robot failure. An associated limitation is that the system makes no attempt at fault detection. The RACHNA system generally yields high quality sub-optimal solutions however the system is highly sensitive to parameters of salary increment and robot diversity. Also as compared to heuristic search based techniques, the system takes longer to converge to a solution. However, the system is more decentralized and allows for dynamic valuation of robot capabilities. There are inherent tradeoffs between a market-based and search based coalition formation techniques. Users choosing between these techniques must be aware of these issues and be able to identify the key issues for a particular application. 7 Summary and Future Directions Finding the optimal multi-robot coalition for a task is an extremely important problem that has received relatively little attention. However, solutions to similar unsolved problems from theoretical computer science may be tailored to yield solutions to the coalition formation problem. This paper identifies such parallel problems and discusses their applicability to coalition formation. In order to draw on the considerable work done in software agent coalition formation, this paper also highlights the difference between software agent and robotic agents. A heuristicbased multi-robot coalition formation algorithm from our previous work was then extended to allow for coalition formation in cooperative environments with precedence ordered tasks. The extended algorithm was then demonstrated to work in simulation and on a set of Pioneer-DX robots. The extended heuristic algorithm suffered from some limitations that led to the conception of the RACHNA system for competitive task environments. The paper
116
J Intell Robot Syst (2007) 50:85–118
further explores RACHNA, which is a market-based distributed task allocation scheme based on the multi-unit combinatorial auction problem. RACHNA reverses the auction scheme found in other market based coordination schemes by allowing tasks to bid on robot services rather than the other way around. Experiments were conducted to demonstrate that the system produces high quality sub-optimal solutions and enables a far more tractable formulation of the coalition formation problem by leveraging the redundancy in robot sensor capabilities. Finally, this paper investigated the idea of preemption of a complex multi-robot task to allow for environments that permit random allocation. An example of task preemption was demonstrated in simulation. Future work involves real world task experiments with the RACHNA system to demonstrate dynamic pre-emption of tasks. Other aspects that require evaluation include the robustness to communication and partial or complete robot failure.
References 1. Abdallah, S., Lesser, V.: Organization-based cooperative coalition formation. In: Proceedings of the IEEE/WIC/ACM International Conference on Intelligent Agent Techonology, IAT, pp. 162–168 (2004) 2. Balas, E., Padberg, M.: Set partitioning: a survey. SIAM Rev. 18, 710–760 (1976) 3. Balch, T., Arkin, R.C.: Communication in reactive multiagent robotic systems. Auton. Robots 1(1), 1–25 (1994) 4. Bar-Yehuda, R., Even, S.: A linear time approximation algorithm for the weighted vertex cover problem. J. Algorithms 2, 198–203 (1981) 5. Blum, A.L., Furst, M.L.: Fast planning through planning graph analysis. Artif. Intell. 90, 281–300 (1997) 6. Borgwardt, K.-H.: Some distribution-independent results about the asymptotic order of the average number of pivot steps of the simplex algorithm. Math. Oper. Res. 7, 441–461 (1982) 7. Botelho, S.C., Alami, R.: M+: a scheme for multi-robot cooperation through negotiated task allocation and achievement. In: Proceedings of IEEE International Conference on Robotics and Automation, pp. 1234–1238 (1999) 8. Brooks, R.A.: A robust layered control system for a mobile robot. IEEE J. Robot. Autom. 2(1), 14–23 (1986) 9. Caloud, P., Choi, W., Latombe, J.-C., Pape, C.L., Yim, M.: Indoor automation with many mobile robots. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 67–72 (1990) 10. Chu, P.C., Beasley, J.E.: A genetic algorithm for the set covering problem. Eur. J. Oper. Res. 94, 396–404 (1996) 11. Chvatal, V.: A greedy heuristic for the set-covering problem. Math. Oper. Res. 4(3), 233–235 (1979) 12. Collins, J., Jamison, S., Mobasher, B., Gini, M.: A market architecture for multi-agent contracting. Technical Report 97-15, University of Minnesota, Department of Computer Science (1997) 13. Dahl, T.S., Matari´c, M.J., Sukhatme, G.S.: Multi-robot task-allocation through vacancy chains. In: Proceedings of IEEE International Conference on Robotics and Automation, pp. 14–19 (2003) 14. Dantzig, G.B.: Linear Programming and Extensions. Princeton University Press, Princeton, NJ (1972) 15. DeVries, S.,Vohra, R.: Cominatorial auctions: a survey. INFORMS J. Comput. 135, 284–309 (2003) 16. Dias, M.B.: TraderBots: a new paradigm for robust and efficient multirobot coordination in dynamic environments. Ph.D. thesis, The Robotics Institute, Carnegie Mellon University (2004) 17. Dijkstra, E.W.: The mathematics behind the banker’s algorithm. In: Dijkstra, E.W. (ed.) Selected Writings on Computing: A Personal Perspective, pp. 308–312 (1977)
J Intell Robot Syst (2007) 50:85–118
117
18. Dyer, M.E., Wolsey, L.A.: Formulating the single machine sequencing problem with release dates as a mixed integer programming problem. Discrete Appl. Math. 26, 255–270 (1990) 19. Fass, L.F.: Automatic-theoretic view of agent coalitions. Technical Report WS-04-06, American Association of Artificial Intelligence Workshop on Forming and Maintaining Coalitions and Teams in Adaptive Multiagent Systems (2004) 20. Fisher, M., Kedia, P.: Optimal solution of set covering/partitioning problems using dual heuristics. Manage. Sci. 36(6), 674–688 (1990) 21. Garrido, A., Salido, M., Barber, F., López, M.: Heuristic methods for solving job-shop scheduling problems. In: PuK (2000) 22. Gasser, L.: Social knowledge and social action. In: International Joint Conference on Artificial Intelligence, pp. 751–757 (1993) 23. Gerkey, B., Matari´c, M.J.: A framework for studying multi-robot task allocation. In: Proceedings of the Multi-robot Systems: From Swarms to Intelligent Automata, vol. 2, pp. 15–26 (2003) 24. Gerkey, B., Matari´c, M.J.: Are (explicit) multi-robot coordination and multi-agent coordination really so different? In: Proceedings of the AAAI Spring Symposium on Bridging the Multi-agent and Multi-robotic Research Gap, pp. 1–3 (2004) 25. Gerkey, B.P., Matari´c, M.J.: MURDOCH: Publish/subscribe task allocation for heterogeneous agents. In: Proceedings of Autonomous Agents, pp. 203–204 (2000) 26. Gerkey, B.P., Matari´c, M.J.: Pusher–watcher: an approach to fault-tolerant tightly-coupled robot coordination. In: Proceedings of the IEEE International Conference on Robotics and Automation, pp. 464–469 (2002a) 27. Gerkey, B.P., Matari´c, M.J.: Sold!: auction methods for multi-robot coordination. IEEE Trans. Robot. Autom. (Special issue on multi-robot systems) 18(5), 758–768 (2002b) 28. Gerkey, B.P., Vaughan, R.T., Stoy, K., Howard, A., Sukhatme, G.S., Matari´c, M.J.: Most valuable player: a robot device server for distributed control. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1226–1231 (2001) 29. Klee, V., Minty, G.: How good is the simplex algorithm? In: Shisha, O. (ed.) Qualities, vol. III, pp. 159–175 (1972) 30. Korte, B., Vygen, J.: Combinatorial Optimization: Theory and Optimization. Springer, Berlin (2000) 31. Laengle, T., Lueth, T.C., Rembold, U., Woern, H.: A distributed control architecture for autonomous mobile robots. Adv. Robot. 12(4), 411–431 (1998) 32. Lang, S.: An extended banker’s algorithm for deadlock avoidance. IEEE Trans. Softw. Eng. 3, 428–432 (1999) 33. Leyton-Brown, K., Shoham, Y., Tennenholtz, M.: An algorithm for multi-unit combinatorial auctions. In: Proceedings of the 17th National Conference on Artificial Intelligence, pp. 56–61 (2000) 34. Li, X., Soh, L.-K.: Investigating reinforcement learning in multiagent coalition formation. Technical report no. WS-04-06, American Association of Artificial Intelligence Workshop on Forming and Maintaining Coalitions and Teams in Adaptive Multiagent Systems, pp. 22–28 (2004) 35. Lin, L., Zheng, Z.: Combinatorial bids based multi-robot task allocation. In: Proceedings of the International Conference on Robotics and Automation, pp. 1145–1150 (2005) 36. Low, K.H., Leow, W.K., Ang Jr., M.H.: Task allocation via self-organizing swarm coalitions in distributed mobile sensor network. In: Proceedings of the American Association of Artificial Intelligence, pp. 28–33 (2004) 37. Parker, L.E.: The effect of action recognition and robot awareness in cooperative robotic teams. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 1, pp. 212–219 (1995) 38. Parker, L.E.: ALLIANCE: an architecture for fault tolerant multi-robot cooperation. IEEE Trans. Robot. Autom. 14(2), 220–240 (1998) 39. Sadeh, N.M., Fox, M.S.: Variable and value ordering heuristics for the job shop scheduling constraint satisfaction problem. Artif. Intell. 86, 1–41 (1996) 40. Sadeh, N.M., Sycara, K., Xiong, Y.: Backtracking techniques for the job shop scheduling constraint satisfaction problem. Artif. Intell. 76, 455–480 (1995) 41. Sandholm, T.: An implementation of the contract net protocol based on marginal cost calculations. In: Proceedings of the Eleventh National Conference on Artificial Intelligence, pp. 256–262 (1993) 42. Sandholm, T.: Algorithm for optimal winner determination in cominatorial auctions. Artif. Intell. 135, 1–54 (2002)
118
J Intell Robot Syst (2007) 50:85–118
43. Sandholm, T., Lesser, V.: Advantages of a leveled commitment contracting protocol. In: Proceedings of the Thirteenth National Conference on Artificial Intelligence, pp. 126–133 (1996) 44. Sandholm, T.W., Larson, K., Andersson, M., Shehory, O., Tomhe, F.: Coalition structure generation with worst case guarantees. Artif. Intell. 111(1,2), 209–238 (1999) 45. Sandholm, T.W., Lesser, V.R.: Coalition formation among bounded rational agents. In: Proceedings of International Joint Conference on Artificial Intelligence, pp. 662–669 (1995) 46. Schneider, J., Apfelbaum, D., Bagnell, D., Simmons, R.: Learning opportunity costs in multirobot market based planners. In: Proceedings of the International Conference on Robotics and Automation, pp. 1151–1156 (2005) 47. Schrijver, A.: Theory of Linear and Integer Programming. Wiley, Amsterdam (1986) 48. Shehory, O., Kraus, S.: Task allocation via coalition formation among autonomous agents. In: Proceedings of International Joint Conference on Artificial Intelligence, pp. 655–661 (1995) 49. Shehory, O., Kraus, S.: Formation of overlapping coalitions for precedence ordered taskexecution among autonomous agents. In: Proceedings of International Conference on Multi Agent Systems, pp. 330–337 (1996) 50. Shehory, O., Kraus, S.: Methods for task allocation via agent coalition formation. Artif. Intell. 101(1,2), 165–200 (1998) 51. Smith, R.G.: The contract net protocol: high level communication and control in a distributed problem solver. IEEE Trans. Comput. 29(12), 1104–1113 (1980) 52. Sorbella, R., Chella, A., Arkin, R.: Metaphor of politics: a mechanism of coalition formation. Technical Report WS-04-06, American Association of Artificial Intelligence Workshop on Forming and Maintaining Coalitions and Teams in Adaptive Multiagent Systems (2004) 53. Spielman, D.A., Teng, S.-H.: Smoothed analysis: why the simplex algorithm usually takes polynomial time. In: Proceedings of ACM Symposium on Theory of Computing, pp. 296–305 (2001) 54. Srinivasan, V.: A hybrid algorithm for the one machine sequencing problem to minimize total tardiness. Nav. Res. Logist. Q. 18, 317–327 (1971) 55. Stearns, R.E.: Convergent transfer schemes for n-person games. Trans. Am. Math. Soc. 134, 449–459 (1968) 56. Stentz, A., Dias, M.B.: A free market architecture for coordinating multiple robots. Technical Report CMU-RI-TR-01-26, The Robotics Institute, Carnegie Mellon University (1999) 57. Sycara, K., Zeng, D.: Coordination of multiple intelligent software agents. Int. J. Intell. Coop. Inf. Syst. 5(2-3), 181–211 (1996) 58. Tang, F., Parker, L.E.: Coalescing multi-robot teams through ASyMTRe: a formal analysis. In: Proceedings of the IEEE International Conference on Advanced Robotics, pp. 817–824 (2005) 59. Vig, L., Adams, J.A.: Issues in multi-robot coalition formation. In: Parker, L.E., Schneider, F.E., Schultz, A.C. (eds.) Proceedings of Multi-Robot Systems. From Swarms to Intelligent Automata, Washington, DC, vol. 3, pp. 15–26. Springer, Berlin (2005) 60. Vig, L., Adams, J.A.: Market-based multi-robot coalition formation. In: Gini, M., Voyles, R. (eds.) Distributed Autonomous Robotics Systems, Minneapolis, vol. 7, pp. 227–236. Springer, Berlin (2006a) 61. Vig, L., Adams, J.A.: Multi-robot coalition formation. IEEE T. Robot. 22(4), 637–649 (2006b) 62. Wagner, H.M.: An integer programming model for machine scheduling. Nav. Res. Loqist. Q. 6, 131–140 (1959) 63. Werger, B., Matari´c, M.J.: Broadcast of local eligibility: behavior based control for strongly cooperative multi-robot teams. In: Proceedings of Autonomous Agents, pp. 21–22 (2000) 64. Wu, L.S.: A dynamic theory for the class of games with nonempty cores. SIAM J. Appl. Math. 32, 328–338 (1977) 65. Zlot, R., Stentz, A.: Complex task allocation for multiple robots. In: Proceedings of the IEEE Conference on Robotics and Automation, pp. 67–72 (2005)