TOP DOI 10.1007/s11750-016-0410-7 ORIGINAL PAPER
Clustering intelligent transportation sensors using public transportation Tejswaroop Geetla1 · Rajan Batta1,2 · Alan Blatt2 · Marie Flanigan2 · Kevin Majka2
Received: 23 December 2014 / Accepted: 5 February 2016 © Sociedad de Estadística e Investigación Operativa 2016
Abstract Advanced transportation sensors use a wireless medium to communicate and use data fusion techniques to provide complete information. Large-scale use of intelligent transportation sensors can lead to data bottlenecks in an ad-hoc wireless sensor network, which needs to be reliable and should provide a framework to sensors that constantly join and leave the network. A possible solution is to use public transportation vehicles as data fusion nodes or cluster heads. This paper presents a mathematical programming approach to use public transportation vehicles as cluster heads. The mathematical programming solution seeks to maximize benefit achieved by covering both mobile and stationary sensors, while considering cost/penalty associated with changing cluster head locations. A simulation is developed to capture realistic considerations of a transportation network. This simulation is used to validate the solution provided by the mathematical model. Keywords
Sensor placement · Data fusion · Simulation · Optimization methods
Mathematics Subject Classification
90CXX · 65K05 · 00AXX
1 Introduction Modern day road transportation has a significant role in the well being of an economy and improves the quality of life for all people. However, with the increased speed at
B
Rajan Batta
[email protected]
1
Department of Industrial and Systems Engineering, University at Buffalo (SUNY), Buffalo, NY, USA
2
Center for Transportation Injury Research, CUBRC, Buffalo, NY, USA
123
T. Geetla et al.
which goods and passengers are transported there are dangers associated with road travel. As reported, in 2013, 35,200 fatalities occurred on US highways and 3.8 million crash injuries required medical attention. These dangers in road transportation and the increase in travel time due to congestion caused by road accidents have prompted highway transportation authorities to examine various resources to help alleviate the situation. One of the many ways this danger can be minimized and the transportation infrastructure improved, is through use of advanced sensors. A sensor in this context can be any instrument that measures a physical quantity or provides information on an event/incident, and includes inductive loop detectors, traffic cameras, advanced automated crash notification (AACN) and infrared traffic flow sensors. These sensors measure the speed at which traffic moves on roads and detect flow disruptions due to traffic incidents. The Federal Highway Administration (FHWA) has invested in sensor research and infrastructure with a goal to monitor the road transportation system and improve its safety and reliability. For example, a smart phone accompanying a passenger in every car serves as an intelligent sensor monitoring the geographic position and speed of the car. This active monitoring is superior to road-side sensors which are passive and monitor only a certain road stretch. Another important feature of wireless sensors is the combination of data from multiple sensors, often referred to as multi-sensor data fusion. Multi-sensor data fusion helps in authenticating the occurrence of an event and provides multiple data points to measure and estimate a particular event. This process avoids sensor errors and provides statistically significant measurements. A typical Intelligent Transportation Systems (ITS) uses many sensors for incident detection and management. These sensors can be classified as stationary and mobile. Stationary sensors are fixed on the road side whereas mobile sensors are on vehicles. Sensors on public vehicles (e.g., buses) are prime candidates for not just sensing, but also for performing data fusion operations. Data fusion is the application of mathematical techniques so as to combine data from multiple sensors. As an example, sound, change of speed and weather all can provide a comprehensive picture of crash. Data fusion was initially developed for Department of Defense applications in the 1980’s, and utilized tools from statistics, signal processing and artificial intelligence. Hall and Llinas (1997) contains many illustrative examples of the use of data fusion to improve information quality and event characterization. Wireless sensor networks (WSN’s) that include mobile and stationary sensors have no fixed architecture. Since mobile sensors are associated with moving vehicles, they move in and out of transportation network under consideration. Every time a new sensor joins this network, it has to self-organize itself to the present network architecture. This type of self-organizing network architecture is commonly referred to as an adhoc network. Another unique feature of such a wireless sensor network is the ability to perform local data fusion to decrease data transfer requirements between sensors and to minimize the formation of data bottlenecks. In large ad-hoc wireless sensor networks, sensors group themselves into clusters. Each cluster chooses a cluster head, which collects data from each sensor in the cluster, fuses its data and transmits the fused information to a central location. This cluster head can be one of the sensors in the cluster. Essentially, the cluster head serves as a data fusion and transmission node
123
Clustering intelligent transportation sensors…
for the cluster. Cluster formation in large wireless sensor networks helps in minimizing data bottlenecks, conserves the energy of sensors with limited battery capacity, and stabilizes the network. Intelligent transportation sensors that are similar to other wireless sensor networks can form ad-hoc networks and choose to form clusters to avoid data bottlenecks and use data fusion techniques to improve their estimation. In some of the clustering mechanisms employed in wireless sensor networks, cluster heads can be similar to all the other sensors in the field or cluster heads can have special characteristics that enable them to be cluster heads. Mainwaring et al. (2002) proposes the use of weather sensors to monitor a large area, in which any sensor can function as a potential cluster head. In this work, we assume that all sensors in the network can communicate locally, however, only a specific set of sensors can function as cluster heads. In real-road wireless sensor networks there are a myriad of sensors interacting with each other; some of the sensors are owned by road network users and some by infrastructure managers. Sensors owned by road users may not be available for cluster head responsibilities and may also have limited functionality. Keeping this constraint, we chose limited sensors, potentially public transport vehicles to act as cluster heads. To summarize, the focus of this work is to improve the use of intelligent transportation sensors (stationary and mobile), which have wireless communication capability and are capable of transmitting their data over short distances. These sensors need clustering to form an ad-hoc network where data are transmitted between multiple sensor classes. Cluster formation by choosing a small set of cluster heads, helps stabilize the network, avoids data duplication and minimizes the formation of data bottlenecks. In our model, we assume that a set of sensors with added features serve as cluster heads/data fusion motes. In real world applications these special set of sensors can be public transport vehicles (transit bus, school bus, etc.) that have the potential to serve as cluster heads. Our assumption is that public transport vehicles can be equipped with sensors and computational abilities. This makes them both mobile sensors and ideal candidates to serve as cluster heads as they have the computational power to perform data fusion operations. Another key characteristic of public transport vehicles is that their schedule is often known in advance and thus we have the ability to plan the cluster head choice much more systematically than that for mobile sensors associated with private automobiles. In the mathematical model that serves as an approximation, we assume that the location of both the potential cluster heads and the mobile and stationary sensors is available for the time horizon chosen. Whereas, in the simulation model we assume only the potential cluster heads follow a schedule and that mobile sensors move randomly according to the average annual traffic data. The remainder of this paper is structured to include background literature, mathematical formulation, case study and a simulation framework. In Sect. 2, we present a survey on related work in the area of applications-based clustering in sensor networks. In Sect. 3, the clustering problem is formulated into an Integer Linear Program (ILP). In Sect. 4, we develop multiple test case problems, to test the problem formulation using CPLEX. In Sect. 5, we present results from solving the clustering problem formulation using CPLEX. In Sect. 6, we develop a simulation to evaluate the solutions. In Sect. 7, we summarize the findings and present future research directions.
123
T. Geetla et al.
2 Related work in clustering in sensor networks Wireless ad-hoc sensors and their capabilities to share information and form networks, creates an intelligent transportation network. These capabilities evolved over time. We present relevant research on clustering algorithms in sensor networks. Most WSN’s use some sort of grouping into clusters to handle large number of sensors and the data generated from all the sensors. In each cluster, the group of sensors selects a cluster head, collects all the sensor data inside its cluster and transmits this information to the base center (or a central location) where all the data are collected. Clustering is introduced to WSN, because it has proven to be an effective approach to provide improved data aggregation and scalability for large WSN (Boyinbode et al. 2010). Clustering creates a hierarchical network topology and the routing table information is maintained by the cluster heads. Clustering solves the network topology and location management necessary for an ad-hoc network. A clustering mechanism can also choose the zone routing protocol (ZRP) where zones can be the clusters. Forming clusters in a large group of sensors involves two tasks. First, the sensors group themselves into separate/disjoint sets and second each cluster selects a cluster head. Younis and Fahmy (2004) presents an energy saving cluster formation, Basu et al. (2001) presents a MOBIC algorithm for clustering in an environment where the sensors move and clustering is done to minimize the communication distance. To motivate the type of WSN we are considering, we assert that an effective trafficincident management system needs the following functions: (a) rapid detection of crashes; (b) characterization of a crash and crash scene for incident management purposes; (c) crash related information to drivers and highway managers to reduce congestion. Sensors are used in incident-detection technology. These sensors are either stationary (e.g., acoustic and traffic cameras) or mobile (mounted on vehicles). A WSN in this context needs to deal with data fusion and sensor management associated with both stationary and mobile sensors. Data fusion applies mathematical techniques to combine data from multiple sensors. Application of this technique to incident detection helps combine data from multiple sensor types to detect traffic incidents. Based on the WSN description above, we now discuss the methods used for sensor clustering in two separate categories, one which uses distributed/centralized heuristics and the other which uses mathematical programming techniques.
2.1 Clustering using distributed/centralized heuristics In a recent paper, Mehr (2014) presents a cluster head selection scheme for WSN. This builds upon previous work by Heinzelman et al. (2002) who develop a LEACH method to achieve energy efficiency in the communication between sensor nodes. Another related work is the Weighted Clustering Algorithm (WCA) which selects a node as a cluster head based on a number of factors, and further restricts the number of sensors in a cluster. In recognition of an iterative design, Chan and Perrig (2004) developed an ACE algorithm that uses three rounds of feedback. Yet another method is based on the chemical recognition system of ants (Labroche et al. 2002). All of these methods and many more, which are eloquently discussed in the review paper
123
Clustering intelligent transportation sensors…
by Afsar and Tayarani (2014), use heuristic algorithms that are either distributed or centralized. Furthermore, the focus is on situations when scalability is a major issue and in which solutions are needed in a short amount of time due to the quick evolution of the sensor network. 2.2 Clustering using mathematical programming approaches Sorokin et al. (2009) presents a review of mathematical programming techniques used for sensor networks. These include problems in sensor network localization, sensor scheduling, and communication network interdiction. There are few examples, however, of mathematical programming techniques being used in clustering of sensor networks. Patel et al. (2005) is one such effort. The approach is suitable for cases where scalability is not of paramount importance, and where solution quality (and provability of solution quality) is important. Since this approach is what we build upon, it is reviewed below in some depth. It should be mentioned that other researchers have attempted to use mathematical programming formulations for clustering sensor networks, c.f. the paper by Furuta et al. (2009). Patel et al. (2005) focuses on optimally locating cluster heads (fusion nodes), which collect data from sensors within a range, for all the time periods, so as to maximize the coverage of the sensors and minimize the relocation of these cluster heads. Patel et al. (2005) assumes that the sensor movement patterns are known and ignores specific communication issues. In Patel et al. (2005) the authors formulate the optimal location of cluster heads in a threat environment, where the sensors are under constant threat from the enemy and maximize the communication capabilities for the mission duration, as an integer programming problem. Note that the crucial difference between the rulebased algorithms discussed in Sect. 2.1 and Patel et al. (2005) is that the latter assumes the positions of all sensors are known for all time periods. This assumption allows them to formulate the problem as an ILP. Patel et al. (2005) formulates the problem as a variant of maximal expected coverage location problem (MEXCLP), which is derived from maximal coverage location problem. MEXCLP is an NP-hard problem and the formulation is for multiple time periods, making the problem difficult to solve using the branch and bound solution methodologies usually applied to IP formulations. Patel et al. (2005) proposes a column generation heuristic (CG) approach to solve the IP formulation. The results indicate that the CG approach is very effective. This CG approach has been extended to other instances of the clustering problem. 2.3 Relationship of our work with the literature Our work lies within the realm of using mathematical programming techniques for clustering. More specifically, we develop upon the MEXCLP variant examined in Patel et al. (2005), in which a discrete time model is established to allow for movement of sensors as time evolves and the possibility of reassignment of cluster heads to sensors that are better positioned to communicate and fuse information. By reassigning cluster heads as time evolves, more efficient communication and processing of sensor data
123
T. Geetla et al.
are established. However, too frequent reassignment of cluster heads leads to overhead loss, as newly established cluster heads have to receive all of the connectivity tables and information for sensors which they are responsible for.
3 Formulation of the clustering problem in road traffic networks An integer linear programming formulation is developed for the clustering of wireless intelligent transportation sensors. 3.1 Variables The decision variables for the clustering problem are: (1) xk,t , this variable is assigned 1 if trip k is chosen as a cluster head for time period t and takes a value of 0 otherwise. (2) Variable γs,k,t is assigned 1 if the mobile sensor s is assigned to trip k in time period t and is assigned a value 0 otherwise. This variable chooses sensors assigned to each cluster head. (3) Variable zl,t,k is assigned 1 if the stationary sensor l is assigned to trip k in time period t and is assigned a value 0 otherwise. This variable selects a suitable stationary sensor near a cluster head. (4) Variable νk,t is assigned a value of 1 if trip k is chosen as cluster head in time period t, but changes the cluster head location in time period t + 1 or if trip k is chosen as cluster head in time period t, but is not a cluster head in time period t − 1. This variable keeps track of the number of cluster head changes in time horizon. (5) Variable wl,t is assigned a value of 1 if stationary sensor l is assigned to a cluster head at time t, 0 otherwise. (6) Variable ws,t is assigned a value of 1 if mobile sensor s is assigned to a cluster head at time t, 0 otherwise. 3.2 Input data The input data needed for the model are (1) ϒs,k,t , is given a value of 1 if the mobile sensor s is in the communication range of trip k in time period t and takes a value of 0 if the sensor is outside the communication range. (2) Z l,k,t , is given a value of 1 if the stationary sensor l is in the communication range of trip k in time period t and takes a value 0 if its outside the communication range. If the communication range of the cluster head or the sensor changes, both ϒs,k,t and Z l,k,t must be updated. (3) Sk , E k are the start and end times of each trip. Similar to a bus trip in public transport system, our cluster head has a predefined route, start time and end time. A potential cluster head can function as a cluster head only if the time period falls in between the start and end time. (4) Sk,t is given a value 1 if trip k is moving through the road network in time period t. If there are changes in the trip schedule or more trips are added to the model the values of Sk,t needs to be updated. (5) Cs,k,t defined as the benefit for covering a mobile sensor s by trip k in time period t. The model is formulated so as to maximize the coverage, so if a mobile sensor is covered, it is described as a profit. Similarly (6) Cl,k,t is defined as the profit for covering a stationary sensor l by trip k in time period t. As discussed earlier, stationary sensors contain valuable information like
123
Clustering intelligent transportation sensors…
sound levels (acoustic sensor), traffic counts (traffic count sensors), weather (weather information). Thus, covering such sensors is of importance and is explicitly modeled by the term Cl,k,t . (7) C is defined as the cost of changing a cluster head. In real world clustering, stability of a cluster is important to avoid constant exchange of routing table. A routing table is a data table consisting of information about the position of all the existing sensors in the network. Whenever a cluster head changes, this routing table needs to be exchanged between the old cluster head and the new one. To avoid this constant exchange, a penalty is added every time there is a change in cluster heads. (8) Bk is defined as the bandwidth of each trip k. The value of Bk is assumed to constant in every time period. A bandwidth of a communication device can be imagined as a resource. This resource can be shared with all the intelligent sensors communicating with the cluster head. If the data transmitted to the cluster head exceeds the allowed bandwidth, the cluster head rejects any new data packets received in the time period. 3.3 Constraints γs,k,t ≤ xk,t ∀ k, t
(1)
γs,k,t ≤ ϒs,k,t ∀ s, k, t xk,t ≤ Sk,t ∀ k, t
(2) (3)
zl,k,t ≤ xk,t ∀ k, t zl,k,t ≤ Z l,k,t ∀ l, k, t γs,k,t ∗ ws,t + zl,k,t ∗ wl,t ≤ Bk ∀ k, t
(4) (5)
s
(6)
l
νk,t ≥ xk,t−1 − xk,t ∀ k, t νk,t ≥ xk,t − xk,t−1 ∀ k, t xk,t ≤ N ∀ t
(7) (8) (9)
k
γs,k,t ≤ 1 ∀ s, t
(10)
zl,k,t ≤ 1 ∀ l, t
(11)
k
k
Constraint (1) makes sure that the mobile sensors are only assigned to a bus trip only if the bus trip is chosen as a cluster head. Constraint (2) permits only those mobile sensors which are in communication range of the cluster head. Constraint (3) allows the cluster head to be chosen only if the potential cluster head between the start and end time of the trip. Constraint (4) assigns stationary sensors to a potential cluster head only if the potential cluster head is chosen to act as a cluster head. Constraint (5) assigns stationary sensors to a cluster head only if the stationary sensor is in the communication range of the cluster head. Constraint (6) treats the bandwidth as a resource; each cluster head has a limited bandwidth to be shared with all the mobile and stationary sensors. If a potential cluster head is chosen to act as a cluster head, its limited bandwidth needs to be assigned to mobile and stationary sensors. Constraints
123
T. Geetla et al.
(7) and (8) define variable νk,t . Constraint (9) puts a limit on the maximum number of cluster heads in every time period. This constraint improves the network stability and minimizes the data overload at the central base station where all the sensor information is sent. Constraints (10) and (11) assumes that the sensor shares its data with only one cluster head. These constraints avoid data duplication. 3.4 Objective function Maximize
Cs,k,t ∗ γs,k,t +
s,k,t
l,k,t
Cl,k,t ∗ zl,k,t −
C ∗ νk,t
(12)
k,t
The first term in (12) counts the coverage of mobile sensors, the second term counts the coverage of stationary sensors, and the third term subtracts the cost/penalty for changing the cluster heads. Taken together, (12) represents the profit achieved due to cluster head selection. In Sect. 4, test cases are developed to use CPLEX® to solve the integer programming problem in several instances. These tests give insights into the computation time and other factors affecting the problem.
4 Study area for case study We develop multiple test cases to understand the computation effort needed to solve the clustering problem using CPLEX. In Henchey et al. (2013) and Geetla et al. (2014) a study area was chosen to understand the working of intelligent transportation sensors in a data fusion environment. We use the same study area. The area chosen is near the SUNY at Buffalo North Campus and consists of a major urban-arterial road network with Interstate highways passing through the area. This area sees significant traffic during the daytime and has very high traffic volumes during the morning and evening rush hours. The simulation model developed in Henchey et al. (2013) uses average annual traffic data from Greater Buffalo-Niagara Regional Transportational Council (2010). We use the same data to construct the test case model. The test case model derived from Geetla et al. (2014) and Henchey et al. (2013) lacks the feature of public transport vehicles. These data for public transport vehicles are added as an additional layer to the derived model. For this input we use Fig. 1. Figure 1 has 20 nodes, each node is assumed to be a fixed point on the map. We assume that each node is exactly one time period away from an adjacent node. Figure 1, also presents the 60 stationary sensors spread in the study area. Each stationary sensor is assumed to have fixed communication range and communicates with a public transport vehicle moving inside its communication range. 4.1 Case study 1 Case study 1 is a small-scale problem to be solved in CPLEX® from the formulation developed in Sect. 3.3. For this case study we assumed that there are 60 public transport bus trips. Each bus trip starts at a specific time period from one of the nodes and stops its route after some time at a different node. Table 1 shows the start and end times
123
Clustering intelligent transportation sensors…
Fig. 1 Study area with stationary sensors
of the trips and also shows the routes chosen for study case 1. Using this trip and route data a list of all stationary sensors covered by each trip for every time period is prepared as input for CPLEX® . In this study case we assumed that there are 100 mobile sensors moving in the area chosen and the clustering problem is solved for 10 time periods using CPLEX® . In this test case, we assumed that each cluster head has a limited bandwidth of 100 units and each mobile sensor if chosen needs 4 units of the bandwidth. The stationary sensors are assumed to require higher bandwidth. In this test case problem a stationary sensor needs 25 units of bandwidth if chosen by a cluster head. The solution achieved through CPLEX reaches an optimal solution in 1 min with its default settings. One of the implications from this result is for a similar small-scale sensor problems CPLEX will be very effective. 4.2 Case study 2 Case study 2 is an extension of case study 1. Keeping the number of trips at 60 and stationary sensors at 60, we increase the number of mobile sensors from 100 to 500. We use the bandwidth assumptions from the case study 1 for this case study. As the 500
123
T. Geetla et al. Table 1 Trip data for case study 1 Trip number
Route
Start time
1
1–2–8–9–10
4
8
2
20–8–2–1
1
4
3
1–3–4
1
3
4
4–3–1–2
4
7
5
1–3–4–16
2
5
6
8–9
1
2
7
10–9–8–2
1
4
8
18–19
7
8
9
2–6–5
2
4
10
19–18–17–15–16
1
5
11
12–11–6–5–7–14–13–18
2
9
12
6–5
3
4
13
20–8–15–16
4
7
14
3–6–11
1
3
15
4–3–1–2–6–5–7–14
2
9
16
1–2–6–5–7
1
5
17
5–6–2–8–9
4
8
18
12–13–14
6
8
19
5–11
2
3
20
11–5
1
2
21
17–18
1
2
22
13–18
8
9
23
15–8–9–10
0
3
24
14–7–5–16–4–3–1–2
0
7
25
10–12–18–19
0
3
26
10–12–11–6–3–4–5
3
9
27
1–2–8–9–10
2
6
28
4–3–1–2
2
5
29
4–3–1–2
4
7
30
15–16–5
3
5
31
18–19–14–13
3
7
32
15–17–18–19
3
7
33
2–6
7
8
34
19–18–12–10
2
5
35
10–18–12–10
5
8
36
2–8–9–10
5
9
37
5–6
6
7
38
16–4–3–1
6
10
39
15–8–2
6
8
40
3–1
6
7
123
End time
Clustering intelligent transportation sensors… Table 1 continued Trip number
Route
Start time
End time
41
12–13–14
7
9
42
18–17–11–13
7
10
43
4–16–5–11–15
1
5
44
17–15–16
2
4
45
12–17–11–15–8–9–10
2
8
46
16–4–3–6–5
6
10
47
7–5–6–2
0
3
48
20–8–15–11–17–18–19
0
6
49
19–18–17–11–15–8–20
2
8
50
4–3–1–2–6–5–16
1
7
51
20–8–15–16–4–3–1–2
2
9
52
11–5
8
9
53
9–8–2
8
10
54
1–3
9
10
55
5–6
9
10
56
4–16–5
0
2
57
20–8–9–10
0
3
58
10–9–8–20
0
3
59
6–11
8
9
60
3–1
8
9
mobile sensors move in road network, they come in close proximity of the potential cluster heads, which follow a predefined trips, with a fixed time schedule. The results from this case study indicate that CPLEX is capable of solving medium sized problems. CPLEX reaches a good solution in 6 min of computation time, though it is observed that CPLEX has some trouble reaching optimality. We note that case study 2 is a larger problem with more constraints. Keeping the number of trips at 60 keeps the problem relatively easy. The solution space for choosing the cluster head remains the same from case study 1. 4.3 Case study 3 Case study 3 is developed to understand the increase in computation time of the clustering problem as the size of the problem increases. We increase the size of the problem by considering 100 trips instead of 60. The results from this case study proves our intuition that CPLEX is incapable of solving large instances of the problem. This addition of 40 more trips makes the problem more complex and after 30 min of computation results in a gap of 294.84 %. This gap is the difference between the upper bound and the lower bound divided by the best available solution. This result for large-scale instances of the problem compelled
123
T. Geetla et al. Table 2 Summary of results from CPLEX computation Computation time (min)
Optimality gap
Case study 1
0.98
0 (optimal)
Case study 2
6.41
0 (optimal)
Case study 3
30
294.84 %
us to try other options. The other options explored here include heuristic solution methodologies, warm starts and CPLEX parameter tuning (Table 2).
5 Heuristic solutions, CPLEX tuning and warm starts To reach a reasonable solution in a limited computation time we turn to heuristic solutions, CPLEX parameter tuning and CPLEX warm starts. All these techniques have found significant success in other clustering problems. In this section, we explore these three techniques and present the computational results from case study 3. 5.1 Pre-mature CPLEX Pre-mature CPLEX is a name coined in Erdemir et al. (2008) to represent a time constrained CPLEX. For the case study problem presented in Sect. 4.3, we chose 10 min as the time constraint and found that CPLEX will in general find more than 1 solution. The gap between the upper bound and lower bound is 600 % after 10 min. Unlike other heuristics, pre-mature CPLEX may result in multiple feasible solutions. In our case, pre-mature CPLEX yielded 2 solutions. 5.2 No-relocation heuristic (NRH) and relocation heuristic (RH) No-relocation and relocation heuristic are borrowed ideas from Patel et al. (2005) to generate feasible solutions to the clustering problem. No-relocation heuristic chooses one set of cluster heads that serve in all time periods, whereas the relocation heuristic selects the best cluster heads in each time period. Both the heuristic solutions are easy to compute, intuitive and yield feasible solutions. No-relocation heuristic makes the ad-hoc wireless sensor network system stable whereas the relocation heuristic may force the system to an unstable network formation. For case study 3, no-relocation and relocation heuristics yield different solutions. When the two solutions are compared, relocation heuristic solution had a negative objective function, whereas no-relocation heuristic yielded solution with an objective very close to the optimality. This observation is very specific to the case study 3 and maybe a result of choosing a high value for the exchange penalty. Note that computations of heuristic solutions relax bandwidth constraints. Both heuristics ignore other constraints in the mathematical model and assume that every stationary and mobile sensor inside the detection radius is covered by the cluster head. This assumption
123
Clustering intelligent transportation sensors… Table 3 Comparing the three heuristics
Objective function value Pre-mature CPLEX default
5873
NRH
30,942
RH
−61,937
results in generating infeasible solutions in some cases and results in higher objective functions. 5.3 Comparing the three heuristics Results indicate that the no-relocation heuristic always generates a good solution. In some cases, the no-relocation heuristic may present a solution very close to optimality. We conclude that depending on the profit for coverage and the penalty for cluster head exchange, one of relocation or no-relocation heuristic may function well. Pre-mature CPLEX heuristic is more computationally intensive and yields good solutions in most cases. However, compared to the other heuristics presented in this section, premature CPLEX did not yield better solutions (Table 3). 5.4 CPLEX tuning CPLEX uses numerous parameters that define the branching and bounding rules. CPLEX provides the users access to these parameters to tune the branch and bound algorithm to function effectively for specific problems. We explored this CPLEX tuning option to improve the quality and objective function value of the clustering problem. Programmers at ILOG CPLEX suggest that users of the algorithm explore the three parameters: probe value, MIP emphasis and RINS heuristic frequency. Probe value parameter can take three values 1, 2 and 3. In default settings, probe value is set to 1. A higher probe value setting explores for more solutions from the current branch and bound tree. A higher value for probe value also increases the computation time, since the algorithm searches for new feasible solutions. The second CPLEX parameter chosen for tuning is MIP emphasis. The MIP emphasis parameter is set to 1, if the user is not searching for optimal solution, but is looking for feasible solutions. Higher values of MIP emphasis (2 and 3) force the CPLEX to search for optimal solution and places less emphasis on generating feasible solution. The third CPLEX parameter chosen for tuning is RINS heuristic frequency. RINS is short for relaxation induced neighborhood search, RINS heuristic relaxes some of the constraints at a node and searches for feasible solutions. RINS frequency parameter sets a value for how often to use this heuristic. If RINS heuristic frequency is 20, the RINS heuristic stops the branch and bound after 20 node intervals and searches for solutions around the current solution. This RINS heuristic search can be turned off by setting the value of the parameter to −1. Setting the value to 0 (zero), the default, applies the RINS heuristic at an interval chosen automatically by CPLEX (2005).
123
T. Geetla et al. Table 4 Results from CPLEX tuning
Objective function value after 30 min of CPLEX computation
Optimality gap (%)
Default CPLEX
10,407
294.84
MIP emphasis default; probe default; RINS 25
29,855
37.85
MIP emphasis 1; probe default; RINS default
31,442
30.69
MIP emphasis 1; probe default; RINS 25
29,855
37.85
Table 4 presents results which showed promise during our experiments. The best parameters for a variation of the case study 3 problem indicates that MIP emphasis value at 1 is the best setting. Though Table 4 presents a small set of experiment results, we found that all the other combinations of the parameter settings did worse compared to the selected three. 5.5 Warm start A warm start in CPLEX refers to providing the algorithm with a pre-found/partial solution to the problem. The heuristics no-relocation and relocation heuristic can both function as warm starts for the CPLEX model. From the results observed on a variation case study 3, the no-relocation heuristic generates good initial solution with a higher objective function value and helped CPLEX converge to a 32.98 % gap between the upper and lower bound in 30 min. Whereas, the relocation heuristic has a very low objective function and seems to work less effectively than CPLEX with warm start using the no-relocation heuristic (Table 5).
6 Simulation Simulation is a key tool in evaluation of sensor clustering traffic environment, since the traffic movement is a stochastic system. Henchey et al. (2013) develop a simulation methodology to imitate traffic and road networks. The simulation proposed also allows users to integrate sensors into the road systems to measure and evaluate sensor performance. Geetla et al. (2014) adopt the simulation model from Henchey et al. (2013)
123
Clustering intelligent transportation sensors… Table 5 Warm start CPLEX results Objective function value after 30 min of computation time
Optimality gap (%)
294.84
CPLEX without warm start
10,407
CPLEX with NRH
30,942
32.98
CPLEX with RH
19,783
107.98
to use in the evaluation of omnidirectional sensors for accident detection capabilities. Following this approach, we present the use of the simulation as an evaluation tool for the clustering problem. 6.1 Simulation as an evaluation tool Henchey et al. (2013) develops Automated Situational Awareness Platform (ASAP) using the Arena simulation software to understand the working and quantify the use of modern sensors in a road network system. The platform’s unique feature includes traffic, road signals, sensors and data fusion techniques in a single platform provides the users with many capabilities. In the sensor placement problem, Geetla et al. (2014) modified the simulation from Henchey et al. (2013) to include directional and omnidirectional sensors. This modified simulation measures the effectiveness of sensor placement. This modified simulation has three modules: (1) traffic generation module, (2) accident generation module, (3) sensor module. Traffic generation module generates the road network, including the car and truck entities, traffic signals and movement of the traffic in the system. This traffic generation module uses average traffic movement data from 2004–2009. The accident generation module generates crashes based on motor vehicle crash data from 2004–2009. The sensor module includes all the sensors and models all the sensor behavior in the simulation. A modified simulation constructed in Geetla et al. (2014) evaluates the solutions generated for the clustering problem. We developed a ‘Clustering Evaluation’ module in the simulation to suit the needs of the clustering problem. The first responsibility of this module is to route potential cluster heads/public transport vehicles (buses) that move along with traffic. The one characteristic/assumption of these potential cluster heads is that they follow a specific time schedule, i.e., if they reach a bus stop earlier than their scheduled time, these buses wait until the scheduled time at the bus stop. We modeled this behavior into the simulation model using a template for every road segment in the simulation model. The next role of the clustering evaluation module is to count all the mobile and stationary sensors that are in communication range of a potential cluster head. A potential cluster head travels through the traffic and its traveling speed on the road depends on the congestion levels. This dynamic movement through the road network causes the potential cluster head to encounter a large number of mobile sensors. These
123
T. Geetla et al.
mobile sensors are assumed to be inside the car and truck entities generated in the traffic creation module. In the case study 3 we assume 100 bus trips, 60 stationary sensors and 500 mobile sensors in the clustering evaluation module, and generate the fixed number of sensors in the road network at various traffic creation points at the start of the simulation. The mobile sensors are assumed to be automobiles or trucks, and stationary sensors are assumed to be road-count sensors. Examples of mobile sensors in our application include cell phones and AACN sensors. Incorporating this module into the ASAP platform allows the simulation to track the movement of the potential cluster heads and evaluate the number of mobile and stationary sensors that are in communication radius of the potential cluster head. For this simulation we turn off the accident creation module that creates a crash and crash related sensor data. The sensor module assumes data packets are received when a sensor comes into the communication radius of a potential cluster head. 6.2 Simulation methodology We now describe the simulation methodology used to evaluate solutions. A solution to the clustering problem selects a specific set of potential cluster heads in a time period. We select these chosen bus routes from all the trip data and simulate only the selected trips using the clustering evaluation model. For each solution, we run the simulation 50 times to capture the variance in the performance measurements. After each simulation, we calculate the sensor coverage of all the cluster heads and calculate the “objective function” performance measure while incorporating the constraints of bandwidth. This value is stored and the simulation proceeds to the next simulation run. 6.3 Simulation results: the benefits of clustering One of the many factors that affect the performance measure “objective function” is the number of cluster heads. As the number of cluster heads increase we achieve higher mobile sensor coverage, increasing the “objective function”. To explore the changes in “objective function”, we use case study 3 to provide solutions for five different values of input for number of cluster heads. We chose to vary the number of cluster heads value between 4 and 8. This solution from the CPLEX model is used according to the simulation methodology described in Sect. 6.1. Figure 2 shows the change in objective function value as the number of cluster heads increases. The 95 % confidence intervals calculated from 50 simulation runs are also presented along with the mean values. It is observed that the variance for all the cases is small compared to the change in the objective function by adding an extra cluster head. One of the observations from Fig. 2 is that when the number of cluster heads changes from 6 to 7 the mean value of the objective function increases from 20,799 to 21,459 an increase of 660. When the number of cluster heads increases from 7 to 8 cluster heads the objective function increases to 21,998, an increase of 539. Comparing 660 to 539 we believe that as the number of cluster heads the incremental gain in the
123
Clustering intelligent transportation sensors…
Fig. 2 Change in the ‘objective function’ with increase in number of cluster heads
objective function decreases. We recognize that increasing the number of cluster heads is equivalent to an “increase” in clustering. The higher the degree of clustering (i.e., more cluster heads) the greater the benefit in terms of data collected. As expected, there is a decreasing marginal return when additional clustering is performed.
7 Summary and future research We use a mathematical formulation approach for clustering of intelligent transportation sensors. We formulated this clustering problem in a scenario where public transport vehicles act as potential cluster heads and cars/trucks traveling on the road network act as mobile sensors. We also assumed that the potential cluster heads follow a specific schedule whereas the mobile sensors move randomly in the system. We formulated this discrete optimization problem using average annual traffic flow data. To understand the computation effort needed to solve this formulation, we designed case studies using the road network surrounding the University at Buffalo (UB), North Campus. We designed multiple instances of the case study to understand the factors that influence computation time. Observing the increase in computation time and optimality gap for large instances, we proposed CPLEX tuning, warm starts and heuristics to find good feasible solutions and decrease MIP computation time. From CPLEX tuning experiments we found that an MIP emphasis value of 1 and RINS Frequency value of 25 were ideal for our case study. From the heuristic solutions, we observed that the no-relocation heuristic yields good initial solutions and is found to be an ideal warm start for CPLEX. We developed a simulation module for testing the solutions from the mathematical model. This simulation module measures the coverage achieved by the potential cluster heads. We found that CPLEX was inadequate to handle the large data sets and was slow to reach good solutions. A proven approach in clustering problems to improve the
123
T. Geetla et al.
solution time is to use “Column Generation”. In Patel et al. (2005) the authors develop both the master and sub-problem necessary for a dynamic clustering problem. This is a suggested future research direction. Another research direction is to compare rule-based clustering heuristics with a mathematical formulation approach for this application. Acknowledgments The authors would like to thank the two anonymous who provided us with excellent advice and guidance to strengthen and improve our paper. This material is based on work supported by the FHWA under Cooperative Agreement No. DTFH61-07-H-00023, awarded to the Center for Transportation Injury Research, CUBRC, Inc., Buffalo, NY. Any opinions, findings and conclusions are those of the author(s) and do not necessarily reflect the view of the FHWA.
References Afsar M, Tayarani M (2014) Clustering in sensor networks: a literature survey. J Netw Comput Appl 46:196–226 Basu P, Khan N, Little T (2001) A mobility based metric for clustering in mobile ad hoc networks. In: 2001 international conference on distributed computing systems workshop, pp 413–418 Boyinbode O, Le H, Mbogho A, Takizawa M, Poliah, R (2010) A survey on clustering algorithms for wireless sensor networks. In: 2010 13th international conference on network-based information systems (NBiS), pp 358–364 Chan H, Perrig A (2004) Ace: an emergent algorithm for highly uniform cluster formation. In: Karl H, Wolisz A, Willig A (eds) Lecture notes in computer science, vol 2920. Springer, Berlin, Heidelberg, pp 154–171 Erdemir ET, Batta R, Rogerson P, Speilman S, Blatt A, Flanigan M (2008) Location coverage models with demand originating from nodes and paths: application to cellular network design. Eur J Oper Res 190(3):610–633 Furuta T, Sasaki M, Ishizaki F, Suzuki A, Miyazawa H (2009) A new clustering model of wireless sensor networks using facility location theory. J Oper Res Soc Jpn 52(4):366–376 Geetla T, Batta R, Blatt A, Flanigan M, Majka K (2014) Optimal placement of omnidirectional sensors in a transportation network for effective emergency response and crash characterization. Transport Res Part C: Emerg Technol 45:64–82 Greater Buffalo-Niagara Regional Transportational Council (2010) Geographic information system. http:// www.gbnrtc.org/index.php/planning/gis/. Accessed 3 June 2013 Hall D, Llinas J (1997) Introduction to multi-sensor data fusion. Proc IEEE 85:6–23 Heinzelman WB, Chandrakasan AP, Balakrishnan H (2002) An application-specific protocol architecture for wireless microsensor networks. IEEE Trans Wirel Commun 1(4):660–670 Henchey M, Batta R, Blatt A, Flanigan M, Majka K (2013) A simulation approach to studying emergency response in an advanced transportation system. J Simul 8:115–128 IBM ILOG CPLEX Support (2005) Cplex performance tuning for mixed integer programs. http://www-01. ibm.com/support/docview.wssuidswg21400023. Accessed 3 June 2013 Labroche N, Monmarche N, Venturini G (2002) A new clustering algorithm based on the chemical recognition system of ants. In: Proceedings of ECAI, pp 345–349 Mainwaring A, Culler D, Polastre J, Szewczyk R, Anderson J (2002) Wireless sensor networks for habitat monitoring. In: Proceedings of the 1st ACM international workshop on wireless sensor networks and applications, pp 88–97 Mehr M (2014) Cluster head election using imperialist competitive algorithm (chei) for wireless sensor networks. Int J Mobile Netw Commun Telemat 4(3):1–9 Patel D, Batta R, Nagi R (2005) Clustering sensors in wireless ad hoc networks operating in a threat environment. Oper Res 53(3):432–442 Sorokin A, Boyko N, Boginski V, Uryasev S, Pardalos PM (2009) Mathematical programming techniques for sensor networks. Algorithms 1:565–581 Younis O, Fahmy S (2004) Heed: a hybrid, energy-efficient distributed clustering approach for ad hoc sensor networks. IEEE Trans Mobile Comput 3(4):366–379
123