GeoInformatica 8:1, 71±98, 2004 # 2004 Kluwer Academic Publishers. Manufactured in The Netherlands.
A Probability-based Uncertainty Model for Point-in-Polygon Analysis in GIS CHUI KWAN CHEUNG,1 WENZHONG SHI,1 AND XIAN ZHOU2 1 Advanced Research Center for Spatial Information Technology, Department of Land Surveying and GeoInformatics, The Hong Kong Polytechnic University, Hunghom, Kowloon, Hong Kong E-mail:
[email protected] 2 Department of Applied Mathematics, The Hong Kong Polytechnic University, Hunghom, Kowloon, Hong Kong Received April 19, 2002; Revised September 10, 2003; Accepted September 30, 2003
Abstract In a geographical information system (GIS), a conventional point-in-polygon analysis is used to determine whether a point is located inside a polygon with a Boolean result. Due to positional uncertainties existing with both the point and the polygon, the Boolean answer cannot describe the relationship of closeness between the point and the polygon. This paper aims to develop a model and to provide a continuous indexÐthe probability valueÐto indicate the extent of the uncertain point located inside the uncertain polygon based upon existing research development. This probability index is derived based on probability and statistical theories considering the statistical uncertainty distributions of the point and the polygon's vertices. The associated mathematical expressions for the probability index are addressed in cases depending on the intersection between the polygon and the error ellipse of the point. The proposed probability index can provide an objective description of the relationship of closeness between an uncertain point and an uncertain polygon. Keywords: positional uncertainty, point-in-polygon analysis, error propagation, GIS
1.
Introduction
A point-in-polygon analysis is one of the major spatial analyses in a GIS, and normally offers a Boolean result of the question ``Is a point located inside a polygon?'' However, uncertainties of GIS data are introduced in the process of data capture ([1], [3], [6], [7], [9], [10], [14], [16]). It seems irrational to answer this question de®nitely when either a point or a polygon is uncertain. In this study, we propose an uncertainty model and derive a continuous index to show the relationship between an uncertain point and an uncertain polygon. Current uncertainty models for point-in-polygon analysis provide simpli®ed solutions of the analysis with certain assumptions. Stanfel et al. [17] estimate the probability of a precise point located inside an uncertain polygon by simulation. This probability value is determined from the location of the point with respect to the whole polygon: an error circle is created around each vertex of the polygon; exterior and interior tangents between two consecutive error circles are then formed, and two sets of the exterior tangents and of the interior tangents divide the extended polygon, which is an area bounded by the exterior
72
CHEUNG, SHI AND ZHOU
tangents, into different parts. Depending on the point located in different parts, the probability will be estimated based upon error distributions of the individual vertices of the polygon, which are supposed to be independent of each other. Stanfel et al. [17], however, neglect the uncertainty of the point to be determined whether it is located in the polygon; hence this is one of the point-in-polygon cases with uncertainty. Research by Leung and Yan [11] lists nine possibilities for the analysis with certainty (or precise) and with uncertainty (which is classi®ed into (i) random and (ii) imprecise due to the ambiguity inherent in natural phenomena): (a) precise point and precise polygon, (b) precise point and imprecise polygon, (c) precise point and random polygon, (d) imprecise point and precise polygon, (e) imprecise point and imprecise polygon, (f ) imprecise point and random polygon, (g) random point and precise polygon, (h) random point and imprecise polygon, and (i) random point and random polygon. In the meantime, the researchers clarify the fundamental issues for these nine cases in the point-in-polygon analysis based upon the assumption that all points located inside a random polygon have identical and independent error distribution. This assumption appears to describe the random polygon in a particular case, while Keefer et al. [8], [9] found that digitization error was autocorrelated and cross-correlated. An uncertainty model for providing the probability description on an uncertain point located inside an uncertain polygon is therefore proposed in this study based upon the existing development. We consider that (a) error ellipse should be used to describe the uncertainty of a point, as it is more rigorous than the error circle model, and (b) the uncertainties of the vertices of a polygon can be correlated. 1.1.
Problem de®nition
Let point P be a point with uncertainty, and simple polygon A be a polygon with uncertainty in a point-in-polygon analysis, where a simple polygon is a region enclosed by a single closed line that does not intersect itself. The probability Pr
P [ A of P located inside A is estimated analytically in this study. The main objectives of this research are: i. To model the point-in-polygon analysis under uncertainty existing with both the point and the polygon. ii. To provide a continuous index to describe the relationship of closeness between the uncertain point P and an uncertain simple polygon A objectively. iii. To examine the performance of the proposed uncertainty model of the point-inpolygon by examples. iv. To study a distribution of the probability Pr
P [ A of the uncertain point located inside the uncertain polygon. In Section 2, uncertainties of points and polygons are expressed mathematically in terms of probability density functions. Our proposed uncertainty model for de®ning the probability index is given in Sections 3 and 4 while an example in Section 5 can
A PROBABILITY-BASED UNCERTAINTY MODEL
73
demonstrate the application of the uncertainty model to sample data. A conclusion is given in Section 6. In the following, we discuss the degree of an uncertain point P
x; y located inside an uncertain simple polygon A of NP vertices. Let
x1 ; y1 ;
x2 ; y2 ; . . . ;
xNP ; yNP be A's vertices ordered clockwise or counterclockwise. 2.
Mathematical expression of uncertainty of points and polygons
Uncertainties of points and polygons are derived mathematically in this paper to cope with the point-in-polygon analysis under uncertainty existing with both the point and the polygon. We de®ne the uncertainty of a point as a base, and further present the uncertainty of the vertices of a polygon. 2.1.
Presentation of the uncertainty of a point
Positional uncertainties of a point is normally unavoidable in GIS. The uncertainties in GIS are investigated in many researchers ([1], [2], [4], [14]±[16]). For example, measurement uncertainties are introduced in a process of data capture. The measurement location of the point is usually captured by ground survey, photogrammetric or remote sensing survey, map digitizing or scanning, or a combination of these. Shi [14] divides the object capture methods into direct and indirect methods (®gure 1). Each operation in an object capture process, however, introduces measurement or process uncertainty, or both.
Figure 1.
Various data capture methods in GIS ( from Shi [14], p. 29).
74
CHEUNG, SHI AND ZHOU
A point in GIS is obtained, for example, by digitizing a map produced using the photogrammetric method. Its positional uncertainties are then caused by control, photography, aero-triangulation, orientation, compilation, drafting, printing and digitization [14]. Therefore, the distribution of the uncertainty is greatly dependent upon how the location of a point is input into a GIS. Digitizing is a widely used method to obtain the location information of a point. Many researchers have studied the distribution for digitization uncertainty. Walsby [19] stated that the digitization uncertainty was nearly normally distributed. Meng et al. [12], [13] and Tong et al. [18] showed that the digitization uncertainty approached a new random uncertainty distributionÐNL distribution in some conditions and to the least P-norm distribution respectively. Normal distribution is, however, still widely accepted for many cases, probably due to its capacity of describing the auto-correlation and cross-correlation problems of the vertices of spatial features (including points, lines and polygons). We will examine the normality assumption based on our sample data. Let fP
x; y denote the probability density function (pdf ) of P. This is the probability that any point
x; y is the true location of P. Let X
x; y, mX
mx ; my and SX
s2x
sxy
syx
s2y
!
where mx and my are the expected values of x and y respectively and SX is the covariance matrix of X. If the positional uncertainty of P follows a bivariate normal distribution, fP
x; y will be fP
x; y exp
X
mX SX 1
X mX T =2 p ;
2p jSX j
1
where Z T is the transpose of any vector Z.
2.2.
Presentation of the uncertainty of a polygon
The uncertainty distribution of a polygon is dif®cult to de®ne, as a rigorous mathematical description should be derived from the uncertainty distributions of all the points located inside the polygon and there are an in®nite number of these points in the polygon. The pdf of the polygon is actually more complicated than the pdf of a point. Therefore, we model the point-in-polygon analysis under uncertainty existing with both the point and the polygon according to the pdf of the point and the pdf of the vertices of the polygon. The detail is given in Sections 3 and 4. Let hA
x1 ; y1 ; x2 ; y2 ; . . . ; xNP ; yNP denote the pdf of all the vertices of A. If the positional uncertainty of these vertices follows a multivariate normal distribution, hA
x1 ; y1 ; x2 ; y2 ; . . . ; xNP ; yNP will become
75
A PROBABILITY-BASED UNCERTAINTY MODEL
hA
x1 ; y1 ; x2 ; y2 ; . . . ; xNP ; yNP exp
T
mX SX 1
X mX p NP
2p jSX j
1 2
X
;
2
where X
x1 ; y1 ; x2 ; y2 ; . . . ; xNP ; yNP ; mX
mx1 ; my1 ; mx2 ; my2 ; . . . ; mxNP ; myNP and 1 0 2 s x1 y1 sx1 xNP sx1 yNP sx1 Bs s2y1 sy1 xNP sy1 yNP C C B y1 x1 C B . C: B . SX B C . C B @ sxNP x1 sxNP y1 s2xNP sxNP yNP A syNP x1 syNP y1 syNP xNP s2yNP This expression can describe the cases where the vertices of the polygon are either uncorrelated or correlated. The pdf of the vertices of A is signi®cant for estimating the probability of P located inside A in this study. In the following section, Pr
P [ A will be expressed as a function of the vertices of a given location of A. An expected value of this probability index will then be derived with the pdf of the vertices of A in order to consider uncertainty of the vertices of the polygon in the point-in-polygon analysis. 3.
A mathematical model of point-in-polygon analysis with uncertainty
If polygon A is without uncertainty (or precise), the probability Pr
P [ A that P is located inside A is given by ZZ Pr
P [ A
fP
x; ydx dy;
3
x;y [ A
where fP
x; y is the pdf of P. Figure 2 shows a bell-shaped ®gure around the expected location of point P that is the pdf of P when fP
x; y is bivariate normally distributed. Also, polygon A is given in the ®gure. The probability index Pr
P [ A is equal to a volume of the bell-shaped ®gure bounded by polygon A (see the shaded region). The integral domain in equation 3 is determined by polygon A or its ordered vertices
x1 ; y1 ;
x2 ; y2 ; . . . ;
xNP ; yNP . Then, Pr
P [ A is a function of x1 ; y1 ; x2 ; y2 ; . . . ; xNP ; yNP and denoted as g
x1 ; y1 ; . . . ; xNP ; yNP . When A is uncertain, the integral in equation 3 becomes the conditional probability Pr
P [ AjA that P lies in A, given the vertices of A. In order to ®nd an unconditional probability Pr
P [ A, we integrate Pr
P [ AjA with respect to its variables
x1 ; y1 ; x2 ; y2 ; . . . ; xNP ; yNP as Pr
P [ AjA is expressed as g
x1 ; y1 ; . . . ; xNP ; yNP . This unconditional probability is the probability of uncertain point P located inside uncertain polygon A. From probability theory, the unconditional probability Pr
P [ A can be obtained from
76
CHEUNG, SHI AND ZHOU
Figure 2.
A relation between the pdf of point P and polygon A.
Pr
P [ A E
Pr
P [ AjA Z Z hA
x1 ; y1 ; . . . ; xNP ; yNP 6Pr
P [ AjA dx1 dy1 ; . . . ; dxNP dyNP 0 1 Z Z ZZ B C hA
x1 ; y1 ; . . . ; xNP ; yNP 6@ fP
x; ydx dyAdx1 dy1 ; . . . ; dxNP dyNP
x;y [ A
Z Z hA
x1 ; y1 ; . . . ; xNP ; yNP 6g
x1 ; y1 ; . . . ; xNP ; yNP dx1 dy1 ; . . . ; dxNP dyNP ;
4 where hA
x1 ; y1 ; . . . ; xNP ; yNP is the pdf of all the vertices of A. This multiple integral involves many variables of integration, such as x1 ; y1 ; . . . ; xNP ; yNP . A numerical integration technique is a way to provide an approximation. However, as the number of vertices of the polygon increases, much computation time is required. We will modify equation 4 to lower computation time to assess the uncertainty in the point-inpolygon analysis by reducing the number of variables of integration in equation 4. 4.
Modi®ed point-in-polygon model under uncertainty
When the positional uncertainty of P and the positional uncertainty of A are small, the integral domain in equation 3 can be modi®ed by considering those edges of polygon A that are close to P. The former sentence is a requisite condition in our proposed uncertainty model: small positional uncertainties of P and of A can induce stability of the intersection between uncertain A and the error ellipse of P. That is, if the expected location of an A's edge intersects the error ellipse of P, a measured location of the A's edge will always intersect the error ellipse of P. Therefore, we classify the intersection between uncertain A and the error ellipse of P in different cases based on the expected edges of A and the error ellipse of P below.
A PROBABILITY-BASED UNCERTAINTY MODEL
77
Figure 3. (a) A conventional integral domain for Pr
P [ AjA; (b) A modi®ed integral domain for Pr
P [ AjA.
In ®gure 3(a), there are (i) a quadrangle of edges 1, 2, 3 and 4, (ii) a point surrounded by its error ellipse (in dashed line), and (iii) the error ellipse on or inside which the probability of the point is located is 0.999 (i.e., Pr
P [ error ellipse 0:999, for example). The integral domain for estimating the value of Pr
P [ AjA is shaded in light gray in ®gure 3(a). In order to simplify this integral domain, we will search the expected edges of A that intersect the error ellipse of P. Figure 3(b) shows that edge 1 intersects the error ellipse; hence the integral domain can be modi®ed as the shaded region in dark gray, as shown in (b). Although an area of the light gray shaded region in ®gure 3(a) is obviously different from an area of the dark gray shaded region in (b), the Pr
P [ AjA obtained from these two regions as the integral domain should be nearly identical because the probability of P located outside the error ellipse is very small (0.001). It is also relatively simple to determine the probability based upon the edges of A intersecting the error ellipse of P instead of all the edges of A.
78
CHEUNG, SHI AND ZHOU
4.1.
Selection of edges close to the point
An error ellipse with a probability value of 0.999 is used to determine which edges of A are close to P in this study. Each edge of A is a straight-line segment of vertices
xi ; yi and
xi 1 ; yi 1 where i 1; 2; . . . ; NP and
xNP 1 ; yNP 1
x1 ; y1 . We should search all edges of A that intersect the error ellipse; however, this is time-consuming and complex because of the complexity of the mathematical expression for the error ellipse. Therefore, we compute whether the edges of A and a rectangle bounding the error ellipse of P intersectÐan approximated area of the error ellipse. This is a method also used in spatial index and can quickly provide the potential edges of A intersecting with the error ellipse of P. It is supposed that the four vertices of the rectangle are
Rx1 ; Ry1 ;
Rx2 ; Ry2 ;
Rx3 ; Ry3 and
Rx4 ; Ry4 . For the A's edge of vertices
xi ; yi and
xi 1 ; yi 1 where i 1; 2; . . . ; NP, we will decide whether there is any point
ax; ay on the edge satisfying the following rules: 8 > <
j [ f1;2;3;4g
min
Rxj ax
> :
min
Ryj ay
j [ f1;2;3;4g
max
Rxj
max
Ryj ;
j [ f1;2;3;4g j [ f1;2;3;4g
5
where (
ax rxi 1
1 ay ryi 1
1
rxi for i 1; 2; . . . ; NP and 0 r 1: ryi
6
and minj [ f1;2;3;4g Ryj ; In equation 5, minj [ f1;2;3;4g Rxj ; maxj [ f1;2;3;4g Rxj maxj [ f1;2;3;4g Ryj are distribution ranges of P in the x and the y directions. A union of these two ranges is a rectangle on or inside which the probability of P located is larger than the con®dence level of the error ellipse enclosed by it. The rectangle, for example, is shaded in ®gure 4. If
ax; ay satis®es the condition in equation 5, the rectangle will intersect or enclose the edge of
xi ; yi and
xi 1 ; yi 1 , and the error ellipse of P may interest or enclose the edge. If no point on the edge is located inside the frame, this edge will not intersect the error ellipse. After the intersection test is applied for all the edges of A, the number of edges crossing the error ellipse will be counted.
4.2.
Diversity of the intersection
The above section has de®ned a method to ®nd the edges of A that intersect the error ellipse of P. There may be different intersection(s) between the edges of A and the error ellipse of P. According to the number of the intersection points, we estimate Pr
P [ A in four different cases: (a) only one edge of the polygon intersecting the error ellipse, (b) two
A PROBABILITY-BASED UNCERTAINTY MODEL
Figure 4.
79
Intersections between the uncertain polygon and the error ellipse of the uncertain point.
edges of the polygon intersecting the error ellipse, (c) three or more edges of the polygon intersecting the error ellipse, and (d) no edges of the polygon intersecting the error ellipse. Figure 5 shows a ¯owchart for determining which of the four cases a point-in-polygon under uncertainty existing with both the point and the polygon should be automatically grouped into. We simplify the integral for Pr
P [ A according to the number of the intersection, and the modi®ed expression is derived as follows. 4.2.1. Only one edge of the polygon intersecting the error ellipse. In ®gure 3, one edge of A crosses the error ellipse of P. It is supposed that this edge is Edge
j which joins
xj ; yj and
xj 1 ; yj 1 where j 1; 2; . . . ; NP. The line passing via Edge
j is y mx c, where m
yj 1 yj =
xj 1 xj and c yj mxj when xj =xj 1 . When xj xj 1 , the line is x xj . Referring to ®gure 6, Edge
j splits the error ellipse into two parts. The shaded region in (a), which is partially located inside A, satis®es the condition that y < mx c, while the shaded region in (b) satis®es y > mx c. There are thus two possibilities for estimating Pr
P [ AjA: ZZ Pr
P [ AjA
fP
x; ydx dy;
7
fP
x; ydx dy:
8
y < mx c
ZZ
and
Pr
P [ AjA y > mx c
One of equations 7 and 8 represents Pr
P [ AjA, whereas the other represents Pr
P6 [ AjA, that is the probability of P located outside a particular A. Since Pr
P [ AjA 1 Pr
P6 [ AjA, either equation 7 or 8 can determine the value of Pr
P [ AjA based on divisions between the error ellipse and the polygon. Due to the symmetry of normal distribution, if the overlap area is greater than the area of the error
80
CHEUNG, SHI AND ZHOU
Figure 5. Determining in which of the four cases for Pr
P [ A a particular point-in-polygon should be classi®ed.
ellipse minus the area of the overlap, Pr
point [ polygon j polygon will be larger than Pr
point6 [ polygon j polygon, and vice versa. The integral domains in equations 7 and 8 are now studied and there are different possibilities for providing the range for x and y (see ®gure 7). To derive a generic solution, we will translate
x; y in equation 7 to
x mx ; y my and rotate the x- and y-axes so that the edge of A (which intersects the error ellipse of P) is
Figure 6.
Discussion on the relation between Edge
j of A and the error ellipse of P.
A PROBABILITY-BASED UNCERTAINTY MODEL
Figure 7.
81
Different cases when only one edge of a polygon intersects the error ellipse of a point.
parallel to the y-axis. Let
x0 ; y0 be the coordinate of
x; y after the translation and
x00 ; y00 be the coordinate of
x0 ; y0 after the rotation. We then have
x0 ; y0
x 00
00
mx ; y 0
0
x ; y
x ; y
my ;
9
sin a
cos a
cos a
sin a
;
10
where a tan 1
m. Putting both equations into equation 7, Pr
P [ AjA will become Z?
Zc00
Pr
P [ AjA y00
? x00
fP
x00 ; y00 jJj dx00 dy00 ;
11
?
where J is the Jacobian matrix and its determinant jJj is equal to 1 here, c00 is given as c00
xj mx sin a
yj my cos a, and fP
x00 ; y00 is the pdf of P after the translation and rotation. Based upon the rule used to determine whether Pr
point [ polygon j polygon is larger than Pr
point6 [ polygon j polygon; Pr
P [ AjA can be found. Since Pr
P [ AjA is a function of
x00j ; y00j and
x00j 1 ; y00j 1 , Pr
P [ A can be derived from ZZ ZZ Pr
P [ A R4
h
x00j ; y00j ; x00j 1 ; y00j 1 6Pr
P [ AjAdx00j dy00j dx00j 1 dy00j 1 ;
12
82
CHEUNG, SHI AND ZHOU
Figure 8.
An error ellipse of P of different expected locations intersecting two edges of A.
where h
x00j ; y00j ; x00j 1 ; y00j 1 is an error distribution of vertices
x00j ; y00j and
x00j 1 ; y00j 1 of A for j 1; 2; . . . ; NP after the translation and the rotation. 4.2.2. Two edges of the polygon intersecting the error ellipse. When there are two edges of A intersecting the error ellipse of P, these two edges should be considered to de®ne the integral domain. There are two cases in ®gure 8: (a) the two edges have a common vertex and (b) the two edges are not connected. Case A. It is supposed that Edge
j and Edge
j 1 have a common vertex. As in the previous section, the coordinate system is translated and rotated so that the origin of the new coordinate system becomes
mx ; my and Edge
j is parallel to the y-axis of the new coordinate system. All combinations of Edge
j and Edge
j 1 are shown in ®gure 9. The shaded region is the corresponding integral domain, which is formed by Edge
j (represented by a thick solid line) and Edge
j 1 (represented by a thick dashed line). We can group them into different cases and then express Pr
P [ AjA or Pr
P6 [ AjA in terms of a multiple integral. Let b be an interior angle between Edge
j and Edge
j 1, and the coordinate of
xi ; yi in the new coordinate system be
x00i ; y00i . Then, lines passing via Edge
j and Edge
j 1 are expressed as x00 c1 and y00 m2 x00 c2 respectively. Case 1: b < 90 degrees and
x00j ; y00j is above
x00j 1 ; y00j 1 (as in ®gure 9(a)). Let Pr be Pr
P [ AjA or Pr
P6 [ AjA. Then, Pr is given as Z? Pr y00 y00
j1
y00
x00
Zc2 =m2 c1
fP
x00 ; y00 dx00 dy00 ;
13
where ? may be replaced with a ®nite value for numerical computation and this ®nite
A PROBABILITY-BASED UNCERTAINTY MODEL
83
Figure 9. Possible integral domains for estimating Pr
P [ AjA when two edges of A intersect the error ellipse of P.
value must be greater than the maximum magnitude of the semi-axes of the error ellipse of P Case 2: b < 90 degrees and
x00j ; y00j is above
x00j 1 ; y00j 1 (as ®gure 9(i)). Pr is given as Z? Pr x00 c1
Z? y00 m2 x00 c2
00 00 00 00 fP
x ; y dy dx :
14
84
CHEUNG, SHI AND ZHOU
Case 3: b < 90 degrees and
x00j ; y00j is below
x00j 1 ; y00j 1 (as in ®gure 9(c)). Pr is given as Zy00j 1 Pr y00 ?
Zc1 x00 y00
c2 =m2
00 00 00 00 fP
x ; y dy dx :
15
Case 4: b > 90 degrees,
x00j ; y00j is below
x00j 1 ; y00j 1 (as in ®gure 9(k)). Pr is given as Zc1 Pr x00 ?
m2Z x00 c2
y00
?
00 00 00 00 fP
x ; y dy dx :
16
Case 5: b 90 degrees,
x00j ; y00j is above
x00j 1 ; y00j 1 and
x00j 2 ; y00j 2 is on the righthand side of
x00j 1 ; y00j 1 (as in ®gure 9(e)). Pr is given as Z? Pr y00 y00
j1
Z? x00
x00j 1
00 00 00 00 fP
x ; y dy dx :
17
Case 6: b 90 degrees,
x00j ; y00j is above
x00j 1 ; y00j 1 and
x00j 2 ; y00j 2 is on the left-hand side of
x00j 1 ; y00j 1 (as in ®gure 9(u)). Pr is given as Z? Pr y00 y00
j1
00
Zxj 1 x00
?
fP
x00 ; y00 dy00 dx00 :
18
Case 7: b 90 degrees,
x00j ; y00j is below
x00j 1 ; y00j 1 and
x00j 2 ; y00j 2 is on the left-hand side of
x00j 1 ; y00j 1 (as in ®gure 9(g)). Pr is given as Zy00j 1 Pr y00 ?
00
Zxj 1 x00
?
00 00 00 00 fP
x ; y dy dx :
19
Case 8: b 90 degrees,
x00j ; y00j is below
x00j 1 ; y00j 1 and
x00j 2 ; y00j 2 is on the righthand side of
x00j 1 ; y00j 1 (as in ®gure 9(w)).
A PROBABILITY-BASED UNCERTAINTY MODEL
85
Pr is given as Zy00j 1 Pr y00 ?
Z? x00 x00j 1
00 00 00 00 fP
x ; y dy dx :
20
Based on the value of b, the locations of
x00j ; y00j ,
x00j 1 ; y00j 1 and
x00j 2 ; y00j 2 , the feasible formula for the probability can be identi®ed. After choosing one of the above formulas from equations 13 to 20, we should determine whether Pr is a measure of Pr
P [ AjA or Pr
P6 [ AjA; details will be given at the end of this section. Case B. We now consider an instance where the two edges of A are disjoined. Let these two edges be Edge
j and Edge
k where j j kj > 1. The coordinate system is ®rst translated and rotated until the origin of the new coordinate system becomes
mx ; my and Edge
j is parallel to the y-axis of the new coordinate system. When an extension of Edge
j intersects an extension of Edge
k,
x00j 1 ; y00j 1 should be replaced with
int_x; int_y in the equation given in the approximately eight cases where
int_x; int_y represents the coordinate of an intersection point of Edge
j and Edge
k that is given by (
int x c1 int y m2 c1 c2 :
21
When Edge
j is parallel to Edge
k, Pr becomes Z? Pr y00 ?
Zc2 x00 c1
fP
x00 ; y00 dy00 dx00 :
22
In the case where only one edge of A intersects P, the error ellipse of P and polygon A overlap each other and their overlap is used to consider whether Pr
P [ AjA is greater (or smaller) than Pr
P6 [ AjA. This rule will also be used in the event that there are two edges of A intersecting P, to identify Pr representing Pr
P [ AjA or 1 Pr
P [ AjA. Since Pr
P [ AjA is a function of
x00j ; y00j ,
x00j 1 ; y00j 1 and
x00j 2 ; y00j 2 , Pr
P [ A can be derived from Z Pr
P [ A
Z
h
x00j ; y00j ; x00j 1 ; y00j 1 ; x00j 2 ; y00j 2
R6
6Pr
P [ AjAdx00j dy00j dx00j 1 dy00j 1 dx00j 2 dy00j 2 ;
23
86
CHEUNG, SHI AND ZHOU
Figure 10. Three edges of A intersect the error ellipse of P.
where h
x00j ; y00j ; x00j 1 ; y00j 1 ; x00j 2 ; y00j 2 is an error distribution of vertices
x00j ; y00j ,
x00j 1 ; y00j 1 and
x00j 2 ; y00j 2 of A for j 1; 2; . . . ; NP. This section provides different mathematical expressions (from equations 13 to 23) for estimating the probability index when two edges of the polygon intersect the error ellipse of the point. We can determine which of the mathematical expressions for Pr
P [ AjA should be implemented based upon the relation between the expected locations of the two edges, which intersect the error ellipse of P. Putting the feasible expression for Pr
P [ AjA into equation 23 can be used to estimate Pr
P [ A.
4.2.3. Three or more edges of the polygon intersecting the error ellipse. We will ®rst consider the simple instance that three consecutive edges of A (e.g., Edge
j, Edge
j 1, Edge
j 2) intersect the error ellipse of P (see ®gure 10). In order to calculate Pr, the integral domain will be partitioned by a line which passes through the common point of Edge
j 1 and Edge
j 2 and is parallel to Edge
j, given as the dashed line in ®gure 11, where two corresponding partite regions (call B1 and B2) are also shown. The integral domain is then either B1 union B2 or B1 minus B2. An integral domain in Pr is determined by Edge
j, Edge
j 1 and the dashed line shown in ®gure 11. Since the line passing via Edge
j is x00 c1 after the translation and the rotation of the coordinate system, the dashed line is given as x00 x00j 2 :
24
Figure 12 shows different combinations of Edge
j, Edge
j 1 and the line passing through
xj 2; yj 2 and parallel to Edge
j after the translation and the rotation of the original coordination system. Table 1 shows a multiple integral for these different combinations. Based on Pr and the intersection between A and the error ellipse of P, Pr
P [ B1 jB1 can be computed. Moreover, Pr
P [ B2 jB2 can be de®ned similarly as the case where two edges of A intersect the error ellipse of P, given that Edge
j and Edge
j 1 become a new edge Edge
j0 formed by translating Edge
j until
xj 1; yj 1 becomes
xj 2; yj 2 and the edge Edge
j 2 respectively. Then, Pr
P [ B1 and Pr
P [ B2 are
A PROBABILITY-BASED UNCERTAINTY MODEL
87
Figure 11. Examples of a case where three edges of a polygon intersect the error ellipse of a point.
ZZ Pr
P [ B1
ZZ
h
x00j ; y00j ; . . . ; x00j 2 ; y00j 2
6Pr
P [ B1 jB1 dx00j dy00j ; . . . dx00j 2 ; dy00j 2 ; ZZ Pr
P [ B2
ZZ
25
h
~ x00j ; y~00j ; x00j 2 ; y00j 2 ; x00j 3 ; y00j 3
6Pr
P [ B2 j B2 d~ x00j d~ y00j ; . . . ; dx00j 3 ; dy00j 3 ;
26
Figure 12. Different combinations of Edge
j, Edge
j 1 and the line passing through
xj 2; yj 2 and parallel to Edge
j.
88
CHEUNG, SHI AND ZHOU
Table 1. Different cases when three consecutive edges of a polygon intersect the error ellipse of a point. Examples in Figure 12
Case 1. b < 90 degrees and
x00j ; y00j is above
x00j1 ; y00j1
(a) and (f )
Pr R y00 R R x00 ? j2 R y00 c2 =m2 00 00 00 00 00 00 00 00 fP
x ; y dx dy y00 y00 x00j2 c1 fP
x ; y dx dy y00 y00 x00 c1
2. b > 90 degrees and
x00j ; y00j is above
x00j1 ; y00j1
(c) and (d)
R y00 R R x00 ? j1 R x00j2 00 00 00 00 y00 y00 x00j2 c1 fP
x ; y dx dy y00 y00 x00 y00
00 00 00 00 c2 =m2 fP
x ; y dx dy
3. b < 90 degrees and
x00j ; y00j is below
x00j1 ; y00j1
(g) and (l)
R 00 y j2 R x00j2 00 00 00 00 R y00j1 R c1 y00 ? x00 c1 fP
x ; y dx dy y00 y00 x00 y00
00 00 00 00 c2 =m2 fP
x ; y dx dy
4. b > 90 degrees and
x00j ; y00j is below
x00j1 ; y00j1
(i) and ( j)
R y00 j1 y00
5. b 90 degrees and
x00j ; y00j is above
x00j1 ; y00j1
(b) and (e)
R R x00 ? 00 00 00 00 y00 y00 x00j2 x00 fP
x ; y dx dy
6. b 90 degrees and
x00j ; y00j is below
x00j1 ; y00j1
(h) and (k)
R y00 j1 y00
j1
j2
j2
j1
j2
?
00 00 00 00 x00 c1 fP
x ; y dx dy
R y00 R x00j2 y00j2 y00 x00 y00 j1
00 00 00 00 c2 =m2 fP
x ; y dx dy
j1
j1
?
R x00j2
R x00j2
00 00 00 00 x00 x00j1 fP
x ; y dx dy
where
x00j ; y00j and
x00j 1 ; y00j 1 are the vertices of Edge
j,
x00j 1 ; y00j 1 and
x00j 2 ; y00j 2 are the vertices of Edge
j 1, and
x00j 2 ; y00j 2 and
x00j 3 ; y00j 3 are vertices of Edge
j 2, and
~ x00j ; y~00j and
x00j 2 ; y00j 2 are vertices of Edge
j0 in the new coordination system. The ®nal step for Pr
P [ A is to compute the union of Pr
P [ B1 and Pr
P [ B2 . Let g be an angle between Edge
j0 and Edge
j 2. In ®gure 13, it is noticed that when b plus g is greater than 180 degrees, Pr
P [ A Pr
P [ B1 Pr
P [ B2 . Conversely, when b plus g is less than 180 degrees, Pr
P [ A jPr
P [ B1 Pr
P [ B2 j. The above instance ignores the fact that Edge
j, Edge
j 1 and Edge
j 2 are joined together to form a triangle, i.e.,
xj 3; yj 3 is equal to
xj; yj (see ®gure 14). It is supposed that the coordinate system has been translated and rotated as stated in the previous cases. Edge
j will be parallel to the y-axis. Let lines passing via Edge
j 1 and Edge
j 2 in the new coordinate system be y00 mj 1 x00 cj 1 and y00 mj 2 x00 cj 2 respectively. Pr is then given as 00 Zxj 2 Pr
P [ A x00 x00 j
mj 1 x00 cj 1
Z
y00 mj 2 x00 cj 2
00 00 00 00 fP*
x ; y dy dx :
27
A PROBABILITY-BASED UNCERTAINTY MODEL
89
Figure 13. A relation between two partite regions of the integral domain when three consecutive edges of a polygon intersect the error ellipse of a point.
When three non-consecutive edges or more than three edges of A intersect the error ellipse of P, we will partition the integral domain into several mutually exclusive regions (e.g., B1, B2 and B3 in ®gure 15) which is bounded by two edges or three consecutive edges. For each partition, the corresponding probability that P is located inside it will be computed. The sum of the probabilities for all partitions is equal to Pr
P [ A. 4.2.4. No edges of a polygon intersecting the error ellipse of a point. When there is no edge of A intersecting the error ellipse of P, two extreme cases occur: (a) P is located inside Awith a large probability value, and (b) P is located inside Awith a small probability value. In other words, the error ellipse is located either completely outside A or inside A (see ®gure 16). In addition to the two extreme cases, A may be located inside the error ellipse of P entirely. However, this does not occur in GIS because the error ellipse of an arbitrary point feature should not be greater than any point feature in GIS. We thus consider the two extreme cases here by verifying the error ellipse is completely located inside A by testing whether P is located inside A.
Figure 14. A triangle formed by Edge
j, Edge
j 1 and Edge
j 2.
90
CHEUNG, SHI AND ZHOU
Figure 15. A partition of the integral domain for estimating the probability value when three non-consecutive edges or more than three edges of A intersect the error ellipse of P.
When the error ellipse of P occurs outside the polygon, we will ®nd that edge of A which is the nearest to P and then calculate Pr
P [ AjA in the following manner, which is quite similar to the case where only one edge of A intersects the error ellipse of P. It is supposed that Edge
j of A is the nearest to P. We will translate and rotate the coordinate system as above, hence Edge
j is parallel to the y-axis in the new coordinate system. Pr can be calculated by
Figure 16. The error ellipse of a point completely located outside or inside a polygon.
A PROBABILITY-BASED UNCERTAINTY MODEL
Z?
Z? Pr y00
?
x00
fP*
x00 ; y00 dy00 dx00 :
91
28
x00j
Since the error ellipse of P occurs completely outside A, Pr
P [ AjA must be smaller than Pr
P6 [ AjA. Based on this fact and Pr
P [ AjA Pr
P6 [ AjA 1, we can determine which Pr
P [ AjA or Pr
P6 [ AjA is equal to Pr, and Pr
P [ AjA is then obtained. That is, when Pr > 0:5, Pr
P [ AjA 1 Pr. Otherwise, we will have Pr
P [ AjA Pr. Integrating Pr
P [ AjA with respect to the related vertices of A can derive Pr
P [ A. Where the error ellipse of P is located inside the polygon, we will calculate Pr
P [ A by considering all edges of A one by one. In other words, we will compute Pr as given in the case where the error ellipse of P occurs outside the polygon with each edge of A in each time. It is also true that Pr
P [ A for each edge of A must be larger than Pr
P 6 [ A. Thus, the minimum value of Pr
P [ A for each edge of A is equal to Pr
P [ A for the whole A. This Pr
P [ A should be greater than Pr
P [ error ellipse. To simplify the calculation for Pr
P [ A in this case, Pr
P [ A can be assigned to 1. Next, an example is given to demonstrate how to estimate Pr
P [ A with random polygon A. Suppose that uncertain P is assumed to be normally distributed with mean
0; 5 and variance matrix
0:02 0
0 ; 0:02
uncertain polygon A is of four vertices
x1; y1,
x2; y2,
x3; y3, and
x4; y4, which expected locations are
0; 0,
0; 10,
10; 10, and
10; 0 respectively. Then, uncertain P should be close to an edge of A of endpoints
x1; y1 and
x4; y4. For the sake of convenience, the error distributions of
x1; y1 and
x4; y4 are supposed to be discrete and independent. After the translation and rotation as mentioned above, the following assumptions are made in this example: 8 < 0:9 h1
x001 ; y001 0:1 : 0 8 < 0:9 h4
x004 ; y004 0:1 : 0
if x001 0 and y001 0 if x001 0:1 and y001 0:1; and otherwise if if
x004 x004
29 y004
10 and 0 10:1 and y004 0:1
otherwise.
Therefore, hA
x001 ; y001 ; x004 ; y004 h1
x001 ; y001 6h4
x004 ; y004 in equation 12.
92
CHEUNG, SHI AND ZHOU
Probability Pr
P [ A in equation 12 becomes ZZZZ Pr
P [ A
0
h
x001 ; y001 6h
x004 ; y004 Z?
B 6@
y00
?
x1 sin a
Z
y1
x00
5 cos a
1 C fP
x00 ; y00 dx00 dy00Adx001 dy001 dx004 dy004 :
30
?
Since h
x001 ; y001 and h
x004 ; y004 are discrete, the above formula can be then simpli®ed as follows: 0 B Pr
P [ A h1
0; 06h4
10; 06@
y00
0
B h1
0; 06h4
10:1; 0:16@
? x00
h1
0:1; 0:16h4
10; 0 0 Z? 0:1 sin
0:0101Z 4:9 cos
B 6@ ?
x00
?
0
B h1
0:1; 0:16h4
10:1; 0:16@
C 00 fP
x00 ; y00 dx00 dyA ? 5 cos
0:0099 Z
Z? y00
y00
1
Z5
Z?
?
x00
0:0101
Z? y00
1 C fP
x00 ; y00 A
?
1 C fP
x00 ; y00 dx00 dy00A Z4:9
? x00
1 C 00 : fP
x00 ; y00 dx00 dyA
31
?
Therefore, an approximation to Pr
P [ A can be computed based on equation 31. This value of Pr
P [ A is the expectation of Pr
P [ AjA in statistics. In this section, we have derived mathematical expressions for the probability of point P located inside polygon A under uncertainty existing with both P and A, Pr
P [ A. The mathematical expressions are all expressed in terms of multiple integrals based upon probability and statistical theories. According to the number of intersection points between the expected edges of A and the error ellipse of P, an appropriate mathematical expression for Pr
P [ A can be determined under the requisite condition that the positional uncertainties of P and of A are relatively small. However, the appropriate mathematical expression for Pr
P [ A cannot be solved analytically. The numerical solution should be estimated to obtain an approximation. For example, Gaussian quadrature numerical integration technique can provide the numerical solution.
93
A PROBABILITY-BASED UNCERTAINTY MODEL
5.
Examples
A set of sample data were captured and used to examine the performance of the proposed uncertainty model of the point-in-polygon analysis and furthermore to study the probability distribution of an uncertain point located inside an uncertain polygon. Buildings on a 1 : 1,000 scale digital topographical map of Hong Kong are digitized 30 times, where the choice of sample size is determined by a two-stage procedure given in Desu and Raghavara [5]. Based on a set of the digitized points, we can thus examine the distribution of digitization errors of each vertex of the buildings. In our sample data, it is found that there is no evidence to reject the hypothesis that the digitization errors follow a multivariate normal distribution within signi®cant level equal to 0.05. The mean locations of buildings A, B and C are shown in ®gure 17 and their coordinates on the ground are given as: i
Amxi ; Amyi
Bmxi ; Bmyi
1 2 3 4
836,308.741; 817,721.918
836,355.263; 817,763.162
836,378.948; 817,736.396
836,332.665; 817,695.263
836,340.429;
836,370.471;
836,413.461;
836,383.500;
Cmxi ; Cmyi 817,788.577 817,814.755 817,766.646 817,740.216
836,252.940; 817,767.113)
836,342.167; 817,846.765)
836,366.943; 817,818.934)
836,277.877; 817,740.019)
Unit: m on ground
We then estimate probabilities of sample points located inside building A. These sample points are some points on a line passing through
Bx4; By4 and
Cx4; Cy4 (or
x; y r6
Bx4 ; By4
1 r6
Cx4 ; Cy4 where r [ 0; 1). When r 0, for example, the expected location of P is considered as
836,277.877 m; 817,740.019 m. Since the error ellipse of
836,277.877 m; 817,740.019 m does not intersect the expected location of A, we estimate Pr
P [ A based on equation 28 and
Figure 17. The expected location of buildings A, B and C.
94
CHEUNG, SHI AND ZHOU
obtain Pr
P [ A 0 under a condition that the error ellipse of P is located outside the expected location of A. When r 0:485, the expected location of P is
836,328.999 m; 817,740.114 m. The error ellipse of P in this case intersects the line segment joining
Amx2 ; Amy2 and
Amx3 ; Amy3 . Based on equation 12, Pr
P [ A approximates to 7:384610 3 . When r 0:500, the error ellipse of P is located inside the expected location of A; then we estimate Pr
P [ A based on equation 28 and obtain Pr
P [ A 1 under a condition that the error ellipse of P is located inside the expected location of A. Similarly, we can estimate Pr
P [ A for r [ 0; 1. Figure 18 shows Pr
P [ A varying from r 0 to r 1 in (a), varying from r 0:484 to 0.490 in (b), and varying from r 0:923 to 0.929 in (c). In fact, ®gures 18(b) and (c) show parts of ®gure 18(a) to study a change of Pr
P [ A when P is approaching to A. It is found that when the range of r is from 0 to 0.484 or from 0.929 to 1, the probability index is 0 and hence those points on the line satisfying this condition will be completely outside the uncertain building A. When r increases from 0.484 to 0.490, the probability index also increases from 0 to 1 and the uncertain point is approaching to the uncertain polygon. When the range of r is from 0.490 to 0.923, the probability index is 1 and those points on the line satisfying this condition will be completely inside the uncertain building A. When r increases from 0.923 to 0.929, the probability index decreases from 1 to 0 and the closeness relation becomes weak. We can then conclude that as an uncertain point is approaching to the boundary of an uncertain polygon, its probability index steadily increases to 1. Conversely, as an uncertain point goes away from the boundary, the probability index steadily decreases to 0. This conclusion is different from a sudden rise of the probability from 0 to 1 at r 0:925 (see ®gure 18(c)) or a sudden drop from 1 to 0 at r 0:4865 (see ®gure 18(b)) in a case where the particular point goes away or approaches to the particular point. Therefore, the proposed uncertainty model provides a feasible probability index to describe the relationship of closeness between an uncertain point and an uncertain polygon. 6.
Conclusions
In a point-in-polygon analysis, the point and the polygon are uncertain due to random errors introduced in the process of data capture. The existing research studies model the uncertainties of the point and polygon in the assumptions that the uncertainties of all points located inside the polygon follow a circular normal distribution identically and independently. In this paper, we propose a probability-based uncertainty model for a point-in-polygon analysis in a more generic case regarding the existing research problems, including (a) both the point and the polygon are uncertain, (b) error ellipse of the point, which is more rigorous than the error circle model, should be used to describe the uncertainty of a point, and (c) the uncertainties of the vertices of a polygon may be correlated and different from each other. In order to provide the probability of the uncertain point located inside the uncertain polygon, we have described the uncertainties of the point and of the vertices of the polygon in terms of probability density functions. The probability of an uncertain point located
A PROBABILITY-BASED UNCERTAINTY MODEL
95
Figure 18. Relation between points on the line passing via
Bx4; By4 and
Cx4; Cy4 and the probability values of these points located inside building A.
96
CHEUNG, SHI AND ZHOU
inside an uncertain polygon has then been derived in terms of multiple integrals based on probability and statistical theories. Since this expression involves many variables of integration, we have divided the point-in-polygon analysis into different cases depending on the intersection between the polygon and the error ellipse of the point. The mathematical expressions for the probability in the individual cases have also been addressed. In GIS, there is a diversity of spatial analyzes. Results derived from the individual spatial analyses are affected by the uncertainties of raw spatial data and the propagated error in the spatial analysis. It is, however, dif®cult to derive a general mathematical model to describe the in¯uence of the uncertainties of the raw spatial data, because the characteristics of each spatial analysis are different. In this study, the uncertainty in a point-in-polygon analysis is modelled. Our proposed model is a feasible solution for a point-in-polygon analysis with uncertainty in GIS. It can provide a probability index to show the degree of an uncertain point located inside an uncertain polygon. A further extension of this research will be to implement this model in a GIS and to visualize the effect of the uncertainties in a point-in-polygon spatially. Further research can also be extended from the random error of GIS features to the imprecision of the spatial features. Acknowledgment The work described in this paper was fully supported by the funds from the CRC scheme, Research Grant Council of The Hong Kong SAR (Project No. 3_ZB40) and The Hong Kong Polytechnic University (Project No.: G-W001 and PolyU 5071/01E).. References 1. P.A. Burrough. Principles of Geographical Information Systems for Land Resources Assessment. Clarendon Press: Oxford, 1986. 2. P.A. Burrough and R.A. McDonnell. Principles of Geographical Information Systems. Oxford University Press, 1998. 3. P.V. Bolstad, P. Gessler, and T.M. Lillesand. ``Positional uncertainty in manually digitised map object,'' International Journal of Geographical Information Systems, Vol. 4(4):399±412, 1990. 4. C.K. Cheung and W.Z. Shi. ``Measuring uncertainty of spatial features in a three-dimensional geographical information system based upon numerical analysis,'' Geographic Information Sciences, Vol. 7(2):124±130, 2001. 5. M.M. Desu and D. Raghavarao. Sample Size Methodology. Academic Press Inc., 1990. 6. R. Dunn, A.R. Harrison, and J.C. White. ``Positional accuracy and measurement error in digital databases of land use: An empirical study,'' International Journal of Geographical Information System, Vol. 4:385±398, 1990. 7. G.B.M. Heuvelink. Error Propagation in Environmental Modelling with GIS. Taylor and Francis Ltd., 1998. 8. B.J. Keefer, J.L. Smith, and T.G. Gregoire. ``Simulating manual digitising error with statistical models,'' in GIS/LIS'88, 475±483, 1988. 9. B.J. Keefer, J.L. Smith, and T.G. Gregoire. ``Modelling and evaluating the effects of stream mode digitising errors on map variables,'' Photogrammetric Engineering & Remote Sensing, Vol. 57(7):957±963, 1991. 10. G. Maf®ni, M. Arno, and W. Bityterlich. ``Observations and comments on the generation of error in digital GIS data,'' in Accuracy of Spatial Databases, edited by M.F. Goodchild and S. Gopal, 55±67, 1989.
A PROBABILITY-BASED UNCERTAINTY MODEL
97
11. Y. Leung and J.P. Yan. ``Point-in-polygon analysis under certainty and uncertainty,'' GeoInformatica, Vol. 1:93±114, 1997. 12. X.L. Meng, D.J. Liu, and Z.H. Zhu. ``NL distribution test of map digitisation errors,'' Journal of Tongji University, Vol. 24(5):525±529, 1996. 13. X.L. Meng, W.Z. Shi, and D.J. Liu. ``Statistical tests of the distribution of errors in manually digitised cartographic lines,'' Geographic Information Science, Vol. 4(1±2):52±58, 1998. 14. W.Z. Shi. Modelling Positional and Thematic Uncertainty in Integration of GIS and Remote Sensing. ITC Publication, Enschede, No. 22, 1994. 15. W.Z. Shi, ``A generic statistical approach for modeling errors of geometric features in GIS,'' International Journal of Geographical Information Science, Vol. 12(2):131±143, 1998. 16. W.Z. Shi and W.B. Liu, ``A stochastic process-based model for positional error of line segments in GIS, '' International Journal of Geographical Information Science, Vol. 14(1):51±66, 2000. 17. L.E. Stanfel, M. Conerly, and C. Stanfel. ``Reliability of polygonal boundary of land parcel,'' Journal of Surveying Engineering, Vol. 121(4):163±176, 1995. 18. X.H. Tong, W.Z. Shi, and D.J. Liu. ``Error distribution, error tests and processing of digitised object in GIS,'' in Proc.: Accuracy 2000, Amsterdam, July, 642±646, 2000. 19. J.C. Walsby. ``The causes and effects of manual digitising on error creation in object input to GIS,'' in Innovations in GIS 2, edited by P. Fisher, 113±122, 1995.
Chui Kwan Cheung is currently working at Advanced Research Centre for Spatial Information Technology, Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong. She received her Ph.D. and M.Phil. degrees in GIS, and B.Sc. degree in Applied Mathematics from the Hong Kong Polytechnic University as well. Her research interests include mathematical models, uncertainty in spatial data and spatial analyses, and GIS applications in vehicle navigation.
Wenzhong Shi is Director of Advanced Research Centre for Spatial Information Technology, and Associate Professor, Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University. He received his doctoral degree from University of OsnabruÈck in Vechta, Germany. Current research interests include GIS, remote sensing and virtual reality, spatial data quality, 3-D and dynamic data modeling in GIS, design and development of GIS, integration of GIS and remote sensing, and feature extraction from remotely sensed images.
98
CHEUNG, SHI AND ZHOU
Xian Zhou was born in Shanghai, China, in 1954. After working in a Chinese medicine shop for over six years from 1971, he entered Shanghai Jiao Tong University in 1977 for undergraduate study, initially majoring in nuclear power. Half year later, he was transferred to the Department of Applied Mathematics, which was newly established by the university in 1998, where he completed his B.Sc. degree in Applied Mathematics in 1982. In 1983, Dr. Zhou went to University of California at Santa Barbara (UCSB) to continue postgraduate study, where he obtained a Master's degree in Statistics in 1994 and a Ph.D. degree in Mathematics in 1988. After spending two years teaching in UCSB, Dr. Zhou moved to University of Western Australia as a Lecturer in 1990 and stayed there until 1996 before accepting a new job at the Hong Kong Polytechnic University (HK PolyU). He is currently an Associate Professor at the Department of Applied Mathematics in the HK PolyU. Dr. Zhou has a range of research interests in the areas of probability and statistics as well as operations research, including probability theory, statistical inference, limiting theory, survival analysis, and stochastic scheduling. He co-authored a book titled ``Survival Analysis with Long-term Survivors'', published by Wiley. He has numerous publications in various journals including Annals of Statistics, Journal of American Statistical Association, Journal of Multivariate Analysis, Biometrika, Biometrics, Statistica Sinica; Operations Research, Naval Research Logistics, Probability in Engineering and Informational Sciences, Annals of Operations Research, and so on.