Mathematical Geology, Vol. 15, No. 2, 1983
Conditional Analysis for Petroleum Resource Evaluations¹
P. J. Lee² and P. C. C. Wang³
Conditional analysis is an essential component of the evaluation of petroleum resources. Such analysis attempts to answer the question: for the input data and a given play potential, what is the most likely number of pools and what are their sizes? An example is provided to illustrate the usage and application of this conditional method of analysis. The example demonstrates that the method can be used as a feedback mechanism to challenge the underlying geological concepts of the play, and to yield more certain predictions if additional information is given. Consequently, the method will enhance the reliability of the estimates of petroleum resources.
KEY WORDS: oil, gas, play potential distribution, rth largest pool size, conditional probability

INTRODUCTION

A common task of petroleum geologists involves the derivation of an estimate of the oil and gas potential of a play or basin. The process of making a resource estimate varies greatly because of the different geological perspectives and methods employed. Consequently, significantly different estimates of resources may occur for the same hydrocarbon-bearing formation. In this paper, a feedback mechanism is developed as an attempt to resolve discrepancies between different estimates, and to validate basic input ingredients such as the play risk, the number of prospects, and the conditional pool size distribution. The successful application of the method will, of course, ultimately depend upon the interpretation of the relevant geologic setting. We shall use
¹Manuscript received 19 March 1982; revised 3 August 1982. Paper presented at the First MGUS Conference on the Management, Analysis, and Display of Geoscience Data, 27-29 January 1982, Golden, Colorado.
²Institute of Sedimentary and Petroleum Geology, Geological Survey of Canada, 3303 33rd Street N.W., Calgary, Alberta, Canada T2L 2A7.
³Department of Statistics, The University of Calgary, 2500 University Drive N.W., Calgary, Alberta, Canada T2N 1N4.
0020-5958/83/0400-0349$03.00/0
© 1983 Plenum Publishing Corporation
the same framework for petroleum resource evaluation as in Lee and Wang (1983). In that paper the unconditional distributions for the number of pools and the size of the rth largest pool were derived. These distributions are useful in examining the number of pools and the pool sizes that could exist in a play.

Petroleum resource evaluation methods should not only successively challenge relevant geological concepts but should also reduce the uncertainty of the estimates as additional information becomes available. In the present study, we investigate the distributions of the number of pools and the size of the rth largest pool conditional upon a given play potential and the input data. Conditional analysis is a useful tool that can help enhance the reliability of estimates.

An example of a play from a frontier Canadian basin is used to illustrate the method. In this play the method rapidly identifies the implications that must underlie various values of potential and provides important guidance as to the validity of some of the input data.

THE METHOD
In this section we outline a Monte Carlo algorithm for computing the rth largest pool sizes for a given play potential. It will be apparent later that the conditional distribution of the rth largest pool for a given play potential may involve a large number of multiple integrals; hence strictly numerical computation would not be feasible. However, if the conditional pool size distribution is discrete and the total number of pools in the play is not too large, the quantities of interest for the given play potential can be computed exactly by following the algorithm. Before describing the Monte Carlo algorithm, we review the framework developed by Lee and Wang (1983).

The Framework

Let $M$ denote the number of prospects and $X^*$ denote the prospect potential of a play. The play risk $\theta$ is defined to be the probability that a prospect contains hydrocarbons, that is, $\theta = P(X^* > 0)$. The conditional pool size distribution $H(x)$ is defined by
$$H(x) = P(X^* > x \mid X^* > 0), \qquad x > 0 \tag{1}$$

The prospect potential distribution $H^*(x)$ is related to the play risk $\theta$ and the conditional pool size distribution as follows:

$$H^*(x) = P(X^* > x) = \begin{cases} \theta, & x = 0 \\ \theta H(x), & x > 0 \end{cases} \tag{2}$$

That is, a prospect has positive probability of not having any potential when the play risk $\theta < 1$.
Following the superpopulation concept, we assume nature deposits prospect potentials $X_1^*, X_2^*, \ldots$ at random according to the prospect potential distribution $H^*$, and does so independently of the selection of the number of prospects, $M$. Further, we assume the conditional pool size distribution $H$ admits a probability density function $h(x)$, which need not be the lognormal density. Let $N$ denote the number of pools, and let $T$ be the uncertain quantity representing the potential of the play. Then the distribution of $N$ is given by
$$P_\theta(N = n) = \sum_{m=n}^{\infty} \binom{m}{n} \theta^n (1 - \theta)^{m-n} P(M = m), \qquad n = 0, 1, 2, \ldots \tag{3}$$

The play potential distribution, denoted by $F(t)$, is given by

$$F(t) = P(T > t) = \sum_{n=1}^{\infty} H_n(t)\, P_\theta(N = n), \qquad t \ge 0 \tag{4}$$

where $H_n(t)$ is the greater-than cumulative distribution of a sum of $n$ pool sizes, that is

$$H_n(t) = P(X_1^* + \cdots + X_n^* > t \mid X_1^* > 0, \ldots, X_n^* > 0) \tag{5}$$

Let $h_n(t)$ denote the probability density function of $H_n(t)$; then the absolutely continuous part of $F(t)$ is given by

$$f(t) = \lim_{\Delta t \to 0} \frac{P(t < T \le t + \Delta t)}{\Delta t} = \sum_{n=1}^{\infty} h_n(t)\, P_\theta(N = n), \qquad t > 0 \tag{6}$$
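The framework above can be illustrated by direct simulation. The following is a minimal sketch, not the authors' program: it assumes a lognormal pool size distribution, a play risk of 0.1, and a number of prospects spread evenly over 10 to 20, all hypothetical values chosen only to keep the example small. The function names `simulate_play_potential` and `exceedance` are likewise illustrative.

```python
import random

def simulate_play_potential(theta, prospect_counts, mu, sigma, rng):
    """Draw one play potential T under the superpopulation model."""
    # Number of prospects M, drawn from a hypothetical equally weighted list.
    m = rng.choice(prospect_counts)
    # Each prospect is a pool with probability theta, so N given M = m is binomial (Eq. 3).
    n = sum(1 for _ in range(m) if rng.random() < theta)
    # T is the sum of N independent pool sizes; the lognormal form of H is an assumption.
    return sum(rng.lognormvariate(mu, sigma) for _ in range(n))

def exceedance(samples, t):
    """Empirical estimate of F(t) = P(T > t), the greater-than form used in Eq. (4)."""
    return sum(1 for s in samples if s > t) / len(samples)

rng = random.Random(0)
prospect_counts = list(range(10, 21))   # hypothetical support for M
theta = 0.1                             # hypothetical play risk
samples = [simulate_play_potential(theta, prospect_counts, 0.0, 1.0, rng)
           for _ in range(20000)]

# Fraction of simulated plays with no pools at all, i.e. an estimate of P(N = 0).
p_dry = sum(1 for s in samples if s == 0.0) / len(samples)
print("estimated P(T = 0):", p_dry)
print("estimated F(1.0):", exceedance(samples, 1.0))
```

The point mass at zero is why only the absolutely continuous part of $F(t)$ has a density: a simulated play with no pools contributes exactly $T = 0$.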
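Conditional Distribution of Number of Pools Given that T = t

Let $P_\theta(N = n \mid t)$ denote the conditional probability of $n$ pools given that the total play potential is $t$. Clearly, $P_\theta(N = 0 \mid 0) = 1$ and $P_\theta(N \ge 1 \mid 0) = 0$. For $t > 0$

$$P_\theta(N = n \mid t) = [f(t)]^{-1} \lim_{\Delta t \to 0} \frac{P(t < T \le t + \Delta t,\; N = n)}{\Delta t}, \qquad n = 1, 2, \ldots \tag{7}$$

By assumption

$$\begin{aligned}
P(t < T \le t + \Delta t,\; N = n)
&= \sum_{m=n}^{\infty} P(t < X_1^* + \cdots + X_n^* \le t + \Delta t \mid X_1^* > 0, \ldots, X_n^* > 0,\; N = n,\; M = m) \\
&\qquad \cdot P(X_1^* > 0, \ldots, X_n^* > 0,\; N = n,\; M = m) \\
&= \sum_{m=n}^{\infty} \left[ \int_t^{t + \Delta t} h_n(y)\, dy \right] \binom{m}{n} \theta^n (1 - \theta)^{m-n} P(M = m) \\
&= P_\theta(N = n)\, h_n(t + \xi \Delta t)\, \Delta t, \qquad \text{for some } 0 \le \xi \le 1
\end{aligned}$$

Hence it follows that

$$P_\theta(N = n \mid t) = \begin{cases} 0, & n = 0 \\ [f(t)]^{-1} h_n(t)\, P_\theta(N = n), & n = 1, 2, \ldots \end{cases} \tag{8}$$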
This expression can also be obtained by an appeal to Bayes' theorem. The expectation and variance of $N$ given that $T = t$ are

$$E_\theta(N \mid t) = [f(t)]^{-1} \sum_{n=1}^{\infty} n\, h_n(t)\, P_\theta(N = n) \tag{9}$$

and

$$\sigma_\theta^2(N \mid t) = [f(t)]^{-1} \sum_{n=1}^{\infty} [n - E_\theta(N \mid t)]^2\, h_n(t)\, P_\theta(N = n) \tag{10}$$
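Equations (3), (6), and (8) to (10) can be evaluated directly when the convolution density $h_n(t)$ has a closed form. The sketch below assumes gamma-distributed pool sizes, for which the sum of $n$ pool sizes is again gamma with shape $n$ times the single-pool shape. The gamma parameters and the uniform distribution of the number of prospects are hypothetical and are not the case-study inputs; the function names are illustrative.

```python
import math

def prob_n_pools(theta, prospects_pmf):
    """Eq. (3): P(N = n) as a binomial mixture over the number of prospects M."""
    n_max = max(prospects_pmf)
    return [
        sum(math.comb(m, n) * theta**n * (1 - theta)**(m - n) * pm
            for m, pm in prospects_pmf.items() if m >= n)
        for n in range(n_max + 1)
    ]

def gamma_pdf(t, shape, scale):
    """Gamma density, evaluated through logarithms for numerical stability."""
    log_pdf = ((shape - 1) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def conditional_n_given_t(t, theta, prospects_pmf, shape, scale):
    """Eq. (8): P(N = n | T = t) for t > 0, assuming gamma pool sizes."""
    p_n = prob_n_pools(theta, prospects_pmf)
    # h_n(t) is the density of a sum of n i.i.d. gamma(shape, scale) pool sizes,
    # which is gamma(n * shape, scale); n = 0 carries zero weight when t > 0.
    weights = [0.0] + [gamma_pdf(t, n * shape, scale) * p_n[n]
                       for n in range(1, len(p_n))]
    f_t = sum(weights)  # Eq. (6)
    return [w / f_t for w in weights]

def mean_and_variance(post):
    """Eqs. (9) and (10): conditional mean and variance of N given T = t."""
    mean = sum(n * p for n, p in enumerate(post))
    var = sum((n - mean) ** 2 * p for n, p in enumerate(post))
    return mean, var

# Hypothetical inputs: M uniform on 80..120, play risk 0.065, gamma(0.5, 1.0) pool sizes.
prospects_pmf = {m: 1 / 41 for m in range(80, 121)}
post_small = conditional_n_given_t(0.5, 0.065, prospects_pmf, shape=0.5, scale=1.0)
post_large = conditional_n_given_t(8.0, 0.065, prospects_pmf, shape=0.5, scale=1.0)
print("E(N | t = 0.5):", mean_and_variance(post_small)[0])
print("E(N | t = 8.0):", mean_and_variance(post_large)[0])
```

Under these assumptions a larger play potential shifts the conditional distribution of $N$ toward more pools, because the ratio $h_{n+1}(t)/h_n(t)$ increases with $t$ for gamma pool sizes.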
Conditional Distribution of the rth Largest Pool Given that T = t
Let $X^*_{(r)}$ denote the rth largest prospect potential for $r = 1, 2, \ldots$. The quantity defined by

$$\mathrm{EPS}_r(t) = E(X^*_{(r)} \mid T = t,\; X^*_{(r)} > 0), \qquad t > 0 \tag{11}$$

is the conditional expected size of the rth largest pool given that the total play potential is $t$. Let $\mathrm{EPP}_r(t)$ denote the conditional expected rth largest prospect size given that $T = t$. Then

$$\mathrm{EPP}_r(t) = E(X^*_{(r)} \mid T = t) = \sum_{n=1}^{\infty} E(X^*_{(r)} \mid T = t,\; N = n)\, P_\theta(N = n \mid t) \tag{12}$$

Note that

$$E(X^*_{(r)} \mid T = t,\; N = n) = 0 \qquad \text{for } n < r \tag{13}$$

and

$$\sum_{r \ge 1} \mathrm{EPP}_r(t) = t \tag{14}$$

Now, recall that for any three random variables $X$, $Y$, and $Z$ we have $E(X \mid Z) = E[E(X \mid Y, Z) \mid Z]$ almost surely. Therefore

$$\mathrm{EPP}_r(t) = \mathrm{EPS}_r(t)\, P_\theta(N \ge r \mid t) \tag{15}$$
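since the rth largest prospect is a pool if and only if there are at least $r$ pools. By combining (12) and (15), it follows that

$$\mathrm{EPS}_r(t) = \sum_{n=r}^{\infty} E(X^*_{(r)} \mid T = t,\; N = n)\, \frac{P_\theta(N = n \mid t)}{P_\theta(N \ge r \mid t)} \tag{16}$$

By a similar argument, the conditional distribution of the rth largest pool given that $T = t$ is given by

$$L_r(x \mid t) = P(X^*_{(r)} > x \mid t,\; X^*_{(r)} > 0) = \sum_{n=r}^{\infty} P(X^*_{(r)} > x \mid t, n)\, \frac{P_\theta(N = n \mid t)}{P_\theta(N \ge r \mid t)} \tag{17}$$

for $0 < x < t$. Given $n$ pools whose sizes sum to $t$, the conditional joint density of the first $n - 1$ pool sizes is

$$h_{n,t}(\mathbf{x}) = \begin{cases} [h_n(t)]^{-1}\, h(t - x_1 - \cdots - x_{n-1}) \prod_{j=1}^{n-1} h(x_j), & \mathbf{x} \in \chi_{n,t} \\ 0, & \text{otherwise} \end{cases} \tag{18}$$

where $\mathbf{x} = (x_1, \ldots, x_{n-1})$ and $\chi_{n,t} = \{\mathbf{x} : x_i > 0,\ 1 \le i \le n - 1,\ \text{and } x_1 + \cdots + x_{n-1} < t\}$. Denote the rth largest pool out of $n$ pools by $X_{(r)}$. Let $S_n$ be the sum $X_1 + \cdots + X_n$. Then for $n \ge r$

$$P(X^*_{(r)} > x \mid t, n) = P(X_{(r)} > x \mid S_n = t), \qquad 0 < x < t \tag{19}$$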
This probability involves a multitude of multiple integrals over various $(n - 1)$-dimensional regions, and it does not appear to have an analytically tractable form in terms of the input distributions as in the unconditional case. For example, the conditional distribution of the largest pool given that $T = t$ and $N = 3$ pools is of the form
P(X(1)~x[S3-~t)-~ [3ff +3ffR +2 0 ~ x ~ t where
R~ =((x~,x2): 0 < x 2 <<.x
and
x 1 + x 2 /~/" - x}
R 2 = {(Xl,X2): X
and
xl q- x 2 / ~ t - x}
R3 ={(Xl,X2): X < x i , i= 1,2
and
xl + x 2 > ~ t - x }
For large values of $n$, this probability will become prohibitively expensive to compute numerically.

A Monte Carlo Algorithm
We now provide a Monte Carlo algorithm for estimating the percentiles of $L_r(x \mid t)$ and $\mathrm{EPS}_r(t)$. Basically, the algorithm generates a sample of specified size, $K$, from the conditional distribution $L_r(x \mid t)$. Then the population quantities are estimated by their sample counterparts. A sample value can be generated as follows:

(i) generate an $n$ (number of pools) from the probability function defined by

$$Q_r(n \mid t) = \frac{P_\theta(N = n \mid t)}{P_\theta(N \ge r \mid t)}, \qquad n = r, r + 1, \ldots \tag{20}$$

(ii) generate an $x$ from $P(X_{(r)} > x \mid S_n = t)$.
This process is continued until the required number of simulations, $K$, has been achieved. Of course, in practice, we would not generate the pool sizes one at a time but would simultaneously generate a batch of them for each distinct value of $n$.

If the convolution density $h_n(t)$ has a mathematically closed form, such as when the conditional pool size density $h$ is a member of the gamma family, then the computation of $Q_r(n \mid t)$ does not present any difficulty at all. When it does not, various numerical techniques are available for approximating $h_n(t)$. The convolution algorithm we are currently using is an extension of the cubic spline interpolation algorithm that was originally proposed by Cléroux and McConalogue (1976). The extended algorithm, developed by McConalogue (1981), allows $h(x)$ to be unbounded at the origin. This means that the extended algorithm can handle extremely skewed conditional pool size distributions, which one frequently encounters in petroleum resource evaluation. We have found empirically that the extended algorithm provides numerically accurate results. Furthermore, it handles the lognormal distribution with ease, whereas the method of Fourier inversion does not. This is especially so when the shape parameter $\sigma$ of the lognormal distribution gets large.

The major difficulty of the Monte Carlo algorithm is in the generation of rth largest pool sizes for the given number of pools and the fixed play potential. To accomplish this we note that the partial sum $S_l$ of $l$ pool sizes $X_1, X_2, \ldots, X_l$ satisfies the recursive relationship

$$S_l = S_{l-1} + X_l, \qquad l = 1, 2, \ldots, n \tag{21}$$
Since $S_{l-1}$ and $X_l$ are independent, it is easily seen that

$$P(X_l \le x \mid S_l = y) = [h_l(y)]^{-1} \int_0^x h_{l-1}(y - z)\, h(z)\, dz \tag{22}$$

for $0 < x < y$. Hence, by starting with $l = n$ and $S_n = t$, pool sizes $x_n, x_{n-1}, \ldots, x_1$ can be recursively generated backward from the conditional probability (22), while updating $y$ after each generation. The resulting sample of $n$ pool sizes will satisfy the constraint that their sum equals the given play potential $t$. Upon ordering the values of this sample, a realization of the rth largest pool size satisfying the constraint is then obtained.

Because of the constraint, in practice, one does not have to generate all of the $n$ pool sizes to get the rth largest pool for the fixed $r$. For example, when generating the largest pool size satisfying the constraint $S_n = t$, the process of pool size generation can be stopped as soon as the maximum of the generated values is greater than or equal to the total of the remaining potentials. That is, stop after the jth generation if

$$\max(x_n, \ldots, x_{n-j+1}) \ge t - x_n - \cdots - x_{n-j+1}$$

where $j = 1, 2, \ldots, n - 1$. Similar stopping mechanisms can be devised for the second largest pool size, the third largest, and so on.
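The two sampling steps and the backward recursion can be sketched for the gamma family, where Eq. (22) has a closed form: given $S_l = y$, the ratio $X_l / y$ follows a beta distribution with parameters (shape, $(l - 1)$ times shape). This is an illustration under that assumption, not the spline-based program used in the paper. The probabilities in `q_1` are hypothetical, and for simplicity the sketch generates all $n$ pool sizes rather than applying the early-stopping rule.

```python
import random

def sample_pools_given_total(n, t, shape, rng):
    """Generate n pool sizes summing to t by the backward recursion of Eq. (22)."""
    sizes, y = [], t
    for l in range(n, 1, -1):
        # For gamma pool sizes, X_l / S_l given S_l = y is Beta(shape, (l - 1) * shape).
        x = y * rng.betavariate(shape, (l - 1) * shape)
        sizes.append(x)
        y -= x          # remaining potential: S_{l-1} = S_l - X_l
    sizes.append(y)     # X_1 takes whatever potential is left
    return sizes

def sample_rth_largest(r, t, q_r, shape, rng):
    """One draw from L_r(x | t): step (i) picks n from Q_r, step (ii) picks the size."""
    ns = list(q_r)
    n = rng.choices(ns, weights=[q_r[k] for k in ns])[0]
    sizes = sorted(sample_pools_given_total(n, t, shape, rng), reverse=True)
    return sizes[r - 1]

rng = random.Random(1)
q_1 = {5: 0.2, 6: 0.3, 7: 0.3, 8: 0.2}   # hypothetical Q_1(n | t), Eq. (20)
check = sample_pools_given_total(6, 4.1, 0.5, rng)
draws = [sample_rth_largest(1, 4.1, q_1, 0.5, rng) for _ in range(5000)]
eps_1 = sum(draws) / len(draws)          # sample estimate of EPS_1(t)
print("sum of one constrained sample:", sum(check))
print("estimated EPS_1(4.1):", eps_1)
```

Percentiles of $L_r(x \mid t)$ would be estimated in the same way, by sorting `draws` and reading off the required order statistics.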
A CASE STUDY

The Input Data

A Canadian frontier play was analyzed as an example to illustrate the applications of the conditional analyses described in the previous section. This frontier play is one for which there is limited geological control from exploratory drilling (Lee and Wang, 1983). In the present paper the conditional pool size distribution was approximated by a lognormal distribution (Fig. 1). The play risk was estimated to be 0.065. The distribution of the number of prospects is displayed in Fig. 2. The expected in-place play potential is 4.10 billion barrels of oil (Fig. 3).

Fig. 1. Conditional pool size distribution. [Horizontal axis: pool size in 10^9 bbl and 10^6 m^3.]

Fig. 2. Distribution of number of prospects. [Horizontal axis: number of prospects.]

Fig. 3. Play potential distribution. [Horizontal axis: play potential in 10^9 bbl and 10^6 m^3.]

The Conditional Distribution $P_\theta(N = n \mid t)$

In analyzing this play, one of the questions that arises with the input data unchanged is: what is the most likely number of pools that could exist for different values of play potential? The conditional distribution of the number of pools for a given play potential $t$ was computed for values $t$ = 646 × 10^6, 2.70 × 10^9, and 7.83 × 10^9 barrels of oil. The results, together with the unconditional distribution $P_\theta(N = n)$, are displayed in Fig. 4. The following observations can be made from this figure.

(1) The unconditional distribution $P_\theta(N = n)$ has a mode at six pools. This means that, in the absence of certainty about the potential of this play, six is the most likely number of pools.

(2) Given that the play potential $t$ is known, the mode of $P_\theta(N = n \mid t)$ increases from 4 to 8 as $t$ increases.

(3) As the play potential increases 12-fold, the mode of $P_\theta(N = n \mid t)$ only doubles its value. This implies that the individual pool sizes must increase as $t$ increases.

(4) When the play potential is set at 7.83 billion barrels, which is almost twice the expected play potential, the most likely number of pools is 8, which seems small. This is due to the fact that the mode is bounded by the product of the maximum number of prospects, which is 119 in this case, and the small play risk, 0.065.

Thus, according to the input data, this play would probably not contain more than eight pools. This type of analysis can provide a mechanism for geologists to examine how the mode of the conditional distribution $P_\theta(N = n \mid t)$ varies with play potential. The next section will discuss expected largest pool sizes for given play potentials.
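The arithmetic behind observations (3) and (4) can be reproduced in a few lines. The sketch below only restates numbers quoted in the text; the variable names are illustrative.

```python
max_prospects = 119          # maximum number of prospects quoted in observation (4)
play_risk = 0.065            # play risk from the input data

# Product referred to in observation (4): maximum prospects times play risk.
bound = max_prospects * play_risk          # about 7.7 pools

# Play potentials (barrels) at which the conditional distribution was computed.
potentials = [646e6, 2.70e9, 7.83e9]
fold_increase = potentials[-1] / potentials[0]   # about 12

# Modes of P(N = n | t) quoted in observation (2) for the smallest and largest t.
mode_ratio = 8 / 4

print(round(bound, 2), round(fold_increase, 1), mode_ratio)
```

A product of about 7.7 is consistent with the statement that the play would probably not contain more than eight pools, and the potential grows roughly twelve times while the mode only doubles.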
The Conditional Expectation $\mathrm{EPS}_r(t)$
Having computed the play potential distribution, one measure of the total potential of the play is the mean of this distribution. However, the data that we have used might not well represent the play. Based on geological concepts, geologists might choose a value other than the mean from the distribution as a point estimate of the play potential. The question now arises whether the play potential chosen by geologists is compatible with their interpretation and the input data.
Fig. 4. Unconditional and conditional distribution of the number of pools for different play potentials. [Panels: P(N = n); P(N = n | 646 × 10^6 bbl); P(N = n | 2.7 × 10^9 bbl); P(N = n | 7.8 × 10^9 bbl). Horizontal axis: number of pools.]
Guidelines can be developed to limit the range of potentials from the play potential distribution and to provide feedback mechanisms which challenge the geological concepts as well as the input data. The conditional distribution of the size of the rth largest pool for a given potential implies a likelihood of coexistence of a set of play characteristics which can be geologically evaluated and, hence, the reliability of the estimate is improved.

Table 1 gives expectations of the first three largest pool sizes for different play potentials. Table 2 displays the unconditional rth largest pool size distributions. Table 3 displays the conditional distributions of the first three largest pool sizes for the given play potentials 2.70, 4.10, and 7.83 billion barrels. The information that can be extracted from Tables 1, 2, and 3 is summarized below.

(1) The largest pool size increases as the play potential increases (Table 1). This is not surprising because the mode of the number of pools is essentially constant for large play potentials.

(2) The unconditional largest pool size distribution (Table 2) ranges from 220 (at the 95th percentile) to 8177 (at the 5th percentile) million barrels. In this case, the weighted sum of all pool sizes is 4.10 billion barrels, which is the mean of the play potential. On the other hand, if the play potential is given as 4.10 billion barrels, then the conditional distribution of the largest pool size ranges from 1203 (at the 95th percentile) to 3603 (at the 5th percentile) million barrels (Table 3). In other words, the size range of the conditional case is much narrower than that of the unconditional case. This implies that additional information reduced uncertainty.

(3) In the unconditional case (Table 2), the 5th percentile of the third largest pool size overlaps onto the 22nd percentile of the second largest pool size, whereas the 5th percentile of the second largest pool size overlaps onto the 30th percentile of the largest pool size. On the other hand, if the play potential is given as 4.10 billion barrels (Table 3), then the 5th percentile of the third largest pool size overlaps onto the 47th percentile of the second largest pool size distribution, whereas the 5th percentile of the second largest pool size overlaps onto the 87th percentile of the largest pool size distribution. Thus, the overlapping area in the conditional case is also much narrower than that of the unconditional case, and the conditional case hence possesses more discriminatory power.

These examples illustrate the fact that the more information we use, the better the prediction we will have. Thus conditional analysis can be used as a mechanism to challenge our geological concepts as well as to improve the reliability of resource evaluations.

Table 1. The Expectations of the First Three Largest Pool Sizes for Different Values from the Play Potential Distribution

Play potential     Pool size in 10^6 barrels (standard error)a        Mode of
(10^9 barrels)     Largest          Second          Third             P(N = n | t)
  1.0                 499 (17)        236 (9)         129 (6)          5
  2.0               1,007 (36)        433 (16)        247 (12)         6
  3.0               1,550 (54)        669 (25)        348 (16)         7
  3.5               1,887 (69)        718 (31)        406 (20)         7
  4.1               2,217 (74)        882 (37)        428 (23)         7
  5.0               2,815 (90)        958 (40)        551 (31)         7
  6.0               3,467 (115)     1,242 (63)        577 (38)         8
  6.5               3,888 (125)     1,267 (57)        612 (37)         8
  7.0               3,923 (130)     1,421 (61)        698 (41)         8
  7.5               4,178 (131)     1,436 (71)        718 (39)         8
  8.0               4,833 (153)     1,592 (74)        733 (46)         8
 10.0               6,087 (191)     1,974 (113)       771 (57)         8
 12.0               7,730 (231)     2,172 (126)       986 (74)         8
 30.0              24,305 (482)     3,808 (354)     1,210 (139)        8

aNumber in parentheses is the standard error in 10^6 barrels.

Table 2. The Unconditional Distributions of the First Three Largest Pool Sizes (pool size in 10^6 barrels)

The rth              Upper percentile
pool       Mean      95      90      75      50      25      10       5
1          2487     220     334     641    1296    2666    5301    8177
2           769      84     134     265     513     950    1638    2272
3           389      39      66     137     273     500     832    1120

Table 3. The Conditional Distributions for Different Play Potentials (pool size in 10^6 barrels)

Given play potential   The rth              Upper percentile
(10^9 barrels)         pool       Mean      95      90      75      50      25      10       5
2.70                   1          1300     706     733     964    1183    1569    2031    2289
                       2           625     258     315     462     661     801     912     978
                       3           309      44      95     215     307     384     496     575
4.10                   1          2217    1203    1293    1670    2109    2727    3275    3603
                       2           882     234     392     647     869    1074    1456    1532
                       3           428      73     125     277     414     580     734     878
7.83                   1          4707    2309    2614    3657    4458    6106    6910    7228
                       2          1447     254     442     838    1443    1997    2498    2891
                       3           740      99     187     440     710    1105    1232    1447
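The narrowing of the largest pool size range described in observation (2) can be checked directly from Tables 2 and 3. The sketch below is only an illustration: the ratio of the 5th to the 95th upper percentile is a hypothetical spread measure chosen for this example and does not appear in the paper.

```python
# Upper percentiles of the largest pool size (10^6 barrels), copied from Tables 2 and 3.
unconditional = {95: 220, 5: 8177}
conditional = {
    2.70: {95: 706, 5: 2289},
    4.10: {95: 1203, 5: 3603},
    7.83: {95: 2309, 5: 7228},
}

def spread_ratio(percentiles):
    """Ratio of the 5th to the 95th upper percentile (a hypothetical spread measure)."""
    return percentiles[5] / percentiles[95]

print("unconditional", round(spread_ratio(unconditional), 1))
for potential, percentiles in conditional.items():
    print(potential, round(spread_ratio(percentiles), 1))
```

The unconditional ratio is roughly 37, whereas each conditional ratio is close to 3, which is the sense in which fixing the play potential reduces uncertainty about the largest pool.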
For a given play potential, the minimum value of the rth largest pool size is known with near certainty. For example, if the play potential is given as 7.83 billion barrels, then the largest pool size is at least 2.309 billion barrels (the 95th upper percentile in Table 3) with high probability. This information can be used as a guide to determine the maximum play potential that could exist. Based on the geological setting of a play, geologists may be able to determine whether hydrocarbons accumulated in concentrated or dispersed traps. Such interpretations may relate the largest pool size to the total play potential. Using such information, limits can be placed on the possible play potential range. If the largest pool size or the sum of the first few largest pool sizes does not agree with our geological concepts, then the input data must be reexamined.

The conditional distribution of the rth largest pool size given that $T = t$ (see Eq. 17) is unconditional with respect to the number of pools. Thus, once we have chosen an acceptable play potential, we can examine what number of pools would yield the best fit to the rth largest pool sizes. This tool can be applied to a situation for which we have knowledge about the first few largest pool sizes. Take the play potential of 4.10 billion barrels as an example: the expected largest pool sizes for different numbers of pools $n$ are listed in Table 4. As can be seen, the pool size changes as $n$ changes. If three billion barrels of oil in-place is expected for the largest pool in this play, then $n$ ranging from 10 to 13 is not likely to be acceptable when $T$ = 4.10 billion barrels.

Table 4. The Expected Largest Pool Sizes for Different Numbers of Pools When T = 4.10 × 10^9 Barrels

Number of pools (n)    Largest pool size (10^9 barrels)
 3                     3.36
 4                     3.30
 5                     2.43
 6                     2.33
 7                     2.31
 8                     2.00
 9                     2.09
10                     1.68
11                     1.52
12                     1.54
13                     1.45

Thus far, we have analyzed a play based on conditional analysis and concluded that the output challenges the geological concepts and that the input should be reexamined for the following reasons:

1. The input data do not support the concept that the play would comprise more than eight pools.
2. The input suggests that the expected largest pool size amounts to over half of the play potential. If this is not acceptable, then the mean of the pool size distribution is probably too large.

SUMMARY

The conditional analysis described herein attempts to answer the following question: for the input data and a given play potential, what is the most likely number of pools and what are their sizes? The implications may lead geologists to reexamine their data and concepts from a different perspective. Furthermore, procedures for resource evaluations should (1) yield more certain predictions as more geological information is added to the system and (2) challenge the underlying geological concepts. Conditional analysis can fulfill these requirements and also enhance the reliability of estimates.

ACKNOWLEDGMENTS

We wish to thank G. M. Kaufman, R. M. Procter, G. C. Taylor, J. A. Wade, K. J. Roy, and N. E. Haimila for comments and discussions on the manuscript. We also would like to thank D. J. McConalogue for providing the FORTRAN program for the generalized convolution algorithm.

REFERENCES

Cléroux, R., and McConalogue, D. J., 1976, A numerical algorithm for recursively-defined convolution integrals involving distribution functions: Manag. Sci., v. 22, no. 10, p. 1138-1146.
Lee, P. J., and Wang, P. C. C., 1983, Probabilistic formulation of a method for the evaluation of petroleum resources: Math. Geol., v. 15, no. 1, p. 163-181.
McConalogue, D. J., 1981, Numerical treatment of convolution integrals involving distributions with densities having singularities at the origin: Comm. Stat. Simula. Computa., B10(3), p. 265-280.