On Selection of Analog Volcanoes

Estimating the occurrence probability of volcanic eruptions with VEI ≥3 is challenging in several aspects, including data scarcity. A suggested approa...

0 downloads 40 Views 572KB Size

Download PDF

Math Geosci (2011) 43:505–519 DOI 10.1007/s11004-011-9345-6

On Selection of Analog Volcanoes Armando Rodado · Mark Bebbington · Alasdair Noble · Shane Cronin · Gill Jolly

Received: 20 October 2010 / Accepted: 30 April 2011 / Published online: 9 June 2011 © International Association for Mathematical Geosciences 2011

Abstract Estimating the occurrence probability of volcanic eruptions with VEI ≥3 is challenging in several aspects, including data scarcity. A suggested approach has been to use a simple model, where eruptions are assumed to follow a Poisson process, augmenting the data used to estimate the eruption onset rate with that from several analog volcanoes. In this model the eruption onset rate is a random variable that follows a gamma distribution, the parameters of which are estimated by an empirical Bayes analysis. The selection of analog volcanoes is an important step that needs to be explicitly considered in this model, as we show that the analysis is not always feasible due to the required over-dispersion in the resulting negative binomial distribution for the numbers of eruptions. We propose a modification to the method which allows for both over-dispersed and under-dispersed data, and permits analog volcanoes to be chosen on other grounds than mathematical tractability.

A. Rodado IFS–Statistics, Massey University, Palmerston North, New Zealand e-mail: [email protected] M. Bebbington () · S. Cronin Volcanic Risk Solutions, Massey University, Palmerston North, New Zealand e-mail: [email protected] S. Cronin e-mail: [email protected] A. Noble Plant and Food Research Limited, Lincoln, New Zealand e-mail: [email protected] G. Jolly GNS Science, Taupo, New Zealand e-mail: [email protected]

506

Math Geosci (2011) 43:505–519

Keywords Empirical Bayes · Conway–Maxwell–Poisson process · Information gain

1 Introduction With a growing and geographically-expanding human population, volcanic hazard assessment is becoming more prominent as an important tool for assessing volcanic risk and future planning. For land use and evacuation planning, probabilistic hazard models are required, as deterministic warning systems operate only on the shortest of timescales. The simplest stochastic model that can be fitted to volcanic eruption data is the homogeneous Poisson model, characterized by having a constant instantaneous rate of eruption onsets over time (Wickman 1966). However, most volcanoes appear to exhibit more complex behavior. Models which have been developed to allow for this include the nonhomogeneous Poisson process (Ho 1991), where the likelihood of an eruption increases or decreases over time; renewal processes (Bebbington and Lai 1996a), where the likelihood varies as a function of the current repose length; and trend-renewal processes (Bebbington 2010), which combine the two. Although the effect of eruption size on the time to the next event has been studied using the generalized time predictable model (Marzocchi and Zaccarelli 2006) and the history dependent point process (Bebbington 2008), the size of future eruptions has not been the subject of analysis, with a few exceptions such as hidden Markov models (Bebbington 2007), where different volcanic regimes can have varying eruption size distributions. The stochastic models mentioned above model data from a single volcano and, crucially, assume that smaller eruptions provide information about the likelihood of larger eruptions at the same volcano. However, this assumption is debatable: Consider the two Javan stratovolcanoes Semeru and Kelut. In the last 200 years (Siebert and Simkin 2002), the former has had 58 eruptions, of which 56 had VEI 2, and only two exhibited VEI 3. In the same time span, Kelut has had seven eruptions of VEI 2, two of VEI 3, and no less than five of VEI 4. Hence it might be desirable to confine ourselves to large eruption data when estimating the likelihood of large eruptions, but the scarcity of such records from a single volcano can make this problematic. Moreover, small data sets are unlikely to be demonstrably non-Poisson, which is not the same thing as being Poisson, and take no account of the likely observational bias in the estimated rate. At a higher level, it has been shown (De la Cruz-Reyna 1991) that aggregating volcanic records for a given VEI can produce demonstrably Poisson processes. While the aggregation is key here, it does indicate that by considering groups of volcanoes rather than individual volcanoes, our Poisson assumption is likely to be on somewhat firmer ground. This leads to the possibility of using data from analog volcanoes to enhance the information about a target volcano, see Marzocchi et al. (2004). An analog volcano should be one with similar characteristics, detailed discussion of which we will leave for later. However they are defined, developing methods to integrate new data have the potential of not only improving our predictive capacity of future eruptions but also our scientific understanding of volcanic eruption processes.

Math Geosci (2011) 43:505–519

507

Solow (2001) proposed a Poisson composite model where the eruption process at each volcano is a Poisson process and where the Poisson process rates come from a gamma distribution. This implies that the observed number of eruption onsets has a negative binomial distribution. An empirical Bayes methodology can be used to estimate the parameters in the gamma distribution, while also accounting for uncertainty in the observed eruption rates. This was exemplified using six volcanoes from the Mexico-Central American region. However, it is not clear whether or not the methodology is applicable to other groups of reasonable analog volcanoes. We will show that this is not the case, because it is common to find data sets that are not feasible for the model due to under-dispersed data being incompatible with the over-dispersed nature of the negative binomial distribution (Wilson et al. 1986). In addition, we consider the inherently dynamic nature of the method, as new eruptions, or indeed the lack of them, over time alter the parameter estimation. This can result in a feasible set of analog volcanoes becoming infeasible. In this paper, we will propose an alternative procedure that avoids the problem of under-dispersion, while remaining consistent with the result of Solow (2001) in cases where the latter procedure is feasible. The remainder of the paper is organized as follows: In Sect. 2 we will review the empirical Bayes model (Solow 2001), and outline in Sect. 3 how the data set can be augmented to examine, in Sect. 4, the sensitivity of the empirical Bayes methodology. Section 5 introduces a refinement on Solow’s methodology, constructing a modification of the Poisson process that allows for under-dispersed data. The choice of analog volcanoes is discussed from both statistical and geophysical perspectives in Sect. 6.

2 Empirical Bayes Methodology In the empirical Bayes methodology as implemented by Solow (2001), we assume that a particular volcano follows a Poisson process with eruption rate Λ which is a random variable, and that N and T denote the random variables representing the number of eruptions and the length of the observation windows, respectively. Instances of the random variables are represented by lower case letters. Then, the posterior probability density for the eruption rate λ at the target volcano is given by p(λ|n) =

p(n|λ)π(λ) , p(n)

where λ ≥ 0.

The likelihood of the observed number of eruptions is p(n|λ) =

(λt)n exp(−λt) , n!

and it is assumed that the prior is gamma distributed, for example, π(λ) = β λ / (α) λα−1 exp(−βλ), where α, β > 0.

(1)

Then the posterior for λ is also gamma distributed with parameters α = α + n and β = β + t, and the distribution of N is a negative binomial with parameters α and

508

Math Geosci (2011) 43:505–519

γ = β/t p(n) =

α γ (α ) (γ + 1)−n , n!(α) γ + 1

where α = α + n and β = β + t. To implement this model, the parameters α and β need to be estimated. The approach proposed by Solow (2001) was to use a group of m analogous volcanoes that are assumed to have independent eruption rates Λ1 , Λ2 , . . . , Λm with the same probability density function π(λ). This involves estimating the parameters in α m γ (α ) (γ + 1)−ni , p(n1 , n2 , . . . , nm ) = ni !(α) γ + 1

(2)

i=1

either via method of moments, or maximum likelihood. To estimate the parameters α and β of the negative binomial distribution (2) by the method of moments for a pair (n(m) , t (m) ), where n(m) = (n1 , n2 , . . . , nm ) is the data set of eruption counts with observation windows t (m) = (t1 , t2 , . . . , tm ), we define r (m) = (r1 , r2 , . . . , rm ), where ri = ni /ti . The first two moments of Ri are E(Ri ) = α/β and Var(Ri ) = α/(β 2 ) + α/(βti ). By equating the sample and theoretical moments, we get the method of moment estimates of α and β αˆ = R¯ βˆ

(3)

and βˆ =

SR2

R¯ , − R¯ t¯−1

(4)

−1 (m) and S 2 is the sample variance ¯ where t¯−1 = (1/m) m i=1 ti , R is the mean of r R of r (m) . Observe that since Var(Ri ) > E(Ri )ti−1 , we need SR2 − R¯ t¯−1 > 0. We will say that the negative binomial is over-dispersed in this sense, and if SR2 − R¯ t¯−1 < 0, we will say that the data are under-dispersed. As the resulting estimates of α and β are negative for under-dispersed data, the method clearly breaks down for such data. The estimation can also be erratic if SR2 − R¯ t¯−1 is small, and this is even more pronounced in maximum likelihood estimation (Wilson et al. 1986). A bootstrap algorithm proposed by Laird and Louis (1987) can be used to account for observational uncertainty in the parameter estimates due to the short records involved. However, the resulting posterior distribution differs little from the naive estimate, and we will omit the elaboration from the following in the interest of simple exposition.

3 Data In this section, we will reconsider the empirical Bayes analysis example for the Mexico-Central America region performed by Solow (2001). The data given in Table 1 for eruptions (VEI ≥ 3) of the volcanoes Colima, Santa Ana, Izalco, San Miguel,

Math Geosci (2011) 43:505–519

509

Table 1 Extended list of Mexico-Central America volcanoes No

Volcano

Observation period

1

Colima

1576–1981

7

406

2

Santa Ana

1520–1980

4

461

3

Izalco

1770–1980

1

211

4

San Miguel

1586–1980

0

395

5

Momotombo

1550–1980

2

431

6

Atitlan

1469–1980

2

512

7

Popocatepetl

1347–1980

1

634

8

Orizaba

1545-1980

4

436

9

Irazu

1723–1980

2

258

N

T

Momotombo and Irazu are reproduced from Simkin et al. (1981). Although the source has been updated (Siebert and Simkin 2002), we will consider the original data used by Solow (2001). To examine the effect of selecting a given set of analog volcanoes, we extend the list of volcanoes by adding the Mexico-Central American stratovolcanoes Atitlan, Popocatepetl and Orizaba. The observation period is obtained by starting with the first historical date (VEI ≥ 2) and finishes in 1980, to be consistent with Solow (2001). From the list in Table 1, the first eight volcanoes are designated as reference set I = {1, 2, 3, 4, 5, 6, 7, 8}, where 1 = Colima, 2 = Santa Ana, 3 = Izalco, 4 = San Miguel, 5 = Momotombo, 6 = Atitlan, 7 = Popocatepetl, and 8 = Orizaba. We will consider the collection of all 56 subsets of I of size five. Groups of volcanoes will be represented by index sets, sorted in increasing order V1 = {1, 2, 3, 4, 5} V2 = {1, 2, 3, 4, 6} ··· V5 = {1, 2, 3, 5, 6} ··· V56 = {4, 5, 6, 7, 8}. For example, V1 is the original group of volcanoes Colima, Santa Ana, Izalco, San Miguel and Momotombo, studied by Solow (2001). In all the cases, the target volcano will be Irazu. A pertinent question is whether a gamma distribution fits the set of rates from a set of volcanoes. Ignoring for the present the necessity of estimating α and β via (2), it is possible to obtain a naive estimate by simply plugging the observed rates into the ¯ 2 ) and αˆ = (R/s ¯ R )2 . distribution (1) using the method of moment estimates βˆ = R/(s R Goodness of fit to the gamma distribution can then be assessed using the (small sample) methods of Wilding and Mudholkar (2008). This test is based on the fact that the mean x¯ and coefficient of variation c = s/x¯ of a random sample x1 , x2 , . . . , xq ;

510

Math Geosci (2011) 43:505–519

√ Fig. 1 The standardized statistic r(G)/ 3 + 10/α

q ≥ 3, are independent if and only if the population is gamma. The statistic q

− x)(c ¯ −i − c) ˜ , q ¯ 2 i=1 (c−i − c) ˜2 i=1 (x¯ −i − x)

r(G) = q

i=1 (x¯ −i

where the subscript −idenotes the quantity calculated while omitting the ith obserq vation, and c˜ = (1/q) i=1 c−i , has an asymptotically normal distribution with zero mean and variance (3 + 10/α)/q under the null hypothesis of a gamma distribution. Performing this test for the 56 subsets described above, augmenting each subset by Irazu for a sample size of q = 6, we find that they all lie within one standard error (Fig. 1) and hence there is no evidence that a gamma distribution is not an appropriate prior.

4 Sensitivity Analysis The empirical Bayes analysis was implemented by Solow (2001) for Irazu as the target volcano, with V1 as the analog set. The estimates from (3) and (4) were αˆ = 2.29 and βˆ = 324.47, with resulting mean and variance of (0.0074, 0.000015) for the estimates of p(λ|n) based on 1000 bootstrap samples. We will now examine a number of what might be considered sensitivity issues. Firstly, can we use a different set of analog volcanoes and, if so, how much difference does it make in the estimated hazard?

Math Geosci (2011) 43:505–519

511

Table 2 Parameter estimates in the empirical Bayes analysis Subset

αˆ

βˆ

Subset

βˆ

αˆ

V1

2.29

324.47

V29

3.26

436.57

V2

2.02

291.95

V30

2.80

381.65

V3

1.38

214.62

V31

0.86

157.17

V4

3.56

446.36

V32

1.79

256.42

V5

6.33

807.86

V33

1.27

194.74 181.70

V6

3.16

428.84

V34

1.16

V7

56.05

6301.84

V35

2.33

318.88

V8

2.72

375.89

V36

−7.62

−1735.78

V9

18.51

2116.25

V37

22.24

5664.69

V10

5.38

649.70

V38

−19.49

−3579.23

V11

1.75

253.63

V39

11.83

3128.57

V12

1.24

193.07

V40

−88.78

−16753.52

V13

2.88

362.69

V41

5.80

1199.66

V14

1.13

180.27

V42

−4.06

−861.68

V15

2.52

322.98

V43

−3.66

−588.48

V16

1.71

233.69

V44

−6.50

−1128.39

V17

2.27

314.87

V45

−8.32

−1481.53

V18

8.85

1013.62

V46

4.66

1239.62

V19

4.02

486.74

V47

12.00

2273.76

V20

3.41

419.93

V48

3.61

750.10

V21

1.47

240.10

V49

3.00

643.60

V22

1.03

183.08

V50

−31.71

−5667.89

V23

2.36

329.18

V51

−2.59

−871.30

V24

0.94

171.21

V52

−11.21

−2495.60 2583.02

V25

2.07

295.67

V53

10.40

V26

1.42

216.68

V54

7.09

1828.87

V27

1.92

299.29

V55

−4.81

−1001.46

V28

6.64

836.35

V56

3.72

964.94

The estimates of α and β from (3) and (4) for the 56 subsets constructed in Sect. 3 are given in Table 2. The model of Sect. 2 is not always appropriate to handle the data because 25% of the data sets have infeasible estimates (α, β < 0) while using the method of moments. This results from under-dispersion in the observed eruption numbers. It is notable that sets V1 –V35 , i.e., those including Colima, with its higher eruption onset rate, all have feasible parameters. Before suggesting that analog volcanoes should be selected to avoid the problem of under-dispersion, let us conduct a simple thought experiment: Suppose we have a set of analog volcanoes, which we wish to improve on. The ways in which an analog volcano set can be improved are: (a) to make the rate closer to that of the target volcano, or (b) to collect more data. It is easy to see that improvement in the former leads to less dispersion, while improvement in the latter reduces the sampling error,

512

Math Geosci (2011) 43:505–519

Fig. 2 A: Estimated α and β for a set V1 . B: Median and quartiles of the posterior distribution p(λ|n)

i.e., also reduces the dispersion. Hence improvement in the set of analog volcanoes will eventually lead to under-dispersion, and infeasible parameter estimates. A further point of interest is that the method is inherently updating as each new eruption, or each year without an eruption, alters the data set, and hence the resulting estimates. In the original example, the chosen observation windows finished at 1980. Between 1980 and 2010, there was one new eruption of VEI ≥ 3, of Colima in 1997 (Siebert and Simkin 2002). We will use this information to illustrate the dynamic nature of the empirical Bayes procedure. While the estimated α and β are sensitive (Fig. 2A), the variation implicit in the estimated rate (from only two events) for Irazu swamps the variation in the empirical Bayes prior (Fig. 2B). However, of the volcanoes from Table 2 with non-negative estimates of α and β, a new eruption produces an infeasible solution in 16% of cases. Hence, even if we begin with a feasible set of analog volcanoes it may evolve, through new eruptions, into an infeasible set.

5 An Empirical Bayes Analysis Allowing for Under-Dispersion We would like to modify the empirical Bayes procedure in order to avoid the infeasible estimates problem. Under-dispersion in Solow’s method arises from the fact that the Poisson distribution has equal mean and variance, and thus any prior mixing distribution will result in over-dispersion. Hence, we need to replace the Poisson distribution with one that allows for under-dispersion. Shmueli et al. (2005) discuss just such a distribution, due to Conway and Maxwell (1961).

Math Geosci (2011) 43:505–519

513

The Conway–Maxwell–Poisson (CMP) distribution is given (Shmueli et al. 2005) by Pr{N = n} =

ηn (n!)ν Z(η, ν)

,

(5)

where ∞ ηk , Z(η, ν) = (k!)ν

(6)

k=0

for ν ≥ 0. Values of ν < 1 produce over-dispersion, ν > 1 results in under-dispersion, and ν = 1 is the standard Poisson distribution. Unlike the Poisson distribution, there is no direct interpretation as a stochastic process, and so we will need to derive one. First, let us replace η in (5) and (6) with μt, to obtain

Pn (t) = Pr N (t) = n =

(μt)n , (n!)ν Z(μt, ν)

(7)

and Z(μt, ν) =

∞ (μt)k k=0

(k!)ν

.

Differentiating (μt)n μY (μt, ν) ∂ μn(μt)n−1 Pn (t) = − , ∂t (n!)ν Z(μt, ν) Z 2 (μt, ν)

(8)

where Y (μt, ν) =

∞ (μt)k (k + 1)1−ν , (k!)ν k=0

Kolmogorov’s forward equation is then ∂ Pn (t) = Pn−1 (t)λn−1 (t) − Pn (t)λn (t) ∂t

(9)

with λn (t) the instantaneous occurrence rate at time t, conditional on N (t) = n. Substituting (7) and (8) into (9) and simplifying, we obtain the relationship λn (t) =

Y (μt, ν) nν n λn−1 (t) − + μ , μt t Z(μt, ν)

(10)

which can be solved recursively, starting with λ0 (t) = μY (μt, ν)/Z(μt, ν). The conjugate prior distribution for the CMP distribution is πt (μ, ν) =

(μt)a−1 e−νb κt (a, b, c) Z c (μt, ν)

(11)

514

Math Geosci (2011) 43:505–519

(Kadane et al. 2006) where κt (a, b, c) = 0

∞ ∞ 0

(μt)a−1 e−νb dμ dν Z c (μt, ν)

−1 .

(12)

The double integral in (12) has to be evaluated numerically. The posterior density for a single observation n is then of the same form as (11), with parameters a = a +n, b = b +log(n!) and c = c +1. The predictive probability function for a sample n(m) = (n1 , n2 , . . . , nm ) is (Kadane et al. 2006) m Pr N (m) t (m) = n(m) a, b, c = i=1

κti (a, b, c) . κti (a + ni , b + log(ni !), c + 1)

(13)

The empirical Bayes analysis then proceeds by fitting the analog data to the predictive distribution (13), maximizing over the parameters a, b and c, with the result (aˆ = 1.00, bˆ = 4.72, cˆ = 1.27) for subset V1 shown in Fig. 3A. Using Irazu with n = 2 in t = 258 years, we obtain the posterior distribution with a = aˆ + 2, b = bˆ + log(2) and c = cˆ + 1 (Fig. 3B). The prior and posterior eruption onset rate distributions are estimated by taking 10,000 random samples of (μ, ν) from the respective distributions, calculating λ2 (258) for each sample from (10), and forming a kernel density estimate of the results, which can be compared (Fig. 4) with the results of Solow (2001). The new prior is more dispersed than the original, but the posterior estimates evolve in a similar manner. For comparison, the prior (aˆ = 3.38, bˆ = 3.03, cˆ = 1.44) and posterior distributions for the under-dispersed subset V38 , which differs from V1 only in replacing Colima by Orizaba, are less concentrated at small values of μ and ν (Fig. 3, C and D). The resulting prior and posterior eruption onset rate distributions for Izalco are considerably closer (Fig. 5) than those using analog set V1 .

6 Discussion One way of evaluating the effectiveness of an updating procedure such as empirical Bayes is to consider the information gain (cf. Bebbington 2005) in moving from a prior to a posterior distribution. This is equivalent to the Kullback–Leibler divergence

∞ p(μ|n) DKL = dμ. p(μ|n) log π(μ) 0 Calculating this using the Conway–Maxwell model for our 56 subsets (Fig. 6) shows that the greatest information gains come from over-dispersed analog sets, i.e., relatively uninformative priors. Thus the measurement is one of improvement, rather than value. Low values for the Kullback–Leibler divergence mean that the addition of the object volcano provides little additional information. Thus the behavior of the analog set is similar to that of the object volcano, which is a reasonable definition of a good analog set. In this example, over-dispersion results from the inclusion of Colima with its high onset rate. An under-dispersed set with onset rates dissimilar to that of the

Math Geosci (2011) 43:505–519

515

Fig. 3 Parameter estimates a, ˆ bˆ and cˆ in the conjugate distribution (11). A: Prior distribution for a set V1 . B: Posterior distribution, adding Irazu, for a set V1 . C: Prior distribution for a set V38 . D: Posterior distribution, adding Irazu, for a set V38 . Contours are at intervals of 20

object volcano would of course produce a high information gain. In other words, both precision (under-dispersion) and accuracy (unbiasedness) are desirable in an analog set. A question for further investigation is the most appropriate size for an analog set. The preceding has been a mathematical treatment of the question of how to use analog volcanoes to add strength to an estimate of eruption likelihood. As our modification of Solow’s idea removes the need to select an over-dispersed set of analog volcanoes, they can instead be chosen on volcanological, rather than mathematical, grounds. Selecting sets of analog volcanoes from geological/volcanological criteria is not a simple matter. Volcanoes are typically classified from broad, summary properties, such as: magma composition (three main categories), tectonic setting (around five broad types) and morphology (at least four main forms). These general characteristics cannot be regarded as independent variables, but they are also not directly related to one another in unique or universally predictable ways. In addition, within any one combination of broad characteristics, such as the selection of volcanoes described in this paper (andesitic magma composition, subducting convergent plate boundary, stratovolcano/composite volcano), there is also a huge variability in volcanospecific settings (such as crustal thickness, crustal lithology, mantle composition,

516

Math Geosci (2011) 43:505–519

Fig. 4 Prior and posterior estimates of density for the current onset rate of Irazu using the analog set V1

Fig. 5 Prior and posterior estimates of density for the current onset rate of Irazu using the analog set V38

distance/depth to subducted slab), compositions (viscosity, water content, mineralogy, plumbing/storage systems) and landforms (edifice height, lake/glacier cover).

Math Geosci (2011) 43:505–519

517

Fig. 6 KL-divergence values for the Conway–Maxwell model

A combination of all of these features collectively influence eruption rate, size and style. Controls on eruption onset rate in nominally similar volcanoes also appear strongly related to magma generation/storage/rise processes and their relationship to local earthquake activity and stress regimes. For example, the neighboring stratovolcanoes of Ruapehu and Ngauruhoe, located only 20 km apart and with very minor differences in magma composition, have very different recent eruption histories (Bebbington and Lai 1996b) that probably reflect their contrasting sub-volcanic plumbing systems and the different pathways and storage regions for magmas prior to eruption (Price et al. 2005, 2010). In addition, another nearby New Zealand stratovolcano, Mt. Taranaki, shows evidence for a completely different magma storage process that leads to a different pattern of strongly time-variable eruption frequency (Price et al. 2005; Turner et al. 2008). Further, some volcanoes show strongly time-varying eruption styles, which influence eruption size; for example, Soufriere Hills, Montserrat predominately erupts small-volume andesites as dome-building or small-volume vulcanian explosions, although for a period, it erupted basaltic andesite lava flows (Harford et al. 2003). Despite these difficulties, defining and testing a subset of geological properties may prove useful for assembling analog volcano sets over which comparison of eruption rates is most valid. 7 Conclusions There are so little data available for large eruptions at a single volcano that some forms of Bayesian model to gather additional information from analog volcanoes can

518

Math Geosci (2011) 43:505–519

be of great benefit to improving volcanic eruption forecasts globally. The original formulation of an empirical Bayes model for eruption onset rate (Solow 2001) based on mean onset rates generates a negative binomial distribution for the eruption counts. Thus the model inherits the infeasibility of that distribution when faced with underdispersed data. Better analog sets are less dispersed, and the inherent updating in the method can reduce dispersal, both of which can lead to infeasible estimates. We have formulated an alternative model in the spirit of the original, using a variant of the Poisson distribution that accommodates under-dispersed data. This clears the way for analog volcanoes to be chosen on geological grounds, rather than to satisfy the limitations of the mathematical model. Acknowledgements This work was supported by the Earthquake Commission of New Zealand and GNS Science. We thank Kate Arentsen and two anonymous reviewers for comments on an earlier version.

References Bebbington MS (2005) Information gains for stress release models. Pure Appl Geophys 162:2299–2319 Bebbington MS (2007) Identifying volcanic regimes using hidden Markov models. Geophys J Int 171:921– 942 Bebbington MS (2008) Incorporating the eruptive history in a stochastic model for volcanic eruptions. J Volcanol Geotherm Res 175:325–333 Bebbington MS (2010) Trends and clustering in the onsets of volcanic eruptions. J Geophys Res 115:B01203. doi:10.1029/2009JB006581 Bebbington MS, Lai CD (1996a) On nonhomogeneous models for volcanic eruptions. Math Geol 28:585– 600 Bebbington MS, Lai CD (1996b) Statistical analysis of New Zealand volcanic occurrence data. J Volcanol Geotherm Res 74:101–110 Conway RW, Maxwell WI (1961) A queueing model with state dependent service rate. J Ind Eng 12:132– 136 De la Cruz-Reyna S (1991) Poisson-distributed patterns of explosive eruptive activity. Bull Volcanol 54:57–67 Harford CL, Sparks RSJ, Fallick AE (2003) Degassing at the Soufriere Hills Volcano, Montserrat, recorded in matrix glass compositions. J Pet 44:1503–1523 Ho CH (1991) Nonhomogeneous Poisson model for volcanic eruptions. Math Geol 23:167–173 Kadane JB, Shmueli G, Minka TP, Borle S, Boatwright P (2006) Conjugate analysis of the Conway– Maxwell–Poisson distribution. Bayesian Anal 1:363–374 Laird NM, Louis TA (1987) Empirical Bayes confidence intervals based on bootstrap samples. J Am Stat Assoc 82:739–757 Marzocchi W, Zaccarelli L (2006) A quantitative model for the time-size distribution of eruptions. J Geophys Res 111:B04204. doi:10.1029/2005JB003709 Marzocchi W, Sandri L, Gasparini P, Newhall C, Boschi E (2004) Quantifying probabilities of volcanic events: the example of volcanic hazard at Mount Vesuvius. J Geophys Res 109:B11201. doi:10.1029/2004JB003155 Price RC, Gamble JA, Smith IEM, Stewart RB, Eggins S, Wright IC (2005) An integrated model for the temporal evolution of andesites and rhyolites and crustal development in New Zealand’s North Island. J Volcanol Geotherm Res 140:1–24 Price RC, Turner S, Cook C, Hobden B, Smith IEM, Gamble JA, Handley H, Maas R, Mobis A (2010) Crustal and mantle influences and U-Th-Ra disequilibrium in andesitic lavas of Ngauruhoe volcano, New Zealand. Chem Geol 277:355–373. doi:10.1016/j.chemgeo.2010.08.021 Siebert L, Simkin T (2002) Volcanoes of the world: an illustrated catalog of holocene volcanoes and their eruptions. Global Volcanism Program Digital Information Series, GVP-3. Smithsonian Institution, http://www.volcano.si.edu/world/. Accessed 15 March 2010 Simkin T, Seibert L, McClelland L, Bridge D, Newhall C, Latter JH (1981) Volcanoes of the world. Hutchisson Ross, Stroudsburg, 232

Math Geosci (2011) 43:505–519

519

Solow AR (2001) An empirical Bayes analysis of volcanic eruptions. Math Geol 33:95–102 Shmueli G, Minka TP, Kadane JB, Borle S, Boatwright P (2005) A useful distribution for fitting discrete data: revival of the Conway–Maxwell–Poisson distribution. Appl Stat 54:127–142 Turner M, Cronin S, Smith I, Bebbington M, Stewart RB (2008) Using titanomagnetite textures to elucidate volcanic eruption histories. Geology 36:31–34 Wickman FE (1966) Repose-period patterns of volcanoes. Ark Mineral Geol 4:291–367 Wilding GE, Mudholkar GS (2008) A gamma goodness-of-fit test based on characteristic independence of the mean and coefficient of variation. J Stat Plan Inference 138:3813–3821 Wilson LJ, Folks JL, Young JH (1986) Complete sufficiency and maximum likelihood estimation for the two-parameter negative binomial distribution. Metrika 33:349–362

On Selection of Analog Volcanoes

Recommend Documents