Stat Papers
https://doi.org/10.1007/s00362-018-1010-4

REGULAR ARTICLE

Uncertainty quantification: a minimum variance unbiased (joint) estimator of the non-normalized Sobol' indices

Matieyendou Lamboni1,2

Received: 2 March 2017 / Revised: 9 May 2018
© Springer-Verlag GmbH Germany, part of Springer Nature 2018
Abstract
Often, uncertainty quantification is followed by the computation of sensitivity indices of input factors. Variance-based sensitivity analysis and multivariate sensitivity analysis (MSA) aim to apportion the variability of the model output(s) into input factors and their interactions. Sobol' indices (first-order and total indices), which quantify the effects of input factor(s), serve as a practical tool to assess interactions among input factors, the order of interactions, and the magnitude of interactions. In this paper, we investigate a novel way of estimating both the first-order and total indices based on U-statistics, including the statistical properties of the new estimator. First, we provide a minimum variance unbiased estimator of the non-normalized Sobol' indices as well as its optimal rate of convergence and its asymptotic distribution. Second, we derive a joint estimator of Sobol' indices, its consistency and its asymptotic distribution, and third, we demonstrate the applicability of these results by means of numerical tests. The new estimator allows for improving the estimation of Sobol' indices for some degrees of the kernel.

Keywords First-order index and total indices · MVU estimators · Sensitivity and uncertainty analysis · U-statistics
1 Introduction

Comprehensive and numerical models are widely used as experimental tools for supporting decision making. They often include numerous uncertain factors as inputs.
✉ Matieyendou Lamboni
[email protected]; [email protected]

1 Department DFRST, University of Guyane, 97346 Cayenne, French Guiana, France
2 228-UMR Espace-Dev, University of Guyane, University of Réunion, University of Montpellier, IRD, Cayenne, French Guiana, France
These uncertainties can strongly affect the model output(s) and pose difficulties when building scenarios. Only a small number of input factors really act in the model (Saltelli et al. 2000), and this number refers to the notion of the effective dimension of a model (Caflisch et al. 1997; Kucherenko et al. 2011) or its alternative, the mean dimension (Owen 2013b), which is a useful way to deal with the curse of dimensionality in practice.

For complex systems or models, uncertainty quantification is often followed by sensitivity analysis. Variance-based sensitivity analysis methods (Sobol 1993; Saltelli et al. 2000; Ghanem et al. 2017) and multivariate sensitivity analysis (MSA) methods (Lamboni et al. 2008, 2009, 2011; Gamboa et al. 2014) aim at apportioning the variance of the model output and the inertia of the model outputs (respectively) into input factors and their interactions. These methods are useful ways to deal with the curse of dimensionality for single-response and multivariate-response models by assessing interactions among input factors, the order of interactions, and the magnitude of interactions. For complex systems or models that are very expensive in terms of model evaluations, a meta-modeling or emulator approach based on a Gaussian process, kriging, a high-dimensional model representation (HDMR), or polynomial chaos expansions (Storlie et al. 2009; Muehlenstaedt et al. 2012; Antoniadis and Pasanisi 2012; Marrel et al. 2012; Oakley and O'Hagan 2004; Conti and O'Hagan 2010; Jourdan 2012; Buzzard 2012; Ratto and Pagano 2010; Ratto et al. 2007; Sudret 2008) is an attractive way to compute Sobol' indices or the generalized sensitivity indices from MSA. Other sensitivity methods, including non-parametric methods, allow for identifying the most influential inputs by computing only one index. Among these methods, we distinguish the density-based approaches (Plischke et al. 2013; Bolado-Lavin et al. 2009; Borgonovo et al. 2014) and the derivative-based methods (Lamboni et al. 2013; Sobol and Kucherenko 2009; Kucherenko et al. 2009), which are computationally more attractive.

In variance-based sensitivity analysis and MSA, while the first-order index ranks input factors according to their single contribution to the model variability, the total index ranks them according to their overall contribution. The computation of Sobol' indices using sample-based methods has been widely investigated (Sobol 1993, 2001; Chan et al. 2000; Fang et al. 2003; Mara and Joseph 2008; Saltelli 2002; Saltelli et al. 1999, 2010; Sobol 2007; Owen 2013a, b; Plischke et al. 2013; Borgonovo et al. 2014; Ghanem et al. 2017; Kucherenko et al. 2015; Lamboni 2016a, b). Borgonovo et al. (2014) recently showed that the classical estimators of Sobol' indices (Saltelli et al. 2010) fail to converge in the presence of skewed or heavy-tailed distributions of input factors, important interactions among input factors, and/or high variances of the model output(s). The authors used a log transformation to improve the convergence of the estimates toward the (transformed) values. In the same spirit, Lamboni (2016a) provided theoretical guarantees of the consistency of a generalized estimator of the total index, including the classical estimator of Jansen (1999). This generalized estimator, based on U-statistics, improves the estimation of total indices.
In this paper, we propose a novel estimator of Sobol' indices together with some theoretical guarantees for this estimator. The new estimator relies on two-sample U-statistics and makes use of a kernel of degree $(p, q)$. First, we provide a minimum variance unbiased estimator (MVUE) of the non-normalized first-order index, as well as its consistency, its asymptotic distribution, and its optimal rate of convergence. Second, we provide a joint MVU estimator of the non-normalized Sobol' indices, including its consistency, its asymptotic distribution, and its optimal rate of convergence. Finally, based on these results, we derive an efficient (joint) estimator of Sobol' indices.

The paper is organized as follows: Sect. 2 recalls some useful definitions of Sobol' indices, while Sect. 3 gives the new estimators of the non-normalized Sobol' indices and of Sobol' indices, together with the main statistical properties of these estimators. In Sect. 4, we use a proxy measure for choosing the degree $(p, q)$ of the kernel, and in Sect. 5, we demonstrate the applicability of these results by means of numerical tests, performing simulations to compare estimators of Sobol' indices. Section 6 concludes this work.
Notation

Throughout the paper, $f(X)$ denotes a single-response model with $d$ random input factors $X = (X_1, \ldots, X_d)$. We use $u$ as a non-empty subset of $\{1, 2, \ldots, d\}$, and we use $|u|$ for its cardinality (i.e., the number of elements in $u$). Let $X_u = \{X_j, j \in u\}$ be a set of input factors, and let $X_{\sim u}$ denote the vector containing all input factors except $X_u$. We have the following partition: $X = (X_u, X_{\sim u})$. We use $\mu(X) = \mu(X_1, \ldots, X_d)$ as the joint probability measure or distribution of the input factors. We often assume that the input factors are independent (Assumption A1) and that the model output has a finite second moment, that is, $E[f^2(X)] < +\infty$ (Assumption A2). We use $E(\cdot)$ for the expectation, $V(\cdot)$ for the variance, $\mathrm{Tr}(\cdot)$ for the trace, $\mathrm{Cov}(\cdot)$ for the covariance, and $\binom{n}{p}$ for the number of combinations for selecting $p$ elements out of $n$. Finally, we use $\xrightarrow{D}$ and $\xrightarrow{P}$ for the convergence in distribution and in probability respectively.
2 Definition of Sobol’ indices
Let $X$, $X'$ be two independent vectors from the joint probability measure $\mu(X)$. Under Assumption A2, we define $D_u$ as follows:

$$D_u = \mathrm{Cov}\left( f(X_u, X_{\sim u}),\, f(X_u, X'_{\sim u}) \right). \quad (2.1)$$
If Assumption A1 holds, then Eq. (2.1) may be rewritten as follows:

$$D_u = V\left[ E\left( f(X) \mid X_u \right) \right]. \quad (2.2)$$
By Eq. (2.2), it follows that $D_u$ is the non-normalized first-order index of $X_u$ [well known in sensitivity analysis (SA)]. A similar definition is used in Owen (2013b)
and Lamboni (2016a) to obtain the first-order index for a given function or kernel. Thus, Eq. (2.1) generalizes the definition of the first-order index to cope with dependent input factors.
Let $D$ be the variance of the model output and $D_u^{tot} = \sum_{v \subseteq \{1,2,\ldots,d\},\, v \cap u \neq \emptyset} D_v$. The normalized first-order index of $X_u$ (Sobol 1993) is given by

$$S_u = \frac{D_u}{D}. \quad (2.3)$$

Further, the total index of $X_u$ is defined as follows:

$$S_{Tu} = \frac{D_u^{tot}}{D}. \quad (2.4)$$
Remark 1 The inequality $S_u \leq S_{Tu}$ is guaranteed only under Assumption A1, in which case the covariance $D_u \geq 0$. When input factors are dependent, Eq. (2.1) remains valuable for assessing the first-order effects. In the same sense, Sobol' indices have been generalized to cope with dependent inputs in Mara and Tarantola (2012), Kucherenko et al. (2012), and Mara et al. (2015). The authors distinguished four types of Sobol' indices (see Mara et al. 2015 for comprehensive details).
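To make Eq. (2.1) concrete, the covariance can be estimated by a plain "pick-freeze" Monte Carlo scheme: freeze a common sample of $X_u$ and pair it with two independent samples of $X_{\sim u}$. The sketch below is a minimal illustration of this definition, not the estimator developed in Sect. 3; the model `f`, the seed, and the sample size `n` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x1, x2):
    # illustrative model with two independent N(0,1) inputs (cf. Sect. 5.2.1)
    return np.exp(x1 + 2.0 * x2)

n = 10**6
x1 = rng.standard_normal(n)                  # common ("frozen") copy of X_u = X_1
x2 = rng.standard_normal(n)                  # first copy of X_~u
x2p = rng.standard_normal(n)                 # independent second copy of X_~u
y, yp = f(x1, x2), f(x1, x2p)
D_u = np.cov(y, yp)[0, 1]                    # sample version of Eq. (2.1)
D = np.var(np.concatenate([y, yp]), ddof=1)  # model-variance estimate
print(D_u / D)                               # crude estimate of S_1 (analytically ~0.012)
```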
3 Generalized estimator of the first-order and total indices

In this section, we provide the estimators of the non-normalized indices (i.e., $(D_u, D_u^{tot})^T$) and the Sobol' indices (i.e., $(S_u, S_{Tu})^T$), including the statistical qualities of these estimators. The estimators rely on U-statistics of two samples. The theory of U-statistics allows for easily deriving the main properties of an estimator. However, it requires, first, finding a kernel function whose expectation is exactly the parameter we want to estimate (e.g., the non-normalized indices), and second, proposing a MVU estimator of the non-normalized indices based on this kernel (see a useful summary of the theory of U-statistics in Appendix A). This section follows these two steps to derive an efficient estimator of the Sobol' indices based on the MVU estimator of the non-normalized indices.

3.1 A kernel function for the first-order index

For the estimation of the non-normalized first-order index, we define the following kernel.

Definition 1 Let $X_u^{(1)}, \ldots, X_u^{(p)}$ be $p$ i.i.d copies of $X_u$ and $X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}$ be $q$ i.i.d copies of $X_{\sim u}$, for integers $p \geq 2$, $q \geq 2$. We consider a function with two types of inputs: $X_u^{(1)}, \ldots, X_u^{(p)}$ and $X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}$. This function is called a kernel of degree $(p, q)$ in the theory of U-statistics, and it has $p + q$ arguments. We define the kernel $K(\cdot)$ as follows:
$$K\left(X_u^{(1)}, \ldots, X_u^{(p)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right) := \frac{2}{p^2(p-1)q(q-1)} \sum_{k=1}^{p} \sum_{l=1}^{q-1} \sum_{i=l+1}^{q} \left( \sum_{\substack{j_1=1 \\ j_1 \neq k}}^{p} \left[ f\left(X_u^{(k)}, X_{\sim u}^{(l)}\right) - f\left(X_u^{(j_1)}, X_{\sim u}^{(l)}\right) \right] \right) \left( \sum_{\substack{j_2=1 \\ j_2 \neq k}}^{p} \left[ f\left(X_u^{(k)}, X_{\sim u}^{(i)}\right) - f\left(X_u^{(j_2)}, X_{\sim u}^{(i)}\right) \right] \right). \quad (3.1)$$
The kernel $K\left(X_u^{(1)}, \ldots, X_u^{(p)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right)$ is symmetric under independent permutations of its first arguments $\left(X_u^{(1)}, \ldots, X_u^{(p)}\right)$ and of its second arguments $\left(X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right)$. Indeed, the kernel value does not change if we permute the positions of $X_u^{(i_1)}$ and $X_u^{(i_2)}$ on the one hand, and the positions of $X_{\sim u}^{(j_1)}$ and $X_{\sim u}^{(j_2)}$ on the other hand, with $i_1, i_2 \in \{1, 2, \ldots, p\}$ and $j_1, j_2 \in \{1, 2, \ldots, q\}$. As the inputs of the kernel are random variables, we can derive the expectation of the kernel given these inputs. Theorem 1 provides this result.

Theorem 1 If Assumptions A1 and A2 ($E[f^2(X)] < +\infty$) hold, then we have

$$E\left[ K\left(X_u^{(1)}, \ldots, X_u^{(p)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right) \right] = D_u. \quad (3.2)$$
Proof Without loss of generality, we assume that the model output is centered, that is, $E[f(X)] = 0$. Bearing in mind that $X_u^{(i)}$ and $X_{\sim u}^{(j)}$ are independent for $i = 1, \ldots, p$, $j = 1, \ldots, q$, and using the fact that $D_u = \mathrm{Cov}\left( f(X_u, X_{\sim u}^{(j_1)}),\, f(X_u, X_{\sim u}^{(j_2)}) \right)$ with $j_1 \neq j_2$ (see Eq. (2.1)), we can write

$$\frac{p^2(p-1)q(q-1)}{2}\, E\left[ K\left(X_u^{(1)}, \ldots, X_u^{(p)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right) \right] = \sum_{k=1}^{p} \sum_{l=1}^{q-1} \sum_{i>l} E\left[ \left( (p-1) f(X_u^{(k)}, X_{\sim u}^{(l)}) - \sum_{\substack{j_1=1 \\ j_1 \neq k}}^{p} f(X_u^{(j_1)}, X_{\sim u}^{(l)}) \right) \left( (p-1) f(X_u^{(k)}, X_{\sim u}^{(i)}) - \sum_{\substack{j_2=1 \\ j_2 \neq k}}^{p} f(X_u^{(j_2)}, X_{\sim u}^{(i)}) \right) \right]$$

$$= \sum_{k,\,l,\,i>l} E\left[ (p-1)^2 f(X_u^{(k)}, X_{\sim u}^{(l)}) f(X_u^{(k)}, X_{\sim u}^{(i)}) - (p-1) \sum_{\substack{j_2=1 \\ j_2 \neq k}}^{p} f(X_u^{(j_2)}, X_{\sim u}^{(i)}) f(X_u^{(k)}, X_{\sim u}^{(l)}) \right] + \sum_{k,\,l,\,i>l} E\left[ \sum_{\substack{j_1=1 \\ j_1 \neq k}}^{p} (1-p) f(X_u^{(j_1)}, X_{\sim u}^{(l)}) f(X_u^{(k)}, X_{\sim u}^{(i)}) + \sum_{\substack{j_1=1 \\ j_1 \neq k}}^{p} \sum_{\substack{j_2=1 \\ j_2 \neq k}}^{p} f(X_u^{(j_1)}, X_{\sim u}^{(l)}) f(X_u^{(j_2)}, X_{\sim u}^{(i)}) \right].$$

By using the independence between $f(X_u^{(j_2)}, X_{\sim u}^{(i)})$ and $f(X_u^{(k)}, X_{\sim u}^{(l)})$ for $k \neq j_2$ and $i \neq l$, we have

$$E\left[ f(X_u^{(j_2)}, X_{\sim u}^{(i)}) f(X_u^{(k)}, X_{\sim u}^{(l)}) \right] = E\left[ f(X_u^{(j_2)}, X_{\sim u}^{(i)}) \right] E\left[ f(X_u^{(k)}, X_{\sim u}^{(l)}) \right] = 0,$$

as the model output is centered. Likewise, we have $E\left[ f(X_u^{(j_1)}, X_{\sim u}^{(l)}) f(X_u^{(k)}, X_{\sim u}^{(i)}) \right] = 0$ if $j_1 \neq k$ and $l \neq i$. By using the linearity of the expectation, we have

$$\frac{p^2(p-1)q(q-1)}{2}\, E\left[ K(\cdot) \right] = \sum_{k,\,l,\,i>l} \left[ (p-1)^2\, E\left[ f(X_u^{(k)}, X_{\sim u}^{(l)}) f(X_u^{(k)}, X_{\sim u}^{(i)}) \right] + \sum_{\substack{j_1 \neq k}} \sum_{\substack{j_2 \neq k}} E\left[ f(X_u^{(j_1)}, X_{\sim u}^{(l)}) f(X_u^{(j_2)}, X_{\sim u}^{(i)}) \right] \right].$$

Further, as $\mathrm{Cov}\left( f(X_u^{(k)}, X_{\sim u}^{(l)}),\, f(X_u^{(k)}, X_{\sim u}^{(i)}) \right) = E\left[ f(X_u^{(k)}, X_{\sim u}^{(l)}) f(X_u^{(k)}, X_{\sim u}^{(i)}) \right] = D_u$, and since only the terms with $j_2 = j_1$ survive in the double sum, we have

$$\frac{p^2(p-1)q(q-1)}{2}\, E\left[ K(\cdot) \right] = \sum_{k,\,l,\,i>l} \left[ (p-1)^2 D_u + \sum_{\substack{j_1=1 \\ j_1 \neq k}}^{p} E\left[ f(X_u^{(j_1)}, X_{\sim u}^{(l)}) f(X_u^{(j_1)}, X_{\sim u}^{(i)}) \right] \right] = \sum_{k,\,l,\,i>l} \left[ (p-1)^2 D_u + (p-1) D_u \right] = \sum_{k,\,l,\,i>l} p(p-1) D_u = \frac{p^2(p-1)q(q-1)}{2}\, D_u. \qquad \square$$
Theorem 1 shows that the kernel $K(\cdot)$ is an unbiased estimator of the covariance $D_u$ (the non-normalized first-order index of $X_u$). The kernel $K(\cdot)$ is symmetric in its first and second arguments respectively, and it allows for using the two-sample theory of U-statistics to obtain the statistical properties of an estimator based on the kernel $K(\cdot)$ (see Appendix A; Lehmann 1951; Ferguson 1996; Lehmann 1999; Hoeffding 1948a, b for comprehensive details).

Remark 2 If $p = q = 2$, the expression of the kernel $K(\cdot)$ becomes

$$K\left(X_u^{(1)}, X_u^{(2)}, X_{\sim u}^{(1)}, X_{\sim u}^{(2)}\right) = \frac{1}{2} \left[ f(X_u^{(1)}, X_{\sim u}^{(1)}) - f(X_u^{(2)}, X_{\sim u}^{(1)}) \right] \times \left[ f(X_u^{(1)}, X_{\sim u}^{(2)}) - f(X_u^{(2)}, X_{\sim u}^{(2)}) \right]. \quad (3.3)$$
3.2 Generalized estimator of the first-order index

Using the unbiased kernel from Theorem 1 and the theory of U-statistics, the following corollaries give the minimum variance (best) unbiased estimator of the non-normalized first-order index, its variance, and its asymptotic distribution for a given degree $(p, q)$ of the kernel.
Consider $p$ i.i.d copies $\left(X_u^{(1)}, \ldots, X_u^{(p)}\right)$ of $X_u$ and $q$ i.i.d copies $\left(X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right)$ of $X_{\sim u}$, with $p, q \geq 2$. For $l \in \{0, 1, \ldots, p\}$ and $k \in \{0, 1, \ldots, q\}$, we define $\sigma_{l,k}^2$ as follows:

$$\sigma_{l,k}^2 = V\left( E\left[ K\left(X_u^{(1)}, \ldots, X_u^{(p)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right) \mid X_u^{(1)}, \ldots, X_u^{(l)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(k)} \right] \right).$$

Regarding SA, $\sigma_{l,k}^2$ is the (non-normalized) first-order index of $\left(X_u^{(1)}, \ldots, X_u^{(l)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(k)}\right)$. In particular, $\sigma_{p,q}^2$ is the variance of the kernel $K(\cdot)$, and $\sigma_{1,0}^2$ and $\sigma_{0,1}^2$ are the first-order indices of $X_u^{(1)}$ and $X_{\sim u}^{(1)}$ respectively. Now, we can state our main results. Corollary 1 provides a MVU estimator of the non-normalized first-order index.

Corollary 1 Let $\mathbf{X} = \left(X_u^{(1)}, \ldots, X_u^{(n_1)}\right)$ be $n_1$ i.i.d copies of $X_u$, $\mathbf{Y} = \left(X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(n_2)}\right)$ be $n_2$ i.i.d copies of $X_{\sim u}$, and let $\mathbf{X}_i = \left(X_{i,u}^{(1)}, \ldots, X_{i,u}^{(n_1)}\right)$ and $\mathbf{Y}_i = \left(X_{i,\sim u}^{(1)}, \ldots, X_{i,\sim u}^{(n_2)}\right)$, $i = 1, 2, \ldots, m$, be two independent samples of size $m$ from $\mathbf{X}$ and $\mathbf{Y}$ respectively. If Assumptions A1, A3 ($E[f^4(X)] < +\infty$), and A5 ($2 \leq p \leq n_1$, $2 \leq q \leq n_2$) hold, then we have:

(i) the minimum variance unbiased estimator of $D_u$ for a given $(p, q)$, $n_1$, $n_2$, and $m$ is given by

$$\hat{D}_u = \frac{2}{\binom{n_1}{p} \binom{n_2}{q} m\, p^2(p-1)q(q-1)} \sum_{i=1}^{m} \sum_{\Pi_p \in C_{p,n_1}} \sum_{\Pi_q \in C_{q,n_2}} \sum_{k \in \Pi_p} \sum_{\substack{l,\, i_1 \in \Pi_q \\ i_1 > l}} \left( \sum_{\substack{j_1 \in \Pi_p \\ j_1 \neq k}} \left[ f(X_{i,u}^{(k)}, X_{i,\sim u}^{(l)}) - f(X_{i,u}^{(j_1)}, X_{i,\sim u}^{(l)}) \right] \right) \left( \sum_{\substack{j_2 \in \Pi_p \\ j_2 \neq k}} \left[ f(X_{i,u}^{(k)}, X_{i,\sim u}^{(i_1)}) - f(X_{i,u}^{(j_2)}, X_{i,\sim u}^{(i_1)}) \right] \right), \quad (3.4)$$

where the sum over $\Pi_p \in C_{p,n_1}$ (resp. $\Pi_q \in C_{q,n_2}$) is taken over all combinations of $p$ (resp. $q$) elements chosen among $\{1, \ldots, n_1\}$ (resp. $\{1, \ldots, n_2\}$);

(ii) the variance of $\hat{D}_u$ is given by

$$V(\hat{D}_u) = \frac{1}{m} \sum_{i=1}^{p} \sum_{j=1}^{q} \frac{\binom{p}{i} \binom{n_1 - p}{p - i}}{\binom{n_1}{p}} \frac{\binom{q}{j} \binom{n_2 - q}{q - j}}{\binom{n_2}{q}} \sigma_{i,j}^2; \quad (3.5)$$

(iii) if $n_1 \to +\infty$, $n_2 \to +\infty$, $m \to +\infty$, and $n_1 m / N \to \alpha$, with $N = (n_1 + n_2)m$ and $\alpha \in\, ]0, 1[$, we have the asymptotic normality, that is,

$$\sqrt{N} \left( \hat{D}_u - D_u \right) \xrightarrow{D} \mathcal{N}\left( 0,\, \frac{p^2}{\alpha} \sigma_{1,0}^2 + \frac{q^2}{1 - \alpha} \sigma_{0,1}^2 \right). \quad (3.6)$$
Proof The proof is obtained by directly applying the theorems from the theory of U-statistics to the kernel $K(\cdot)$ in Eq. (3.1), which is of degree $p$ with respect to its first argument and $q$ with respect to its second argument. These theorems can be found in Lehmann (1999), Ferguson (1996), Hoeffding (1948a, b), and Sugiura (1965). To obtain the final result, we can see that the estimator $\hat{D}_u$ is an average (over $i = 1, 2, \ldots, m$) of the U-statistic corresponding to $K(\cdot)$. See Appendix A for a useful summary of U-statistics and their statistical properties. $\square$

Remark 3 The estimator $\hat{D}_u$ is optimal within the class of unbiased estimators of $D_u$ that are based on kernels of degree $(p, q)$ and make use of the information $(\mathbf{X}_i, \mathbf{Y}_i)$, $i = 1, 2, \ldots, m$.

Corollary 2 (The case where $n_1 = p$ and $n_2 = q$) Let $\mathbf{X} = \left(X_u^{(1)}, \ldots, X_u^{(p)}\right)$ be $p$ i.i.d copies of $X_u$, $\mathbf{Y} = \left(X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right)$ be $q$ i.i.d copies of $X_{\sim u}$, and let $\mathbf{X}_i = \left(X_{i,u}^{(1)}, \ldots, X_{i,u}^{(p)}\right)$ and $\mathbf{Y}_i = \left(X_{i,\sim u}^{(1)}, \ldots, X_{i,\sim u}^{(q)}\right)$, $i = 1, 2, \ldots, m$, be two independent samples of size $m$ from $\mathbf{X}$ and $\mathbf{Y}$ respectively. If Assumptions A1, A3 ($E[f^4(X)] < +\infty$), and A5 ($2 \leq p$, $2 \leq q$) hold, then we have:

(i) the minimum variance (best) unbiased estimator of $D_u$ for a given $(p, q)$ and $m$ is given by

$$\hat{D}_u = \frac{2}{m p^2(p-1)q(q-1)} \sum_{i=1}^{m} \sum_{k=1}^{p} \sum_{l=1}^{q-1} \sum_{i_1 > l} \left( \sum_{\substack{j_1=1 \\ j_1 \neq k}}^{p} \left[ f(X_{i,u}^{(k)}, X_{i,\sim u}^{(l)}) - f(X_{i,u}^{(j_1)}, X_{i,\sim u}^{(l)}) \right] \right) \left( \sum_{\substack{j_2=1 \\ j_2 \neq k}}^{p} \left[ f(X_{i,u}^{(k)}, X_{i,\sim u}^{(i_1)}) - f(X_{i,u}^{(j_2)}, X_{i,\sim u}^{(i_1)}) \right] \right); \quad (3.7)$$

(ii) the variance of $\hat{D}_u$ is given by

$$V(\hat{D}_u) = \frac{\sigma_{p,q}^2}{m}. \quad (3.8)$$

Moreover, we have

$$m\, E\left[ \left( \hat{D}_u - D_u \right)^2 \right] = \sigma_{p,q}^2; \quad (3.9)$$

(iii) if $m \to +\infty$, $\hat{D}_u$ follows asymptotically a normal distribution, and we have

$$\sqrt{m} \left( \hat{D}_u - D_u \right) \xrightarrow{D} \mathcal{N}\left( 0,\, \sigma_{p,q}^2 \right). \quad (3.10)$$
Proof We can see that the minimum value of the variance in (3.5) is obtained when $n_1 = p$ and $n_2 = q$. Indeed, when $p = n_1$, $\binom{0}{p-i}$ is zero except for $i = p$, and when $q = n_2$, $\binom{0}{q-j}$ is zero except for $j = q$. $\square$

Remark 4 (The case where $p = n_1 = 2, 3$ and $q = n_2 = 2$) For $p = n_1 = 2$, the expression of $\hat{D}_u$ becomes

$$\hat{D}_u = \frac{1}{2m} \sum_{i=1}^{m} \left[ f(X_{i,u}^{(1)}, X_{i,\sim u}^{(1)}) - f(X_{i,u}^{(2)}, X_{i,\sim u}^{(1)}) \right] \times \left[ f(X_{i,u}^{(1)}, X_{i,\sim u}^{(2)}) - f(X_{i,u}^{(2)}, X_{i,\sim u}^{(2)}) \right]. \quad (3.11)$$

For $p = n_1 = 3$, the expression of $\hat{D}_u$ becomes

$$\hat{D}_u = \frac{1}{18m} \sum_{i=1}^{m} \left[ 2 f(X_{i,u}^{(1)}, X_{i,\sim u}^{(1)}) - f(X_{i,u}^{(2)}, X_{i,\sim u}^{(1)}) - f(X_{i,u}^{(3)}, X_{i,\sim u}^{(1)}) \right] \times \left[ 2 f(X_{i,u}^{(1)}, X_{i,\sim u}^{(2)}) - f(X_{i,u}^{(2)}, X_{i,\sim u}^{(2)}) - f(X_{i,u}^{(3)}, X_{i,\sim u}^{(2)}) \right]$$
$$+ \frac{1}{18m} \sum_{i=1}^{m} \left[ 2 f(X_{i,u}^{(2)}, X_{i,\sim u}^{(1)}) - f(X_{i,u}^{(1)}, X_{i,\sim u}^{(1)}) - f(X_{i,u}^{(3)}, X_{i,\sim u}^{(1)}) \right] \times \left[ 2 f(X_{i,u}^{(2)}, X_{i,\sim u}^{(2)}) - f(X_{i,u}^{(1)}, X_{i,\sim u}^{(2)}) - f(X_{i,u}^{(3)}, X_{i,\sim u}^{(2)}) \right]$$
$$+ \frac{1}{18m} \sum_{i=1}^{m} \left[ 2 f(X_{i,u}^{(3)}, X_{i,\sim u}^{(1)}) - f(X_{i,u}^{(1)}, X_{i,\sim u}^{(1)}) - f(X_{i,u}^{(2)}, X_{i,\sim u}^{(1)}) \right] \times \left[ 2 f(X_{i,u}^{(3)}, X_{i,\sim u}^{(2)}) - f(X_{i,u}^{(1)}, X_{i,\sim u}^{(2)}) - f(X_{i,u}^{(2)}, X_{i,\sim u}^{(2)}) \right]. \quad (3.12)$$
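As an illustration of Corollary 2 and Remark 4, the estimator (3.7) can be computed directly once the model has been evaluated on all $p \times q$ combinations of the two small samples. The sketch below assumes such an array of model values `F` is available (the function name `D_u_hat` and the data layout are illustrative choices); the vectorized identity $\sum_{j \neq k} [F_{k,l} - F_{j,l}] = p F_{k,l} - \sum_{j} F_{j,l}$ removes the inner loops.

```python
import numpy as np

def D_u_hat(F):
    """Minimal sketch of the MVUE of Eq. (3.7).

    F has shape (m, p, q) with F[i, k, l] = f(X_{i,u}^(k), X_{i,~u}^(l)),
    i.e., the model evaluated on every (k, l) pair of the ith replicate."""
    m, p, q = F.shape
    S = F.sum(axis=1, keepdims=True)   # sum over the p copies of X_u, per (i, l)
    G = p * F - S                      # G[i,k,l] = sum_{j != k} (F[i,k,l] - F[i,j,l])
    acc = 0.0
    for l in range(q - 1):             # all pairs l < i1 of X_~u copies
        for i1 in range(l + 1, q):
            acc += np.sum(G[:, :, l] * G[:, :, i1])
    return 2.0 * acc / (m * p**2 * (p - 1) * q * (q - 1))
```

For $p = q = 2$, this reduces, after simplification, to Eq. (3.11).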
Corollary 2 gives an interesting estimator of the (non-normalized) first-order index of $X_u$. Based on these results, the theorem below provides the generalized estimator of the first-order index of $X_u$ ($S_u$).

Theorem 2 (The case where $n_1 = p$ and $n_2 = q$) If Assumptions A1, A3 ($E[f^4(X)] < +\infty$), and A5 ($2 \leq p$, $2 \leq q$) hold, then the estimator of the first-order index of $X_u$ for a given $(p, q)$ and $m$ is given by

$$\hat{S}_u = \frac{2}{\hat{D}\, m p^2(p-1)q(q-1)} \sum_{i=1}^{m} \sum_{k=1}^{p} \sum_{l=1}^{q-1} \sum_{i_1 > l} \left( \sum_{\substack{j_1=1 \\ j_1 \neq k}}^{p} \left[ f(X_{i,u}^{(k)}, X_{i,\sim u}^{(l)}) - f(X_{i,u}^{(j_1)}, X_{i,\sim u}^{(l)}) \right] \right) \left( \sum_{\substack{j_2=1 \\ j_2 \neq k}}^{p} \left[ f(X_{i,u}^{(k)}, X_{i,\sim u}^{(i_1)}) - f(X_{i,u}^{(j_2)}, X_{i,\sim u}^{(i_1)}) \right] \right), \quad (3.13)$$

with $\hat{D}$ the estimator of the model variance $D$.

(i) The estimator $\hat{S}_u$ is consistent, that is,

$$\hat{S}_u \xrightarrow{P} S_u. \quad (3.14)$$

(ii) If $m \to +\infty$, $N \to +\infty$, and $m/N \to 0$, with $N$ the number of model runs used for estimating the model variance $D$, then $\hat{S}_u$ follows asymptotically a normal distribution, that is,

$$\sqrt{m} \left( \hat{S}_u - S_u \right) \xrightarrow{D} \mathcal{N}\left( 0,\, \frac{\sigma_{p,q}^2}{D^2} \right). \quad (3.15)$$
Proof The definition of the estimator is obvious bearing in mind the definition of the first-order index in Eq. (2.3). Point (i) is obvious by combining Eq. (3.9), the fact that $\hat{D} \xrightarrow{P} D$, and Slutsky's theorem. For point (ii), first, bearing in mind Slutsky's theorem and using Eq. (3.10), we can write

$$\sqrt{m} \left( \frac{\hat{D}_u}{D} - \frac{D_u}{D} \right) \xrightarrow{D} \mathcal{N}\left( 0,\, \frac{\sigma_{p,q}^2}{D^2} \right).$$

Second, we can show that $\sqrt{m} \left( \frac{\hat{D}_u}{\hat{D}} - \frac{D_u}{D} \right)$ and $\sqrt{m} \left( \frac{\hat{D}_u}{D} - \frac{D_u}{D} \right)$ are asymptotically equivalent and therefore have the same asymptotic distribution. Indeed,

$$\sqrt{m} \left( \frac{\hat{D}_u}{\hat{D}} - \frac{D_u}{D} \right) - \sqrt{m} \left( \frac{\hat{D}_u}{D} - \frac{D_u}{D} \right) = \sqrt{m}\, \hat{D}_u \left( \frac{1}{\hat{D}} - \frac{1}{D} \right) \xrightarrow{P} 0,$$

because, for high values of $m$ and $N$, the probability

$$P\left( \sqrt{m} \left| \frac{1}{\hat{D}} - \frac{1}{D} \right| \geq \epsilon \right) \leq \frac{m}{\epsilon^2}\, E\left[ \left( \frac{1}{\hat{D}} - \frac{1}{D} \right)^2 \right] \leq \frac{m\, C}{\epsilon^2 N} \xrightarrow{m, N \to +\infty} 0,$$

as $\sqrt{N} \left( \frac{1}{\hat{D}} - \frac{1}{D} \right) \xrightarrow{D} \mathcal{N}(0, C)$, which is obtained by applying the delta method to $\sqrt{N} \left( \hat{D} - D \right) \xrightarrow{D} \mathcal{N}(0, C_0)$. $\square$
In SA and MSA, we are interested in computing both the first-order and total indices. While Theorem 2 provides an efficient estimator of the first-order index, it is worthwhile to propose a joint estimator of both indices using the same model evaluations.

3.3 Joint estimator of the first-order and total indices

This section aims at providing a joint estimator of Sobol' indices. The model evaluations performed to estimate the first-order index in (3.13) are sufficient to estimate the total sensitivity index (TSI) using the generalized and efficient estimator of the TSI proposed in Lamboni (2016a). This estimator is given, for a given $l \in \{1, \ldots, q\}$, by

$$\hat{S}_{Tu} = \frac{1}{\hat{D}\, m p^2(p-1)} \sum_{i=1}^{m} \sum_{k=1}^{p} \left( \sum_{\substack{j_1=1 \\ j_1 \neq k}}^{p} \left[ f(X_{i,u}^{(k)}, X_{i,\sim u}^{(l)}) - f(X_{i,u}^{(j_1)}, X_{i,\sim u}^{(l)}) \right] \right)^2, \quad (3.16)$$

with $\hat{D}$ the estimator of the model variance $D$.

The following theorems provide the joint estimators of the non-normalized indices and the Sobol' indices. To accomplish this, we use $K_T(\cdot)$ for the kernel that allows for estimating the TSI. It is given by

$$K_T\left(X_u^{(1)}, \ldots, X_u^{(p)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right) = \frac{1}{q p^2(p-1)} \sum_{l=1}^{q} \sum_{k=1}^{p} \left( \sum_{\substack{j_1=1 \\ j_1 \neq k}}^{p} \left[ f(X_u^{(k)}, X_{\sim u}^{(l)}) - f(X_u^{(j_1)}, X_{\sim u}^{(l)}) \right] \right)^2. \quad (3.17)$$

The kernel $K_T(\cdot)$ is an average, over $X_{\sim u}^{(l)}$, $l = 1, \ldots, q$, of the kernel proposed in Lamboni (2016a), so as to make use of all the model runs performed to estimate the first-order index. We can see that $K_T\left(X_u^{(1)}, \ldots, X_u^{(p)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right)$ is symmetric with respect to its first and second arguments. It is an unbiased estimator of the non-normalized TSI of $X_u$ (comprehensive details are in Appendix B and in Lamboni 2016a).
We use $\eta_{ST} = \mathrm{Cov}\left[ K(\cdot), K_T(\cdot) \right]$ for the covariance between the kernels $K(\cdot)$ in (3.1) and $K_T(\cdot)$ in (3.17). Based on the kernel $K_T(\cdot)$, we propose the following U-statistic:

$$\hat{D}_u^{tot} = \frac{1}{m q p^2(p-1)} \sum_{i=1}^{m} \sum_{l=1}^{q} \sum_{k=1}^{p} \left( \sum_{\substack{j_1=1 \\ j_1 \neq k}}^{p} \left[ f(X_{i,u}^{(k)}, X_{i,\sim u}^{(l)}) - f(X_{i,u}^{(j_1)}, X_{i,\sim u}^{(l)}) \right] \right)^2. \quad (3.18)$$

Bearing in mind the U-statistics in (3.7) and (3.18), we are going to derive the joint estimator of Sobol' indices. We start with the non-normalized indices.

Corollary 3 If Assumptions A1, A3 ($E[f^4(X)] < +\infty$), and A5 ($2 \leq p$, $2 \leq q$) hold, then we have:

(i) the minimum variance (best) unbiased estimator of the non-normalized indices $\left( D_u, D_u^{tot} \right)^T$ for a given $(p, q)$ and $m$ is given by

$$\left( \hat{D}_u, \hat{D}_u^{tot} \right)^T; \quad (3.19)$$

(ii) the estimator $\left( \hat{D}_u, \hat{D}_u^{tot} \right)^T$ is consistent, that is,

$$\left( \hat{D}_u, \hat{D}_u^{tot} \right)^T \xrightarrow{P} \left( D_u, D_u^{tot} \right)^T; \quad (3.20)$$

(iii) if $m \to +\infty$, we have

$$\sqrt{m} \left( \left( \hat{D}_u, \hat{D}_u^{tot} \right)^T - \left( D_u, D_u^{tot} \right)^T \right) \xrightarrow{D} \mathcal{N}\left( 0, \Sigma \right), \quad (3.21)$$

with $\Sigma$ the covariance matrix of the two kernels ($\Sigma_{11} = V(K(\cdot))$, $\Sigma_{22} = V(K_T(\cdot))$, and $\Sigma_{12} = \eta_{ST}$).

Proof The proofs of points (i) and (ii) require two steps, in the same spirit as what we did for $\hat{D}_u$. First, we define the multivariate kernel of degree $(p, q)$ as follows:

$$\tilde{K}\left(X_u^{(1)}, \ldots, X_u^{(p)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right) = \begin{bmatrix} K\left(X_u^{(1)}, \ldots, X_u^{(p)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right) \\ K_T\left(X_u^{(1)}, \ldots, X_u^{(p)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right) \end{bmatrix},$$

which is symmetric with respect to its first and second arguments. Then, we write a first corollary similar to Corollary 1 and a second one similar to Corollary 2 to obtain the results, bearing in mind the theory of U-statistics, Corollary 2 from this paper, and Corollary 2 from Lamboni (2016a) (see Appendix B). For point (iii), the asymptotic normality of $\left( \hat{D}_u, \hat{D}_u^{tot} \right)^T$ comes from Theorem 2.1 in Sugiura (1965), which deals with the asymptotic normality of multivariate and multi-sample U-statistics. $\square$
From Corollary 3, we can derive the statistical properties of the joint estimator of the Sobol' indices $(S_u, S_{Tu})$. Corollary 4 gives these results.

Corollary 4 If Assumptions A1, A3 ($E[f^4(X)] < +\infty$), and A5 ($2 \leq p$, $2 \leq q$) hold, then the estimator of the indices $\left( S_u, S_{Tu} \right)^T$ for a given $(p, q)$ and $m$ is given by

$$\left( \hat{S}_u, \hat{S}_{Tu} \right)^T = \frac{1}{\hat{D}} \left( \hat{D}_u, \hat{D}_u^{tot} \right)^T, \quad (3.22)$$

with $\hat{D}$ the estimator of the model variance $D$.

(i) The estimator $\left( \hat{S}_u, \hat{S}_{Tu} \right)^T$ is consistent, that is,

$$\left( \hat{S}_u, \hat{S}_{Tu} \right)^T \xrightarrow{P} \left( S_u, S_{Tu} \right)^T; \quad (3.23)$$

(ii) if $m \to +\infty$, $N \to +\infty$, and $m/N \to 0$, with $N$ the number of model runs used for estimating the model variance $D$, we have

$$\sqrt{m} \left( \left( \hat{S}_u, \hat{S}_{Tu} \right)^T - \left( S_u, S_{Tu} \right)^T \right) \xrightarrow{D} \mathcal{N}\left( 0,\, \frac{\Sigma}{D^2} \right), \quad (3.24)$$

with $\Sigma$ the covariance matrix of the two kernels ($\Sigma_{11} = V(K(\cdot))$, $\Sigma_{22} = V(K_T(\cdot))$, and $\Sigma_{12} = \eta_{ST}$).

Proof The proof is similar to the proof of Theorem 2, bearing in mind Corollary 3 and Slutsky's theorem. $\square$

Remark 5 For $p = q = 2$, the expression of the joint estimator of the Sobol' indices is given by

$$\begin{pmatrix} \hat{S}_u \\ \hat{S}_{Tu} \end{pmatrix} = \begin{pmatrix} \dfrac{1}{2m\hat{D}} \displaystyle\sum_{i=1}^{m} \left[ f(X_{i,u}^{(1)}, X_{i,\sim u}^{(1)}) - f(X_{i,u}^{(2)}, X_{i,\sim u}^{(1)}) \right] \times \left[ f(X_{i,u}^{(1)}, X_{i,\sim u}^{(2)}) - f(X_{i,u}^{(2)}, X_{i,\sim u}^{(2)}) \right] \\[2ex] \dfrac{1}{4m\hat{D}} \displaystyle\sum_{i=1}^{m} \left( \left[ f(X_{i,u}^{(1)}, X_{i,\sim u}^{(1)}) - f(X_{i,u}^{(2)}, X_{i,\sim u}^{(1)}) \right]^2 + \left[ f(X_{i,u}^{(1)}, X_{i,\sim u}^{(2)}) - f(X_{i,u}^{(2)}, X_{i,\sim u}^{(2)}) \right]^2 \right) \end{pmatrix}. \quad (3.25)$$
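As a hedged end-to-end illustration of Remark 5, the sketch below estimates $(S_u, S_{Tu})$ for each single input of the Ishigami function of Sect. 5.2.2 using Eq. (3.25). The function name `joint_indices_p2q2`, the seed, and the sample size are arbitrary choices, and the plain pseudo-random sampling here (rather than the Sobol' design of Sect. 5.3) is only meant to show the data flow.

```python
import numpy as np

rng = np.random.default_rng(1)

def ishigami(x):
    # Ishigami function of Sect. 5.2.2; x has shape (m, 3)
    return np.sin(x[:, 0]) + 7.0*np.sin(x[:, 1])**2 + 0.1*x[:, 2]**4*np.sin(x[:, 0])

def joint_indices_p2q2(f, sampler, m, j):
    """Joint estimator (S_u, S_Tu) of Eq. (3.25) for u = {j} and p = q = 2."""
    A, B = sampler(m), sampler(m)        # X^(1) and X^(2): two i.i.d. input samples
    AB = A.copy(); AB[:, j] = B[:, j]    # X_u from B, X_~u from A
    BA = B.copy(); BA[:, j] = A[:, j]    # X_u from A, X_~u from B
    yA, yB, yAB, yBA = f(A), f(B), f(AB), f(BA)
    D = np.var(np.concatenate([yA, yB]), ddof=1)   # model-variance estimate D_hat
    d1 = yA - yAB    # f(X_u^(1), X_~u^(1)) - f(X_u^(2), X_~u^(1))
    d2 = yBA - yB    # f(X_u^(1), X_~u^(2)) - f(X_u^(2), X_~u^(2))
    return np.mean(d1*d2)/(2.0*D), np.mean(d1**2 + d2**2)/(4.0*D)

sampler = lambda m: rng.uniform(-np.pi, np.pi, size=(m, 3))
for j in range(3):   # expected roughly (0.31, 0.57), (0.44, 0.44), (0.00, 0.24)
    print(j + 1, joint_indices_p2q2(ishigami, sampler, 10**5, j))
```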
4 Proxy measure for the choice of the degree of a kernel

This section aims to choose the kernel that minimizes the variance of the estimator of the first-order index. As the variance of the kernel involves fourth moments, which are often unknown and hard to estimate (Owen 2013b), we use a proxy measure for choosing a convenient degree of the kernel. A proxy measure is part of an upper bound of the variance of an estimator that relies only on known coefficients (see Owen 2013b for comprehensive details). Thus, it is a simple formula that approximates the variance of the estimator of the first-order index.
4.1 Proxy measure for the variance of the estimator of the first-order index

A classical way of avoiding the estimation of fourth moments when estimating a variance is based on the minimum-norm quadratic estimation (MINQE), or its MINQUE and MINQIE versions using unbiasedness or invariance as constraints (see Rao and Kleffe 1988). Owen (2013b) generalized the principle of MINQE to deal with the variances of the estimators of sensitivity indices, and Lamboni (2016b) used the MINQE approach to identify a kernel for estimating the total index. First, the idea of MINQE consists in expressing the estimator of a sensitivity index as $\mathrm{Tr}\left( \Omega^T \hat{\Theta} \right)$ for a given matrix $\Omega$. The components of the matrix $\hat{\Theta}$ are products of functions $f(X_u, X_{\sim u})$ and $f(X_v, X_{\sim v})$, with $u, v \subseteq \{1, 2, \ldots, d\}$ (see Owen 2013b; Lamboni 2016a for a comprehensive treatment). Second, a proxy measure for the variance of the estimator is given by

$$V(\Omega) = \left( \sum_{u \subseteq \{1,2,\ldots,d\}} \sum_{v \subseteq \{1,2,\ldots,d\}} |\Omega_{u,v}| \right)^2. \quad (4.1)$$
Fig. 1 First-order indices: log-RMSEs of the model in (5.1) against the total number of model evaluations (in log10 ) for six values of the degree ( p = 3, q = 3), ( p = 4, q = 3), ( p = 4, q = 4), ( p = 5, q = 3), ( p = 5, q = 4), ( p = 5, q = 5). For each degree, we show the corresponding RMSE (solid line), the RMSE for degree ( p, q = 2) (dashed line), the RMSE for degree (2, 2) (dotted line), and the RMSE from Saltelli et al. (2010) (dash-dotted line)
4.2 Choice of the convenient degree of the kernel

For integers $j_1$, $j_2$, $k$, $l$, and $i$, let $\hat{\Theta}_{j_1,l}^{k}$ be a matrix of size $p \times (q - l)$, with

$$\left( \hat{\Theta}_{j_1,l}^{k} \right)_{j_2,i} = f(X_u^{(j_1)}, X_{\sim u}^{(l)})\, f(X_u^{(j_2)}, X_{\sim u}^{(i)}), \qquad j_2 = 1, \ldots, p, \quad i = l+1, \ldots, q.$$

We set $r_{\bullet j_1}^{(k)} = r_{j_1}^{(k)} \left( r_1^{(k)}, \ldots, r_p^{(k)} \right)^T$ as a vector of size $p$, with $r_i^{(k)} = -1$ if $i \neq k$ and $r_k^{(k)} = p - 1$ otherwise; $\hat{\Theta}^{k} = \mathrm{diag}\left( \hat{\Theta}_{1,1}^{k}, \ldots, \hat{\Theta}_{p,q-1}^{k} \right)$ as a block diagonal matrix of size $p^2 \times q(q-1)/2$; and $r^{(k)} = \left( r_{\bullet 1}^{(k)T}, \ldots, r_{\bullet p}^{(k)T} \right)^T$ as a vector of size $p^2$.

Finally, we use $\hat{\Theta} = \mathrm{diag}\left( \hat{\Theta}^{1}, \ldots, \hat{\Theta}^{p} \right)$ as a block diagonal matrix of size $p^3 \times pq(q-1)/2$, $r = \left( r^{(1)T}, \ldots, r^{(p)T} \right)^T$ as a vector of size $p^3$, and $s = [1, \ldots, 1]^T$ as a vector of size $qp(q-1)/2$.
Fig. 2 First-order indices: log-RMSEs of Ishigami’s function against the total number of model evaluations (in log10 ) for six values of the degree ( p = 3, q = 3), ( p = 4, q = 3), ( p = 4, q = 4), ( p = 5, q = 3), ( p = 5, q = 4), ( p = 5, q = 5). For each degree, we show the corresponding RMSE (solid line), the RMSE for degree ( p, q = 2) (dashed line), the RMSE for degree (2, 2) (dotted line), and the RMSE from Saltelli et al. (2010) (dash-dotted line)
The kernel $K(\cdot)$ in (3.1) can be written as follows:

$$K\left(X_u^{(1)}, \ldots, X_u^{(p)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right) = \frac{2}{p^2(p-1)q(q-1)} \sum_{k=1}^{p} \sum_{l=1}^{q-1} \sum_{i=l+1}^{q} \sum_{j_1=1}^{p} \sum_{j_2=1}^{p} r_{j_1}^{(k)} r_{j_2}^{(k)}\, f(X_u^{(j_1)}, X_{\sim u}^{(l)})\, f(X_u^{(j_2)}, X_{\sim u}^{(i)}), \quad (4.2)$$

or as a bilinear form, that is,

$$K\left(X_u^{(1)}, \ldots, X_u^{(p)}, X_{\sim u}^{(1)}, \ldots, X_{\sim u}^{(q)}\right) = \frac{2}{p^2(p-1)q(q-1)}\, r^T \hat{\Theta} s = \frac{2}{p^2(p-1)q(q-1)}\, \mathrm{Tr}\left( s r^T \hat{\Theta} \right). \quad (4.3)$$

Using the definition of the proxy measure in (4.1), the proxy measure for the variance of the kernel $K(\cdot)$ is given by

$$V(\Omega) = V\left( \frac{2}{p^2(p-1)q(q-1)}\, s r^T \right)$$
Fig. 3 First-order indices: log-RMSEs of Sobol’s function of type A against the total number of model evaluations (in log10 ) for six values of the degree ( p = 3, q = 3), ( p = 4, q = 3), ( p = 4, q = 4), ( p = 5, q = 3), ( p = 5, q = 4), ( p = 5, q = 5). For each degree, we show the corresponding RMSE (solid line), the RMSE for degree ( p, q = 2) (dashed line), the RMSE for degree (2, 2) (dotted line), and the RMSE from Saltelli et al. (2010) (dash-dotted line)
$$= \frac{4}{p^4(p-1)^2 q^2(q-1)^2}\, V\left( s r^T \right) = \frac{4}{p^4(p-1)^2 q^2(q-1)^2} \left( \frac{qp(q-1)}{2} \sum_{i=1}^{p^3} |r_i| \right)^2 = \frac{1}{p^2(p-1)^2} \left( \sum_{i=1}^{p^3} |r_i| \right)^2, \quad (4.4)$$

where $\Omega$ is the matrix of known coefficients in (4.3) and $r_i$ is the $i$th coordinate of the vector $r$.

It comes out that the proxy measure does not depend on the value of $q$. Thus, for a given value of $p$, any value of $q$ will yield the same value of the proxy measure of the variance. This result is in line with the expression of the estimator: indeed, the estimator can be seen as an average of the kernel over $X_{\sim u}^{(l)}$, $l = 1, 2, \ldots, q$. The value $q = 2$ should be used as the reference value, and we should expect good results with $q \leq p$.

We can see that the value of $V(\Omega)$ increases with $p$ ($V(\Omega) = 16$ if $p = 2$ and $V(\Omega) = 64$ if $p = 3$), and we should therefore use small values of $p$.
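The closed form (4.4) is easy to evaluate; the short sketch below (the function name `proxy_variance` is an illustrative choice) recomputes $V(\Omega)$ from the coefficient vector $r$ and reproduces the values quoted above (16 for $p = 2$, 64 for $p = 3$).

```python
import numpy as np

def proxy_variance(p):
    """Proxy measure of Eq. (4.4); it does not depend on q."""
    total = 0.0
    for k in range(p):
        rk = -np.ones(p)
        rk[k] = p - 1                              # r_i^(k) = p-1 if i = k, -1 otherwise
        total += np.abs(np.outer(rk, rk)).sum()    # block of |r_{j1}^(k) r_{j2}^(k)| terms
    return total**2 / (p**2 * (p - 1)**2)

print([proxy_variance(p) for p in (2, 3, 4, 5)])   # [16.0, 64.0, 144.0, 256.0]
```

In closed form, the sum of absolute coefficients per block equals $\left(2(p-1)\right)^2$, so $V(\Omega) = 16(p-1)^2$, which makes the growth with $p$ explicit.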
Fig. 4 First-order indices: log-RMSEs of Sobol’s function of type B against the total number of model evaluations (in log10 ) for six values of the degree ( p = 3, q = 3), ( p = 4, q = 3), ( p = 4, q = 4), ( p = 5, q = 3), ( p = 5, q = 4), ( p = 5, q = 5). For each degree, we show the corresponding RMSE (solid line), the RMSE for degree ( p, q = 2) (dashed line), the RMSE for degree (2, 2) (dotted line), and the RMSE from Saltelli et al. (2010) (dash-dotted line)
In the following text, we use 2 as the reference value of $p$. A similar result ($p = 2$) was established for the estimator of the total index (Lamboni 2016a).
5 Numerical tests

In this section, we perform some numerical tests to compare the effectiveness of our estimations depending on the degree $(p, q)$ of the symmetric kernels. We present the following: (i) the algorithm for computing the first-order and total indices, (ii) the functions used to illustrate our approach, (iii) the main choices made to obtain the results, and (iv) the numerical results.
5.1 Design scheme and computational cost of the d indices

Algorithm 5 For a given degree $(p, q)$ and a sample size $m$, the following steps are used to compute the $d$ first-order indices ($S_j$, $j = 1, \ldots, d$).

(i) Sample $p$ input matrices of size $m \times d$ ($\mathbf{X}_1, \ldots, \mathbf{X}_p$).
(ii) For any input factor $X_j$, replace the $j$th column of $\mathbf{X}_1$ with the $j$th column of $\mathbf{X}_2, \ldots, \mathbf{X}_p$ to obtain $p - 1$ new matrices ($\mathbf{X}_{2j}, \ldots, \mathbf{X}_{pj}$).
Fig. 5 First-order indices: log-RMSEs of Sobol’s function of type C against the total number of model evaluations (in log10 ) for six values of the degree ( p = 3, q = 3), ( p = 4, q = 3), ( p = 4, q = 4), ( p = 5, q = 3), ( p = 5, q = 4), ( p = 5, q = 5). For each degree, we show the corresponding RMSE (solid line), the RMSE for degree ( p, q = 2) (dashed line), the RMSE for degree (2, 2) (dotted line), and the RMSE from Saltelli et al. (2010) (dash-dotted line)
(iii) Run the model for $\mathbf{X}_1$ and for the $p - 1$ input matrices $\mathbf{X}_{2j}, \ldots, \mathbf{X}_{pj}$ to obtain $p$ output vectors of size $m$: $\mathbf{y}_1, \mathbf{y}_{2j}, \ldots, \mathbf{y}_{pj}$.
(iv) Repeat steps (ii) and (iii) $q - 1$ times, replacing $\mathbf{X}_1$ with $\mathbf{X}_2, \mathbf{X}_3, \ldots, \mathbf{X}_q$.
The model runs performed in Algorithm 5 are used for computing the $d$ first-order indices using Eqs. (3.13) and (3.22). The same model evaluations are sufficient to estimate the $d$ total sensitivity indices (TSIs) using Eqs. (3.18) and (3.22). The number of model evaluations in step (iii) of Algorithm 5 is $m + m(p-1)d = m[(p-1)d + 1]$, as $\mathbf{y}_1$ is used for the computation of each index. Thus, the computational cost, or the total number of model runs for the computation of the $d$ indices, is $mq[(p-1)d + 1]$, as steps (ii)–(iii) are performed $q$ times in total.
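A compact sketch of Algorithm 5 is given below, under one possible reading of step (iv) (for the $l$th base matrix, the $p - 1$ donor matrices are the first $p$ sampled matrices other than $\mathbf{X}_l$); the function names `algorithm5_runs` and `sampler` are illustrative assumptions, `f` maps an $(m, d)$ array to $m$ outputs, and `sampler(m)` is assumed to return an $(m, d)$ matrix of independent inputs.

```python
import numpy as np

def algorithm5_runs(f, sampler, m, d, p, q):
    """Sketch of Algorithm 5: returns the model evaluations of the (p, q) design.

    Y[(l, 'base')] is y_l; Y[(l, k, j)] is the run with column j of the base
    matrix X_l replaced by column j of the donor matrix X_k."""
    X = [sampler(m) for _ in range(max(p, q))]       # step (i)
    Y = {}
    for l in range(q):                               # step (iv): base matrix X_l
        Y[(l, "base")] = f(X[l])                     # step (iii): unmodified run
        donors = [k for k in range(max(p, q)) if k != l][: p - 1]
        for j in range(d):                           # step (ii): substitute column j
            for k in donors:
                Xmod = X[l].copy()
                Xmod[:, j] = X[k][:, j]              # X_j from X_k, X_~j from X_l
                Y[(l, k, j)] = f(Xmod)
    return Y  # q*m*[(p-1)*d + 1] model evaluations in total
```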
Remark 6 We can use the same algorithm to compute the sensitivity indices of a subset of input factors such as $(S_u, S_{Tu})$, with $|u| > 1$. We should modify step (ii) of Algorithm 5 as follows: replace the columns of $\mathbf{X}_1$ whose indices are in $u$ with the same columns of $\mathbf{X}_2, \ldots, \mathbf{X}_p$ to obtain $p - 1$ new matrices ($\mathbf{X}_{2j}, \ldots, \mathbf{X}_{pj}$).
Fig. 6 First-order indices: log-RMSEs of the model in (5.4) against the total number of model evaluations (in log10 ) for six values of the degree ( p = 3, q = 3), ( p = 4, q = 3), ( p = 4, q = 4), ( p = 5, q = 3), ( p = 5, q = 4), ( p = 5, q = 5). For each degree, we show the corresponding RMSE (solid line), the RMSE for degree ( p, q = 2) (dashed line), the RMSE for degree (2, 2) (dotted line), and the RMSE from Saltelli et al. (2010) (dash-dotted line)
5.2 Test functions

To illustrate our approach, we consider three types of functions: functions with a small number of inputs ($d = 2$, $d = 3$), functions with a medium number of inputs ($d = 10$), and functions with a high number of inputs ($d = 21$).

5.2.1 The product-exponential function (d = 2)

The product-exponential function (Borgonovo et al. 2014) includes two independent inputs following a normal distribution $\mathcal{N}(0, 1)$. It is defined as follows:

$$f(x) = \exp(x_1 + 2x_2). \quad (5.1)$$

For this function, the sensitivity indices are $S_1 = 0.012$, $S_2 = 0.364$, $S_{T1} = 0.637$, and $S_{T2} = 0.989$. This function belongs to the class of functions with important interactions among input factors.
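These values can be verified analytically. Since $E[e^{tZ}] = e^{t^2/2}$ for $Z \sim \mathcal{N}(0,1)$, a direct computation gives the quoted indices (a sketch; the last two identities use $d = 2$, so that $S_{T1} = 1 - S_2$ and $S_{T2} = 1 - S_1$):

```latex
\begin{align*}
D   &= E[f^2] - E[f]^2 = e^{10} - e^{5},\\
D_1 &= V\big(E[f \mid X_1]\big) = e^{4}\, V\big(e^{X_1}\big) = e^{5}(e - 1),\\
D_2 &= V\big(E[f \mid X_2]\big) = e\, V\big(e^{2X_2}\big) = e^{5}\big(e^{4} - 1\big),\\
S_1 &= \frac{e - 1}{e^{5} - 1} \approx 0.012, \qquad S_2 = \frac{e^{4} - 1}{e^{5} - 1} \approx 0.364,\\
S_{T1} &= 1 - S_2 \approx 0.637, \qquad S_{T2} = 1 - S_1 \approx 0.989.
\end{align*}
```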
Fig. 7 Total indices: Log-RMSEs of the model in (5.1) against the total number of model evaluations (in log10 ) for six values of the degree ( p = 3, q = 3), ( p = 4, q = 3), ( p = 4, q = 4), ( p = 5, q = 3), ( p = 5, q = 4), ( p = 5, q = 5). For each degree, we show the corresponding RMSE (solid line), the RMSE for degree ( p, q = 2) (dashed line), the RMSE for degree (2, 2) (dotted line), and the RMSE from Saltelli et al. (2010) (dash-dotted line)
5.2.2 Ishigami's function (d = 3)

The Ishigami function includes three independent input factors following a uniform distribution on $[-\pi, \pi]$. It is defined as follows:

$$f(x) = \sin(x_1) + 7 \sin^2(x_2) + 0.1\, x_3^4 \sin(x_1). \quad (5.2)$$

For this function, the sensitivity indices are $S_1 = 0.3139$, $S_2 = 0.4424$, $S_3 = 0.0$, $S_{T1} = 0.567$, $S_{T2} = 0.442$, and $S_{T3} = 0.243$.

5.2.3 Sobol's function (d = 10)

The Sobol function (Homma and Saltelli 1996) includes ten independent input factors following a uniform distribution on $[0, 1]$. It is defined as follows:

$$f(x) = \prod_{j=1}^{d=10} \frac{|4x_j - 2| + a_j}{1 + a_j}. \quad (5.3)$$
According to the values of $a = (a_j,\, j = 1, 2, \ldots, d)$, this function has different properties (Kucherenko et al. 2009):
Fig. 8 Total indices: log-RMSEs of Ishigami’s function against the total number of model evaluations (in log10 ) for six values of the degree ( p = 3, q = 3), ( p = 4, q = 3), ( p = 4, q = 4), ( p = 5, q = 3), ( p = 5, q = 4), ( p = 5, q = 5). For each degree, we show the corresponding RMSE (solid line), the RMSE for degree ( p, q = 2) (dashed line), the RMSE for degree (2, 2) (dotted line), and the RMSE from Saltelli et al. (2010) (dash-dotted line)
– if $a = [0, 0, 6.52, 6.52, 6.52, 6.52, 6.52, 6.52, 6.52, 6.52]^T$, the values of the sensitivity indices are $S_1 = S_2 = 0.39$, $S_j = 0.0069$ $\forall j > 2$, $S_{T1} = S_{T2} = 0.54$, and $S_{Tj} = 0.013$ $\forall j > 2$. Thus, only a few inputs are important, and the function has a low effective dimension (function of type A);
– if $a = [50, 50, 50, 50, 50, 50, 50, 50, 50, 50]^T$, the first-order and total indices are $S_j = S_{Tj} = 0.1$ $\forall j \in \{1, 2, \ldots, d\}$. Thus, all input factors are important, but there is no interaction among these inputs. The function has a high effective dimension (function of type B);
– if $a = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]^T$, the function belongs to the class of functions with important interactions among input factors. In fact, we have $S_j = 0.02$ and $S_{Tj} = 0.27$ $\forall j \in \{1, 2, \ldots, d\}$. Due to these important interactions, all input factors are important. The function has a high effective dimension (function of type C).
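These values follow from the closed-form ANOVA decomposition of the Sobol function, a standard result for this family (see, e.g., Homma and Saltelli 1996), which may help in checking the three cases above:

```latex
D_j = \frac{1}{3(1 + a_j)^2}, \qquad
D = \prod_{j=1}^{d}\left(1 + D_j\right) - 1, \qquad
S_j = \frac{D_j}{D}, \qquad
S_{Tj} = \frac{D_j \prod_{k \neq j}\left(1 + D_k\right)}{D}.
```

For type A, for instance, $D_1 = 1/3$ and $D_j \approx 0.0059$ for $j > 2$, giving $D \approx 0.863$ and hence $S_1 \approx 0.39$.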
5.2.4 The product-polynomial function (d = 21)

The product-polynomial function (Plischke et al. 2013; Borgonovo et al. 2014) includes 21 independent inputs following a log-normal distribution $\mathcal{LN}(1, 1)$.
Fig. 9 Total indices: log-RMSEs of Sobol’s function of type A against the total number of model evaluations (in log10 ) for six values of the degree ( p = 3, q = 3), ( p = 4, q = 3), ( p = 4, q = 4), ( p = 5, q = 3), ( p = 5, q = 4), ( p = 5, q = 5). For each degree, we show the corresponding RMSE (solid line), the RMSE for degree ( p, q = 2) (dashed line), the RMSE for degree (2, 2) (dotted line), and the RMSE from Saltelli et al. (2010) (dash-dotted line)
It is defined as follows:

$$f(x) = \prod_{j=1}^{21} x_j^{a_j}, \quad (5.4)$$

with $a_j = 4$ $\forall j = 1, \ldots, 7$, $a_j = 2$ $\forall j = 8, \ldots, 14$, and $a_j = 1$ $\forall j = 15, \ldots, 21$. The model in (5.4) has a high effective dimension (function of type C), as $S_j = 1.28 \times 10^{-57}$ if $j = 1, \ldots, 7$, $S_j = 7.72 \times 10^{-63}$ if $j = 8, \ldots, 14$, $S_j = 2.48 \times 10^{-64}$ if $j = 15, \ldots, 21$, $S_{Tj} = 0.983$ if $j = 1, \ldots, 7$, $S_{Tj} = 0.967$ if $j = 8, \ldots, 14$, and $S_{Tj} = 0.622$ if $j = 15, \ldots, 21$.

5.3 Implementation issues

We estimated the indices using Sobol's design from the R package randtoolbox (Dutang and Savicky 2013). We considered six values of the degree: $(p = 3, q = 3)$, $(p = 4, q = 3)$, $(p = 4, q = 4)$, $(p = 5, q = 3)$, $(p = 5, q = 4)$, and $(p = 5, q = 5)$. We used $(p = 2, q = 2)$ as the reference value of the degree, and we added the degree $(p, q = 2)$ to assess the numerical impact of $q$. For a given value of the degree $(p, q)$, we increased the sample size $m$ by 30, from 5 up to 500, depending on the convergence of the estimations. For the model in (5.4), we increased $m$ by 50, from 50 up to 500.
Fig. 10 Total indices: log-RMSEs of Sobol’s function of type B against the total number of model evaluations (in log10 ) for six values of the degree ( p = 3, q = 3), ( p = 4, q = 3), ( p = 4, q = 4), ( p = 5, q = 3), ( p = 5, q = 4), ( p = 5, q = 5). For each degree, we show the corresponding RMSE (solid line), the RMSE for degree ( p, q = 2) (dashed line), the RMSE for degree (2, 2) (dotted line), and the RMSE from Saltelli et al. (2010) (dash-dotted line)
For a fair comparison, we computed the indices for degrees $(p, q)$, $(p = 2, q = 2)$, and $(p, q = 2)$ using the same number of model evaluations. Indeed, as $qm[(p-1)d + 1]$ is the computational cost for a given degree $(p, q)$ (see Sect. 5.1), the sample size for the reference value of the degree in (3.11) is $m_2 = \mathrm{round}\left( qm[(p-1)d + 1] / (2d + 2) \right)$. For instance, for $d = 3$, $p = q = 3$, and $m = 35$, the degree $(3, 3)$ costs $3 \cdot 35 \cdot [2 \cdot 3 + 1] = 735$ runs, so the matched reference sample size is $m_2 = \mathrm{round}(735/8) = 92$. We also added the estimates of the sensitivity indices from Saltelli et al. (2010), implemented in the R package sensitivity (Pujol et al. 2013).

We used the root mean square error (RMSE) to assess the accuracy of our estimations. For each sample size $m$ and each degree $(p, q)$, we replicated the process of computing the indices $R = 30$ times (changing the seed randomly when sampling input values). The average RMSE of all the first-order indices is defined as follows:

$$RMSE_d = \frac{1}{d} \sum_{j=1}^{d} \sqrt{ \frac{1}{R} \sum_{r=1}^{R} \left( \hat{S}_{j,r} - S_j \right)^2 }, \quad (5.5)$$

where $S_j$ and $\hat{S}_{j,r}$ are the true and estimated values of the first-order index respectively; $\hat{S}_{j,r}$ is the first-order estimate for a given replication $r$. We used the same expression of the RMSE for the total indices.
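For completeness, a one-function sketch of Eq. (5.5) (the name `avg_rmse` and the array layout are illustrative choices):

```python
import numpy as np

def avg_rmse(S_hat, S_true):
    """Average RMSE of Eq. (5.5).

    S_hat has shape (R, d): estimates over R replications; S_true has shape (d,)."""
    per_index_rmse = np.sqrt(np.mean((S_hat - S_true[None, :])**2, axis=0))
    return per_index_rmse.mean()
```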
Fig. 11 Total indices: log-RMSEs of Sobol’s function of type C against the total number of model evaluations (in log10 ) for six values of the degree ( p = 3, q = 3), ( p = 4, q = 3), ( p = 4, q = 4), ( p = 5, q = 3), ( p = 5, q = 4), ( p = 5, q = 5). For each degree, we show the corresponding RMSE (solid line), the RMSE for degree ( p, q = 2) (dashed line), the RMSE for degree (2, 2) (dotted line), and the RMSE from Saltelli et al. (2010) (dash-dotted line)
5.4 Numerical results and discussion

5.4.1 Results: estimations of the first-order indices

Figures 1, 2, 3, 4, 5, and 6 show the average of the RMSEs of the $d$ first-order indices as the total number of model evaluations increases, for the functions in (5.1), (5.2), (5.3) with the three types (Sobol' functions of type A, B, and C), and (5.4) respectively. The figures show the trends of the average RMSEs for the six values of the degree $(p, q)$ compared to the RMSEs associated with the reference value of the degree $(p = 2, q = 2)$ and the degree $(p, q = 2)$. We also added the RMSEs associated with the estimator from Saltelli et al. (2010).

In Figs. 1, 2, 3, 4 and 5, the RMSEs associated with our estimators decrease with the number of model evaluations for the different values of the degree, and we have converging estimations. In Fig. 6, the RMSEs are small and sometimes increase with the number of model runs for some values of the degree. In all figures, our estimators outperform the estimator from Saltelli et al. (2010), with significant differences observed in Figs. 1, 5, and 6. Moreover, Fig. 1 shows that the trends of the RMSEs associated with the estimator from Saltelli et al. (2010) do not decrease with the total number of model evaluations. Thus, this estimator fails to converge. Similar results were obtained for the model in (5.4), and these results are in line with the conclusions made in Borgonovo et al. (2014).
Fig. 12 Total indices: log-RMSEs of the model in (5.4) against the total number of model evaluations (in log10 ) for six values of the degree ( p = 3, q = 3), ( p = 4, q = 3), ( p = 4, q = 4), ( p = 5, q = 3), ( p = 5, q = 4), ( p = 5, q = 5). For each degree, we show the corresponding RMSE (solid line), the RMSE for degree ( p, q = 2) (dashed line), the RMSE for degree (2, 2) (dotted line), and the RMSE from Saltelli et al. (2010) (dash-dotted line)
It can be seen in Fig. 1 that the kernel of degree $(p, q)$ performs better than the others (degrees $(p, q = 2)$ and $(p = 2, q = 2)$), while Fig. 5 shows that the kernel of degree $(p, q = 2)$ is the best among the six kernels, followed by the kernel of degree $(p, q)$. For the remaining models (Figs. 2, 3, 4, 6), the reference value of the degree $(p = 2, q = 2)$ generally performs better than the others.

Regarding the values of the sensitivity indices for the model in (5.4), our estimations are far from the true values. This result is probably due to the difficulty of estimating small values using classical statistics and/or to the difficulty involved in high-dimensional sampling of skewed or heavy-tailed distributions of input factors. Moreover, the variance of the model in (5.4) is about $1.752 \times 10^{170}$, which is very high. Of course, this value is finite, but it is practically difficult to estimate using classical estimators of the variance.

5.4.2 Results: estimations of the total indices

Figures 7, 8, 9, 10, 11, and 12 show the average of the RMSEs of the $d$ total indices as the total number of model evaluations increases, for the functions in (5.1), (5.2), (5.3) with the three types (Sobol' functions of type A, B, and C), and (5.4) respectively. All the figures show the trends of the average RMSEs for the six values of the degree $(p, q)$ compared to the RMSEs associated with the reference value of the degree $(p = 2, q = 2)$ and the degree $(p, q = 2)$. We also added the RMSEs associated with the estimator from Saltelli et al. (2010).

In all figures, we have converging estimations, except in Fig. 12. Our estimators outperform the estimator from Saltelli et al. (2010) in Figs. 7 and 12. While the kernel of degree $(p, q)$ gives the best results for the model in (5.1), the kernel of degree $(2, 2)$ becomes the best in Fig. 8 for estimating the total indices. Furthermore, when increasing $(p, q)$, all the estimators give approximately the same results, including those from Saltelli et al. (2010). In Fig. 12, our estimations are better than those from Saltelli et al. (2010), but we face the same problem as in the case of the estimates of the first-order indices. In Figs. 10 and 11, we can find a kernel that performs slightly better than the estimator from Saltelli et al. (2010) and vice versa. Finally, in Fig. 9, the estimations from Saltelli et al. (2010) are the best ones.

For the estimation of the total indices, our estimator associated with the degree $(p = 2, q)$ is an average of Jansen's estimator (Jansen 1999) used in Saltelli et al. (2010), owing to the design scheme, which is well suited for the estimation of the first-order index. Thus, with the same number of model runs, our estimator and the Jansen estimator do not use the same information (input values) to compute the total indices, and this can lead to different estimations of the TSIs.
6 Conclusion

In this paper, we investigated the estimation of Sobol' indices (first-order and total indices) by making use of a kernel of degree $(p, q)$ from the theory of U-statistics. As Sobol' indices are proportional to the non-normalized indices, we propose (i) a
generalized, minimum variance unbiased estimator of the non-normalized first-order and total indices, (ii) a joint, efficient estimator of Sobol' indices, and (iii) the main statistical properties of these estimators. The theory of U-statistics allows for deriving the consistency, the optimal rate of convergence, and the asymptotic distribution of our estimators.

We found that the kernel of degree $(p, q)$ with the smallest variance should be preferred for estimating Sobol' indices for a given input factor or set of factors, and the degree $(p, q)$ can be larger than $(2, 2)$ in practice. The numerical tests confirmed the superiority of our estimators of the first-order index compared to the estimator from Saltelli et al. (2010). The superiority is also observed for some total indices when using challenging models. In the case of the total indices, our estimator associated with the degree $(p = 2, q)$ is an average of the Jansen estimator (Jansen 1999) used in Saltelli et al. (2010), and the difference in the results is due to the fact that we do not use the same information to compute these indices. The numerical tests confirmed that we can obtain an efficient kernel with degree $(p = 5, q = 5)$. Furthermore, there is no absolutely efficient degree for all functions. Thus, some adaptive strategies are needed to properly choose the degree of the kernel, for each input factor, prior to the estimation of the indices. The strategy should be based on the variance of the kernel, but it requires more investigation. Also, input factors are assumed to be independently distributed, which is sometimes unrealistic. In the near future, it is worthwhile to adapt the estimator proposed in this paper to the estimation of the sensitivity indices of dependent input factors, using the four types of Sobol' indices defined in Mara and Tarantola (2012), Kucherenko et al. (2012), and Mara et al. (2015).
Appendix A: A useful summary of the theory of U-statistics

This section presents the main theorems about U-statistics used in this paper. While we consider only three independent samples, the general version of the theorems can be found in Ferguson (1996), Hoeffding (1948a, b), and Sugiura (1965).

Let $(X_1, \ldots, X_{n_1})$, $(Y_1, \ldots, Y_{n_2})$, and $(Z_1, \ldots, Z_{n_3})$ be three independent samples of sizes $n_1$, $n_2$, $n_3$ from the distributions $D_1$, $D_2$, and $D_3$ respectively. Suppose that our parameter of interest $\theta$ is defined as follows:

$$\theta = E\left[ K\left( X_1, \ldots, X_p;\, Y_1, \ldots, Y_q;\, Z_1, \ldots, Z_r \right) \right], \quad (6.1)$$

where $K(\cdot)$ is a symmetric kernel w.r.t. its first, second, and third arguments. It has $p + q + r$ arguments. The U-statistic corresponding to $\theta$ is given by

$$\hat{\theta} = \frac{1}{\binom{n_1}{p} \binom{n_2}{q} \binom{n_3}{r}} \sum K\left( X_{i_1}, \ldots, X_{i_p};\, Y_{j_1}, \ldots, Y_{j_q};\, Z_{k_1}, \ldots, Z_{k_r} \right), \quad (6.2)$$

where the sum is taken over all combinations $1 \leq i_1 < \cdots < i_p \leq n_1$, $1 \leq j_1 < \cdots < j_q \leq n_2$, and $1 \leq k_1 < \cdots < k_r \leq n_3$.
The estimator $\hat{\theta}$ is unbiased, and its variance is given by

$$V(\hat{\theta}) = \sum_{i=1}^{p} \sum_{j=1}^{q} \sum_{k=1}^{r} \frac{\binom{p}{i} \binom{n_1 - p}{p - i}}{\binom{n_1}{p}} \frac{\binom{q}{j} \binom{n_2 - q}{q - j}}{\binom{n_2}{q}} \frac{\binom{r}{k} \binom{n_3 - r}{r - k}}{\binom{n_3}{r}} \sigma_{i,j,k}^2, \quad (6.3)$$

with $\sigma_{i,j,k}^2 = V\left( E\left[ K\left( X_1, \ldots, X_p;\, Y_1, \ldots, Y_q;\, Z_1, \ldots, Z_r \right) \mid X_1, \ldots, X_i;\, Y_1, \ldots, Y_j;\, Z_1, \ldots, Z_k \right] \right)$.

The U-statistic $\hat{\theta}$ is a function of ordered statistics, as it is symmetric w.r.t. each of its three types of arguments. It is known that ordered statistics are sufficient and complete statistics for non-parametric families such as the class of distribution functions having finite fourth moments. By the Lehmann–Scheffé theorem, it follows that $\hat{\theta}$ is the unique, uniformly minimum variance unbiased estimator of $\theta$, given $X_1, \ldots, X_{n_1}; Y_1, \ldots, Y_{n_2}; Z_1, \ldots, Z_{n_3}$. We also have the asymptotic normality of the estimator $\hat{\theta}$. The same results can be derived for multivariate U-statistics, and all elements can be found in Ferguson (1996), Lehmann (1999), Hoeffding (1948a, b), and Sugiura (1965).
Appendix B: A generalized estimator of the total sensitivity index

This section mainly recalls Corollary 2 from Lamboni (2016a). For $p \ge 2$, let $\mathbf{X} = (\mathbf{X}_u^{(1)}, \ldots, \mathbf{X}_u^{(p)})$ be $p$ i.i.d. copies of $\mathbf{X}_u$ and $\mathbf{Y} = \mathbf{X}_{\sim u}$. We consider the following symmetric kernel:

$$K\left(\mathbf{X}_u^{(1)}, \ldots, \mathbf{X}_u^{(p)}, \mathbf{X}_{\sim u}\right) = \frac{1}{p^2(p-1)} \sum_{k=1}^{p} \left( \sum_{\substack{j=1 \\ j \ne k}}^{p} \left[ f\left(\mathbf{X}_u^{(k)}, \mathbf{X}_{\sim u}\right) - f\left(\mathbf{X}_u^{(j)}, \mathbf{X}_{\sim u}\right) \right] \right)^2. \tag{6.4}$$

It is known that the expectation of $K\left(\mathbf{X}_u^{(1)}, \ldots, \mathbf{X}_u^{(p)}, \mathbf{X}_{\sim u}\right)$ is the non-normalized total index, that is,

$$D_u^{tot} = \mathbb{E}\left[K\left(\mathbf{X}_u^{(1)}, \ldots, \mathbf{X}_u^{(p)}, \mathbf{X}_{\sim u}\right)\right]. \tag{6.5}$$

Given two independent samples $\mathbf{X}_i = (\mathbf{X}_{i,u}^{(1)}, \ldots, \mathbf{X}_{i,u}^{(p)})$ and $\mathbf{Y}_i = \mathbf{X}_{i,\sim u}$, $i = 1, 2, \ldots, m$, from $\mathbf{X}$ and $\mathbf{Y}$ respectively, the U-statistic corresponding to the kernel in (6.4) is given by

$$\hat{D}_u^{tot} = \frac{1}{m\, p^2 (p-1)} \sum_{i=1}^{m} \sum_{k=1}^{p} \left( \sum_{\substack{j=1 \\ j \ne k}}^{p} \left[ f\left(\mathbf{X}_{i,u}^{(k)}, \mathbf{X}_{i,\sim u}\right) - f\left(\mathbf{X}_{i,u}^{(j)}, \mathbf{X}_{i,\sim u}\right) \right] \right)^2. \tag{6.6}$$
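A minimal Python sketch of (6.6) follows, assuming a model `f` that is vectorized over rows of an input matrix and independent $U(0,1)$ inputs; the test function (a simple linear model) and the sampling scheme are illustrative assumptions, not the paper's numerical setup.

```python
# Sketch of the estimator in Eq. (6.6) for the total index of one factor X_u,
# assuming f is vectorized over rows and inputs are independent U(0,1).
import numpy as np

def total_index_hat(f, d, u, p=2, m=10_000, rng=None):
    """U-statistic of Eq. (6.6): p i.i.d. copies of X_u are paired with
    one common draw of X_{~u} within each of the m independent blocks."""
    rng = np.random.default_rng(rng)
    x_u = rng.random((m, p))          # m blocks of p copies of X_u (here |u| = 1)
    x_rest = rng.random((m, d - 1))   # m draws of X_{~u}, shared within a block
    # Model outputs f(X_u^{(k)}, X_{~u}) for k = 1, ..., p; shape (m, p).
    cols = [np.insert(x_rest, u, x_u[:, k], axis=1) for k in range(p)]
    y = np.stack([f(c) for c in cols], axis=1)
    # Inner sum over j != k equals p * y_k - sum_j y_j; square, sum over k, average over blocks.
    inner = p * y - y.sum(axis=1, keepdims=True)
    return (inner ** 2).sum(axis=1).mean() / (p**2 * (p - 1))

# Example: f(x) = x_0 + 2*x_1 with X ~ U(0,1)^2, so D_1^tot = 4 * V(X_1) = 4/12.
f = lambda x: x[:, 0] + 2.0 * x[:, 1]
print(total_index_hat(f, d=2, u=1, p=3, m=20_000, rng=0))  # ~ 0.333
```

For $p = 2$ the kernel reduces to Jansen's estimand $\tfrac{1}{2}\,[f(\mathbf{X}_u, \mathbf{X}_{\sim u}) - f(\mathbf{X}_u', \mathbf{X}_{\sim u})]^2$, which is consistent with the discussion in the conclusion.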
The properties of $\hat{D}_u^{tot}$ are given as follows:

(i) a MVUE of $D_u^{tot}$ is $\hat{D}_u^{tot}$ defined in (6.6);

(ii) the variance of $\hat{D}_u^{tot}$ is given by

$$\mathbb{V}\left(\hat{D}_u^{tot}\right) = \frac{\sigma^2_{p,1}}{m}, \tag{6.7}$$

with $\sigma^2_{p,1}$ the variance of the kernel. Moreover, we have

$$m\, \mathbb{E}\left[\left(\hat{D}_u^{tot} - D_u^{tot}\right)^2\right] = \sigma^2_{p,1}; \tag{6.8}$$

(iii) if $m \to +\infty$, we have

$$\sqrt{m}\left(\hat{D}_u^{tot} - D_u^{tot}\right) \xrightarrow{d} \mathcal{N}\left(0,\, \sigma^2_{p,1}\right). \tag{6.9}$$
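In practice, (6.7)-(6.9) suggest estimating $\sigma^2_{p,1}$ by the sample variance of the $m$ per-block kernel evaluations and building a normal-approximation confidence interval. The sketch below illustrates this and compares the estimated kernel variances across a few candidate degrees $p$, in the spirit of the adaptive degree selection discussed in the conclusion; the block-wise computation mirrors the sketch after (6.6), and all names are illustrative.

```python
# Sketch: per-block kernel values give the estimate (their mean), an estimate
# of sigma^2_{p,1} (their sample variance), and a normal CI via Eq. (6.7)-(6.9).
import numpy as np

def total_index_with_ci(f, d, u, p=2, m=10_000, rng=None):
    rng = np.random.default_rng(rng)
    x_u, x_rest = rng.random((m, p)), rng.random((m, d - 1))
    cols = [np.insert(x_rest, u, x_u[:, k], axis=1) for k in range(p)]
    y = np.stack([f(c) for c in cols], axis=1)
    inner = p * y - y.sum(axis=1, keepdims=True)
    kernel_vals = (inner ** 2).sum(axis=1) / (p**2 * (p - 1))  # one K per block
    est = kernel_vals.mean()
    sigma2 = kernel_vals.var(ddof=1)        # estimate of sigma^2_{p,1}
    half = 1.959964 * np.sqrt(sigma2 / m)   # z_{0.975}, normal approx. of Eq. (6.9)
    return est, (est - half, est + half), sigma2

f = lambda x: x[:, 0] + 2.0 * x[:, 1]
# Compare kernel variances across candidate degrees p. Caveat (see the
# conclusion): each block costs p model runs, so a fair comparison should
# account for the computational cost; that normalization is left open here.
for p in (2, 3, 5):
    est, ci, s2 = total_index_with_ci(f, d=2, u=1, p=p, m=20_000, rng=1)
    print(f"p={p}: D_hat={est:.4f}, 95% CI=({ci[0]:.4f}, {ci[1]:.4f}), sigma2={s2:.4f}")
```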
References

Antoniadis A, Pasanisi A (2012) Modeling of computer experiments for uncertainty propagation and sensitivity analysis. Stat Comput 22(3):677–679. https://doi.org/10.1007/s11222-011-9282-8
Bolado-Lavin R, Castaings W, Tarantola S (2009) Contribution to the sample mean plot for graphical and numerical sensitivity analysis. Reliab Eng Syst Saf 94(6):1041–1049
Borgonovo E, Tarantola S, Plischke E, Morris MD (2014) Transformations and invariance in the sensitivity analysis of computer experiments. J R Stat Soc Ser B 76(5):925–947
Buzzard GT (2012) Global sensitivity analysis using sparse grid interpolation and polynomial chaos. Reliab Eng Syst Saf 107:82–89
Caflisch RE, Morokoff W, Owen AB (1997) Valuation of mortgage backed securities using Brownian bridges to reduce effective dimension. J Comput Financ 1:27–46
Chan K, Saltelli A, Tarantola S (2000) Winding stairs: a sampling tool to compute sensitivity indices. Stat Comput 10(3):187–196. https://doi.org/10.1023/A:1008950625967
Conti S, O'Hagan A (2010) Bayesian emulation of complex multi-output and dynamic computer models. J Stat Plan Inference 140(3):640–651
Dutang C, Savicky P (2013) randtoolbox: generating and testing random numbers. R package version 1.13
Fang S, Gertner GZ, Shinkareva S, Wang G, Anderson A (2003) Improved generalized Fourier amplitude sensitivity test (FAST) for model assessment. Stat Comput 13(3):221–226. https://doi.org/10.1023/A:1024266632666
Ferguson TS (1996) A course in large sample theory. Chapman & Hall, New York
Gamboa F, Janon A, Klein T, Lagnoux A (2014) Sensitivity indices for multivariate outputs. Comptes Rendus de l'Académie des Sciences 351(7–8):307–310
Ghanem R, Higdon D, Owhadi H (2017) Handbook of uncertainty quantification. Springer, Cham
Hoeffding W (1948a) A class of statistics with asymptotically normal distribution. Ann Math Stat 19:293–325
Hoeffding W (1948b) A non-parametric test for independence. Ann Math Stat 19:546–557
Homma T, Saltelli A (1996) Importance measures in global sensitivity analysis of nonlinear models. Reliab Eng Syst Saf 52:1–17
Jansen MJW (1999) Analysis of variance designs for model output. Comput Phys Commun 117:35–43
Jourdan A (2012) Global sensitivity analysis using complex linear models. Stat Comput 22(3):823–831. https://doi.org/10.1007/s11222-011-9239-y
Kucherenko S, Rodriguez-Fernandez M, Pantelides C, Shah N (2009) Monte Carlo evaluation of derivative-based global sensitivity measures. Reliab Eng Syst Saf 94:1135–1148
Kucherenko S, Feil B, Shah N, Mauntz W (2011) The identification of model effective dimensions using global sensitivity analysis. Reliab Eng Syst Saf 96(4):440–449
Kucherenko S, Tarantola S, Annoni P (2012) Estimation of global sensitivity indices for models with dependent variables. Comput Phys Commun 183(4):937–946
Kucherenko S, Delpuech B, Iooss B, Tarantola S (2015) Application of the control variate technique to estimation of total sensitivity indices. Reliab Eng Syst Saf 134:251–259
Lamboni M (2016a) Global sensitivity analysis: a generalized, unbiased and optimal estimator of total-effect variance. Stat Pap 59(1):361–386. https://doi.org/10.1007/s00362-016-0768-5
Lamboni M (2016b) Global sensitivity analysis: an efficient numerical method for approximating the total sensitivity index. Int J Uncertain Quant (accepted)
Lamboni M, Makowski D, Monod H (2008) Multivariate global sensitivity analysis for discrete-time models. Rapport technique 2008-3, INRA, UR341 Mathématiques et Informatique Appliquées, Jouy-en-Josas, France
Lamboni M, Makowski D, Lehuger S, Gabrielle B, Monod H (2009) Multivariate global sensitivity analysis for dynamic crop models. Field Crops Res 113:312–320
Lamboni M, Monod H, Makowski D (2011) Multivariate sensitivity analysis to measure global contribution of input factors in dynamic models. Reliab Eng Syst Saf 96:450–459
Lamboni M, Iooss B, Popelin AL, Gamboa F (2013) Derivative-based global sensitivity measures: general links with Sobol' indices and numerical tests. Math Comput Simul 87:45–54
Lehmann EL (1951) Consistency and unbiasedness of certain nonparametric tests. Ann Math Stat 22:165–179
Lehmann EL (1999) Elements of large sample theory. Springer, New York
Mara TA, Joseph OR (2008) Comparison of some efficient methods to evaluate the main effect of computer model factors. J Stat Comput Simul 78(2):167–178. https://doi.org/10.1080/10629360600964454
Mara TA, Tarantola S (2012) Variance-based sensitivity indices for models with dependent inputs. Reliab Eng Syst Saf 107:115–121
Mara TA, Tarantola S, Annoni P (2015) Non-parametric methods for global sensitivity analysis of model output with dependent inputs. Environ Model Softw 72:173–183
Marrel A, Iooss B, Da Veiga S, Ribatet M (2012) Global sensitivity analysis of stochastic computer models with joint metamodels. Stat Comput 22(3):833–847. https://doi.org/10.1007/s11222-011-9274-8
Muehlenstaedt T, Roustant O, Carraro L, Kuhnt S (2012) Data-driven kriging models based on FANOVA-decomposition. Stat Comput 22(3):723–738. https://doi.org/10.1007/s11222-011-9259-7
Oakley JE, O'Hagan A (2004) Probabilistic sensitivity analysis of complex models: a Bayesian approach. J R Stat Soc Ser B (Stat Methodol) 66(3):751–769
Owen AB (2013a) Better estimation of small Sobol' sensitivity indices. ACM Trans Model Comput Simul 23(2):11:1–11:17
Owen AB (2013b) Variance components and generalized Sobol' indices. SIAM/ASA J Uncertain Quant 1(1):19–41
Plischke E, Borgonovo E, Smith CL (2013) Global sensitivity measures from given data. Eur J Oper Res 226(3):536–550
Pujol G, Iooss B, Janon A (2013) sensitivity: sensitivity analysis. R package version 1.7
Rao CR, Kleffe J (1988) Estimation of variance components and applications. North Holland, Amsterdam
Ratto M, Pagano A (2010) Using recursive algorithms for the efficient identification of smoothing spline ANOVA models. AStA Adv Stat Anal 94(4):367–388
Ratto M, Pagano A, Young P (2007) State dependent parameter metamodelling and sensitivity analysis. Comput Phys Commun 177(11):863–876
Saltelli A (2002) Making best use of model evaluations to compute sensitivity indices. Comput Phys Commun 145:280–297
Saltelli A, Tarantola S, Chan K (1999) Quantitative model independent methods for global sensitivity analysis of model output. Technometrics 41:39–56
Saltelli A, Chan K, Scott E (2000) Variance-based methods. Probability and statistics. Wiley, New York
Saltelli A, Annoni P, Azzini I, Campolongo F, Ratto M, Tarantola S (2010) Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index. Comput Phys Commun 181(2):259–270
Sobol IM (1993) Sensitivity analysis for non-linear mathematical models. Math Model Comput Exp 1:407–414
Sobol IM (2001) Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math Comput Simul 55:271–280
Sobol IM (2007) Global sensitivity analysis indices for the investigation of nonlinear mathematical models. Matematicheskoe Modelirovanie 19:23–24
Sobol IM, Kucherenko S (2009) Derivative based global sensitivity measures and the link with global sensitivity indices. Math Comput Simul 79:3009–3017
Storlie CB, Swiler LP, Helton JC, Sallaberry CJ (2009) Implementation and evaluation of nonparametric regression procedures for sensitivity analysis of computationally demanding models. Reliab Eng Syst Saf 94(11):1735–1763
Sudret B (2008) Global sensitivity analysis using polynomial chaos expansions. Reliab Eng Syst Saf 93(7):964–979
Sugiura N (1965) Multisample and multivariate nonparametric tests based on U statistics and their asymptotic efficiencies. Osaka J Math 2(2):385–426. http://projecteuclid.org/euclid.ojm/1200691466