Stat Papers https://doi.org/10.1007/s00362-018-1012-2 REGULAR ARTICLE
The non-null limiting distribution of the generalized Baumgartner statistic based on the Fourier series approximation Ryo Miyazaki2 · Hidetoshi Murakami1
Received: 12 September 2017 / Revised: 5 March 2018 © Springer-Verlag GmbH Germany, part of Springer Nature 2018
Abstract The non-null limiting distribution of the generalized Baumgartner statistic is approximated by applying the Fourier series approximation. Due to the development of computational power, the Fourier series approximation is readily utilized to approximate its probability density function. The infinite product part for a noncentral parameter in the characteristic function is re-formulated by using a formula of the trigonometric function. The non-central parameter of the generalized Baumgartner statistic is formulated by the first moment of the generalized Baumgartner statistic under the alternative hypothesis. The non-central parameter is used to calculate the power of the generalized Baumgartner statistic. Keywords Fourier series approximation · Generalized Baumgartner statistic · Non-central weighted χ 2 distribution Mathematics Subject Classification 62G10
1 Introduction
Testing hypotheses is one of the most important challenges in nonparametric statistics. Various nonparametric tests have been proposed for one-sample, two-sample and multisample testing problems involving the location, scale, location-scale and other parameters. We consider testing the equality of several parameters, i.e. the one-way
Corresponding author: Hidetoshi Murakami
1 Department of Applied Mathematics, Tokyo University of Science, Tokyo, Japan
2 Department of Industrial and Systems Engineering, Graduate School of Science and Engineering, Chuo University, Tokyo, Japan
123
R. Miyazaki, H. Murakami
layout analysis of variance, which is one of the most important types of statistical procedures in biometry and is widely used in practice. Let $\{X_{pq} \mid p = 1, \ldots, k,\ q = 1, \ldots, n_p\}$ be k samples, the pth of which consists of $n_p$ observations. The samples are assumed to be mutually independent, and the observations within each sample are assumed to be independent and identically distributed. Suppose that the observation $X_{pq}$ is obtained from a continuous distribution function $F_p(x)$. Then, we are interested in testing the following hypothesis: $H_0 : F_1 = F_2 = \cdots = F_k$ against $H_1$ : not $H_0$. However, in many applications, the underlying distribution is not understood well enough to assume normality or some other specific distribution. Hence, nonparametric hypothesis testing must be used in these circumstances. Since both the means and the variances of the groups may differ, tests designed only for a shifted location parameter or only for a changed scale parameter may not be appropriate in this situation. Therefore, it is preferable to test jointly for location and scale differences between continuous distribution functions. Suppose that the location parameter is the main effect, accompanied by small scale changes. Under these circumstances, Murakami (2006) proposed a modified Baumgartner statistic (Baumgartner et al. 1998). The powers of the original and modified Baumgartner statistics are almost equivalent to that of the well-known Wilcoxon test (Govindarajulu 2007) for shifted location parameters. In addition, Baumgartner et al. (1998) and Murakami (2006) showed that the original and modified Baumgartner statistics are more powerful than the Wilcoxon, Kolmogorov-Smirnov (Hájek et al. 1999), Cramér-von Mises (Hájek et al. 1999), and Anderson-Darling (Pettitt 1976) tests for changed scale parameters. Additionally, Murakami et al. (2009) extended the modified Baumgartner statistic to the multisample problem.
Let $R_{pq}$ be the increasing-order rank of $X_{pq}$ in the combined $N = n_1 + \cdots + n_k$ samples. Additionally, note that we can define the ranks as
\[
R_{pq} := \sum_{s \neq p}^{k} \sum_{\ell=1}^{n_s} I\left(X_{s\ell} < X_{p(q)}\right) + q,
\]
where $X_{p(q)}$ denotes the qth order statistic in the pth sample. Then the generalized Baumgartner statistic, namely $V_k$, is as follows:
\[
V_k = \frac{k-1}{k} \sum_{p=1}^{k} \frac{1}{n_p} \sum_{q=1}^{n_p}
\frac{\left(R_{pq} - \frac{N+1}{n_p+1}\, q\right)^{2}}
{\frac{q}{n_p+1}\left(1 - \frac{q}{n_p+1}\right)\frac{(N-n_p)(N+1)}{n_p+2}} .
\]
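The definition above can be turned into a short computation. The following is a minimal sketch, assuming continuous data without ties; the function name `generalized_baumgartner` and the list-of-samples input format are illustrative choices, not part of the paper.

```python
import numpy as np

def generalized_baumgartner(samples):
    """Sketch of V_k for a list of 1-D samples (assumes no ties)."""
    k = len(samples)
    sizes = [len(x) for x in samples]
    N = sum(sizes)
    pooled = np.concatenate([np.asarray(x, dtype=float) for x in samples])
    # Rank of every observation in the pooled sample (1 = smallest).
    ranks = np.empty(N)
    ranks[np.argsort(pooled)] = np.arange(1, N + 1)
    total, start = 0.0, 0
    for n_p in sizes:
        # Sorted ranks of sample p, i.e. the ranks of its order statistics.
        r = np.sort(ranks[start:start + n_p])
        start += n_p
        q = np.arange(1, n_p + 1)
        centre = (N + 1) * q / (n_p + 1)
        scale = (q / (n_p + 1)) * (1 - q / (n_p + 1)) * (N - n_p) * (N + 1) / (n_p + 2)
        total += np.sum((r - centre) ** 2 / scale) / n_p
    return (k - 1) / k * total

# Two completely separated samples of size 3 (hand calculation gives 65/21).
v_example = generalized_baumgartner([[1, 2, 3], [4, 5, 6]])
print(v_example)
```

Each summand is a squared, standardized deviation of a rank from its null expectation, so large values of the statistic indicate departures from H0.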
Moreover, the characteristic function of the limiting distribution of the $V_k$ statistic is given by
\[
\phi_{V_k}(u) = \prod_{j=1}^{\infty}\left(1 - \frac{2Iu}{j(j+1)}\right)^{-\frac{k-1}{2}}
= \left(\frac{-2\pi I u}{\cos\left(\frac{\pi}{2}\sqrt{1+8Iu}\right)}\right)^{\frac{k-1}{2}},
\]
where $I = \sqrt{-1}$. Note that $\phi_{V_k}(u)$ is the characteristic function of a sum of weighted χ² random variables, and the weighted χ² random variables are non-central under the alternative hypothesis.
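The equality between the infinite product and the closed form can be checked numerically. The sketch below compares a truncated product with the cosine expression for the base of the power (that is, without the exponent −(k−1)/2, which avoids any branch question); the truncation length `terms` and the function names are assumptions made for illustration.

```python
import numpy as np

def cf_base_product(u, terms=200_000):
    """Truncated product prod_{j<=terms} (1 - 2Iu / (j(j+1)))."""
    j = np.arange(1, terms + 1, dtype=float)
    return np.prod(1.0 - 2j * u / (j * (j + 1.0)))

def cf_base_closed(u):
    """Closed form cos(pi/2 * sqrt(1 + 8Iu)) / (-2 pi I u)."""
    z = 1j * u
    return np.cos(0.5 * np.pi * np.sqrt(1.0 + 8.0 * z)) / (-2.0 * np.pi * z)

u = 0.7
lhs = cf_base_product(u)
rhs = cf_base_closed(u)
print(lhs, rhs)
```

The truncated product converges slowly (the relative error is of order |u| divided by the number of terms), which is one reason the closed form is convenient for computation.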
The non-null limiting distribution of the generalized Baumgartner...
The distribution of linear combinations of χ² random variables and that of quadratic forms in normal vectors have already received a lot of attention in the statistical literature. The sum of weighted χ² random variables appears in many important problems in statistics. For example, in problems of testing against ordered alternatives, the cumulative χ² statistic has good power, cf. Hirotsu (1986); Nair (1986). To study the power function of the cumulative χ² statistic, we need to evaluate the distribution function of a sum of weighted non-central χ² random variables. In addition, goodness-of-fit test statistics based on the empirical cumulative distribution function, such as the Cramér-von Mises statistic or the Anderson-Darling statistic (Anderson and Darling 1952), are infinite sums of weighted χ² random variables. Although the exact calculation of the distribution of the sum of weighted non-central χ² random variables is a difficult numerical problem, the cumulative distribution function (CDF) of the sum of weighted χ² random variables is required in many applications. Additionally, although there is no known closed-form solution for the CDF, there are many good approximations. For example, Bodenham and Adams (2016) developed a framework that enables a much more extensive comparison between approximate methods for computing the CDF of weighted sums of an arbitrary random variable. One of the most successful approaches for obtaining the CDF and probability density function of linear combinations of non-central χ² random variables is the representation in terms of Laguerre series, as in Mathai and Provost (1992). In fact, Laguerre series expansions play a very important role in the approximation of CDFs, see Tan and Tiku (1999). Recently, Ha and Provost (2013) provided an accessible methodology for approximating the distribution of a general linear combination of non-central χ² random variables.
They proposed a moment-based approximation of the density function of a positive definite quadratic form, which consists of a gamma density function that is adjusted by a linear combination of Laguerre polynomials or, equivalently, by a single polynomial. Various statistical inferences lead to the problem of evaluating the probability of the sum of weighted χ² random variables. Computational methods, including numerical inversion of the characteristic function of the sum of weighted χ² variables and various moment-based approximations, were reviewed in Solomon and Stephens (1977). Furthermore, Gabler and Wolff (1987) presented a theoretical approach to an approximation together with an easily implementable algorithm. A review of the current state of research on the sum of weighted non-central χ² random variables can be found in Duchesne and Lafaye De Micheaux (2010), and methods for computing the CDF of a single non-central χ² random variable are described in Farebrother (1987); Ding (1992); Penev and Raykov (2000). Regarding other distributions, Kamps (1990) characterized the exponential distribution by means of the distribution of a weighted sum of independent, identically distributed random variables. In addition, Tank and Eryilmaz (2015) considered the distributions of the sum, minima and maxima of generalized geometric random variables. Moreover, the sum of independent, albeit not necessarily identically, uniformly distributed random variables arises naturally in the aggregation of scaled values with differing numbers of significant figures. Sadooghi-Alvandi et al. (2009) obtained this distribution by employing a Laplace transform, seemingly relying on prior knowledge of the result. Furthermore, Potuschak and Müller (2009) provided
a simplified derivation of the density of the sum of independent non-identically distributed uniform random variables via an inverse Fourier transform. However, we focus on the approximation of the distribution of the sum of weighted non-central χ² random variables. Approximation methods are widely used and have been studied extensively. From a practical point of view, approximations are typically precise and straightforward to implement in various statistical software programs. Hence, obtaining a more accurate approximation for evaluating the density or the distribution function remains an important topic in statistics. In this paper, we use the Fourier series approximation to evaluate the distribution of the $V_k$ statistic under $H_1$. The Fourier series approximation is easy to use and simple to program. In addition, it is a well-known method for accurately approximating tail probabilities. Herein, we introduce the definition of the Fourier series approximation.
Definition 1 (Fourier series approximation) Let f(x) be a continuous real-valued function with period L. Then f(x) is approximated by
\[
f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left\{ a_n \cos\frac{2n\pi x}{L} + b_n \sin\frac{2n\pi x}{L} \right\},
\]
where
\[
a_n = \frac{2}{L}\int_{-L/2}^{L/2} f(x)\cos\frac{2n\pi x}{L}\,dx, \qquad
b_n = \frac{2}{L}\int_{-L/2}^{L/2} f(x)\sin\frac{2n\pi x}{L}\,dx .
\]
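As a concrete illustration of Definition 1, the coefficients can be computed by numerical quadrature and the partial sum compared with the original function. This is only a sketch: the midpoint rule, the grid size, and the test function f(x) = x² with period L = 2 are assumptions chosen for the example, not choices made in the paper.

```python
import numpy as np

def fourier_coefficients(f, L, n_max, grid=20_000):
    """Approximate a_0..a_{n_max} and b_0..b_{n_max} by the midpoint rule."""
    h = L / grid
    x = -L / 2 + (np.arange(grid) + 0.5) * h      # midpoints of [-L/2, L/2]
    fx = f(x)
    n = np.arange(n_max + 1)[:, None]
    a = 2.0 / L * h * np.sum(fx * np.cos(2 * n * np.pi * x / L), axis=1)
    b = 2.0 / L * h * np.sum(fx * np.sin(2 * n * np.pi * x / L), axis=1)
    return a, b

def fourier_partial_sum(a, b, L, x):
    """Evaluate a_0/2 + sum_n (a_n cos + b_n sin) at the point(s) x."""
    n = np.arange(1, len(a))
    x = np.asarray(x, dtype=float)[..., None]
    terms = a[1:] * np.cos(2 * n * np.pi * x / L) + b[1:] * np.sin(2 * n * np.pi * x / L)
    return a[0] / 2 + np.sum(terms, axis=-1)

a, b = fourier_coefficients(lambda x: x ** 2, L=2.0, n_max=50)
approx = fourier_partial_sum(a, b, 2.0, 0.3)
print(a[0] / 2, a[1], approx)
```

For this test function the exact coefficients are known (a_0/2 = 1/3 and a_n = 4(−1)^n/(n²π²), with all b_n = 0), so the output can be compared against them directly.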
The rest of this paper is organized as follows: In Sect. 2, the Fourier series approximation is applied to approximate the non-null limiting distribution of the Vk statistic. In Sect. 3, we calculate the probability of the non-null limiting distribution of the Vk statistic and derive the non-central parameter of the Vk statistic. Finally, conclusions are provided in Sect. 4.
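2 Fourier series approximation
In this section, we consider the Fourier series approximation to the non-null limiting distribution of the $V_k$ statistic. The non-null limiting distribution of the $V_k$ statistic is that of a sum of weighted non-central chi-square random variables; thus its characteristic function is given by
\[
\phi^{*}_{V_k}(u,\delta)
= \prod_{j=1}^{\infty} \exp\left(\frac{\frac{\delta I u}{j(j+1)}}{1 - \frac{2Iu}{j(j+1)}}\right)
\left(1 - \frac{2Iu}{j(j+1)}\right)^{-\frac{k-1}{2}}
= \exp\left(\frac{\delta}{2} + \frac{\pi\delta I u \tan\left(\frac{\pi}{2}\sqrt{1+8Iu}\right)}{\sqrt{1+8Iu}}\right)
\left(\frac{-2\pi I u}{\cos\left(\frac{\pi}{2}\sqrt{1+8Iu}\right)}\right)^{\frac{k-1}{2}},
\tag{1}
\]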
where δ ≥ 0 denotes a non-central parameter. For the derivation of (1), see the Appendix. Note that (1) reduces to the characteristic function of the sum of weighted chi-square random variables when δ = 0. The exact calculation of the distribution of the sum of weighted non-central χ² random variables is a difficult numerical problem (see, e.g. Castaño-Martínez and López-Blázquez 2005). Although the moment generating function is explicitly given, its Fourier inversion to evaluate the probability density function (PDF) and the CDF is difficult, as extensively discussed by Tanaka (1996). Therefore, we approximate the non-null limiting distribution of the $V_k$ statistic by the Fourier series approximation, as stated in Theorem 1.
Theorem 1 The non-null limiting PDF of the $V_k$ statistic based on the Fourier series approximation is given by
\[
f_{V_k}(t;\nu,T,M,\delta) \approx \frac{e^{\nu t}}{T}\left[\frac{1}{2}\mathcal{L}^{*}_k(\nu;\delta)
+ \sum_{n=1}^{M}\left\{ \mathrm{Re}\left(\mathcal{L}^{*}_k\left(\nu + \frac{n\pi I}{T};\delta\right)\right)\cos\frac{n\pi t}{T}
- \mathrm{Im}\left(\mathcal{L}^{*}_k\left(\nu + \frac{n\pi I}{T};\delta\right)\right)\sin\frac{n\pi t}{T} \right\}\right],
\]
where ν, T, M > 0.
Proof (Proof of Theorem 1) We follow a procedure like that of Ha (2012) to prove Theorem 1 more precisely. It is convenient to work with the Laplace transform corresponding to (1). Then we have
\[
\mathcal{L}^{*}_k(t;\delta) = E\left(e^{-tV_k}\right) = \int_{0}^{\infty} e^{-tx} f_{V_k}(x)\,dx
= \exp\left(\frac{\delta}{2} - \frac{\pi t\delta \tan\left(\frac{\pi}{2}\sqrt{1-8t}\right)}{\sqrt{1-8t}}\right)
\left(\frac{2\pi t}{\cos\left(\frac{\pi}{2}\sqrt{1-8t}\right)}\right)^{\frac{k-1}{2}},
\]
where $f_{V_k}$ is the PDF of the sum of weighted non-central χ² random variables. By inverting the Laplace transform, we obtain
\[
f_{V_k}(t) = \frac{1}{2\pi I}\int_{\nu - I\infty}^{\nu + I\infty} e^{st}\,\mathcal{L}^{*}_k(s;\delta)\,ds .
\]
Note that $e^{Ix} = \cos x + I\sin x$ and put $s = \nu + Iu$, $u \in \mathbb{R}$. Then we can express $f_{V_k}$ as
\[
f_{V_k}(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{(\nu+Iu)t}\,\mathcal{L}^{*}_k(\nu+Iu;\delta)\,du
= \frac{e^{\nu t}}{2\pi}\int_{-\infty}^{\infty} (\cos ut + I\sin ut)\,\mathcal{L}^{*}_k(\nu+Iu;\delta)\,du .
\]
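Since $f_{V_k}(t)$ is a real-valued function, it can be rewritten as
\[
f_{V_k}(t) = \frac{e^{\nu t}}{2\pi}\int_{-\infty}^{\infty}
\left\{ \mathrm{Re}\left(\mathcal{L}^{*}_k(\nu+Iu;\delta)\right)\cos(ut) - \mathrm{Im}\left(\mathcal{L}^{*}_k(\nu+Iu;\delta)\right)\sin(ut) \right\} du .
\]
In addition, the Poisson summation formula can be used to bound the discretization error associated with the trapezoidal rule. Let $u = j\pi/T$ and replace the integral with an infinite series with step size $\pi/T$. Then, we obtain
\[
f_{V_k}(t) \approx \frac{e^{\nu t}}{2T}\sum_{j=-\infty}^{\infty}
\left\{ \mathrm{Re}\left(\mathcal{L}^{*}_k\left(\nu + I\frac{j\pi}{T};\delta\right)\right)\cos\left(\frac{j\pi}{T}t\right)
- \mathrm{Im}\left(\mathcal{L}^{*}_k\left(\nu + I\frac{j\pi}{T};\delta\right)\right)\sin\left(\frac{j\pi}{T}t\right) \right\}.
\tag{2}
\]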
Finally, the non-null limiting distribution of the Vk statistic is derived by using Lemma 1.
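Lemma 1 When the period is L = 2T, $a_n$ and $b_n$ in the Fourier series approximation are respectively as follows:
\[
a_n \approx 2\,\mathrm{Re}\left(\mathcal{L}^{*}_k\left(\nu + \frac{n\pi I}{T};\delta\right)\right), \qquad
b_n \approx -2\,\mathrm{Im}\left(\mathcal{L}^{*}_k\left(\nu + \frac{n\pi I}{T};\delta\right)\right).
\]
Proof (Proof of Lemma 1) Firstly we note that
\[
\mathrm{Re}\left(\mathcal{L}^{*}_k(\nu + cI;\delta)\right) = \mathrm{Re}\left(\mathcal{L}^{*}_k(\nu - cI;\delta)\right), \tag{3}
\]
\[
\mathrm{Im}\left(\mathcal{L}^{*}_k(\nu + cI;\delta)\right) = -\mathrm{Im}\left(\mathcal{L}^{*}_k(\nu - cI;\delta)\right) \tag{4}
\]
for all $c \in \mathbb{R}$, δ ≥ 0, and assume a sufficiently large M < ∞. Suppose that the period is L = 2T. By using (3) and the orthogonality of the trigonometric functions, $a_n$ is given by
\begin{align*}
a_n &= \frac{2}{2T}\int_{-T}^{T}\sum_{j=-\infty}^{\infty}
\left\{ \mathrm{Re}\left(\mathcal{L}^{*}_k\left(\nu + I\frac{j\pi}{T};\delta\right)\right)\cos\left(\frac{j\pi}{T}t\right)
- \mathrm{Im}\left(\mathcal{L}^{*}_k\left(\nu + I\frac{j\pi}{T};\delta\right)\right)\sin\left(\frac{j\pi}{T}t\right) \right\}\cos\frac{2n\pi t}{2T}\,dt \\
&\approx \frac{1}{T}\int_{-T}^{T}\sum_{j=-M}^{M}
\left\{ \mathrm{Re}\left(\mathcal{L}^{*}_k\left(\nu + I\frac{j\pi}{T};\delta\right)\right)\cos\left(\frac{j\pi}{T}t\right)
- \mathrm{Im}\left(\mathcal{L}^{*}_k\left(\nu + I\frac{j\pi}{T};\delta\right)\right)\sin\left(\frac{j\pi}{T}t\right) \right\}\cos\frac{n\pi t}{T}\,dt \\
&= \frac{1}{T}\int_{-T}^{T}\left\{ \mathrm{Re}\left(\mathcal{L}^{*}_k\left(\nu + \frac{n\pi I}{T};\delta\right)\right)\cos^{2}\frac{n\pi t}{T}
+ \mathrm{Re}\left(\mathcal{L}^{*}_k\left(\nu - \frac{n\pi I}{T};\delta\right)\right)\cos\frac{-n\pi t}{T}\cos\frac{n\pi t}{T} \right\} dt \\
&= \frac{2}{T}\int_{-T}^{T} \mathrm{Re}\left(\mathcal{L}^{*}_k\left(\nu + \frac{n\pi I}{T};\delta\right)\right)\cos^{2}\frac{n\pi t}{T}\,dt \\
&= 2\,\mathrm{Re}\left(\mathcal{L}^{*}_k\left(\nu + \frac{n\pi I}{T};\delta\right)\right).
\end{align*}
Similarly, by (4) and the orthogonality of the trigonometric functions, $b_n$ is as follows:
\begin{align*}
b_n &= \frac{2}{2T}\int_{-T}^{T}\sum_{j=-\infty}^{\infty}
\left\{ \mathrm{Re}\left(\mathcal{L}^{*}_k\left(\nu + I\frac{j\pi}{T};\delta\right)\right)\cos\left(\frac{j\pi}{T}t\right)
- \mathrm{Im}\left(\mathcal{L}^{*}_k\left(\nu + I\frac{j\pi}{T};\delta\right)\right)\sin\left(\frac{j\pi}{T}t\right) \right\}\sin\frac{2n\pi t}{2T}\,dt \\
&\approx \frac{1}{T}\int_{-T}^{T}\sum_{j=-M}^{M}
\left\{ \mathrm{Re}\left(\mathcal{L}^{*}_k\left(\nu + I\frac{j\pi}{T};\delta\right)\right)\cos\left(\frac{j\pi}{T}t\right)
- \mathrm{Im}\left(\mathcal{L}^{*}_k\left(\nu + I\frac{j\pi}{T};\delta\right)\right)\sin\left(\frac{j\pi}{T}t\right) \right\}\sin\frac{n\pi t}{T}\,dt \\
&= -\frac{1}{T}\int_{-T}^{T}\left\{ \mathrm{Im}\left(\mathcal{L}^{*}_k\left(\nu + \frac{n\pi I}{T};\delta\right)\right)\sin^{2}\frac{n\pi t}{T}
+ \mathrm{Im}\left(\mathcal{L}^{*}_k\left(\nu - \frac{n\pi I}{T};\delta\right)\right)\sin\frac{-n\pi t}{T}\sin\frac{n\pi t}{T} \right\} dt \\
&= -\frac{2}{T}\int_{-T}^{T} \mathrm{Im}\left(\mathcal{L}^{*}_k\left(\nu + \frac{n\pi I}{T};\delta\right)\right)\sin^{2}\frac{n\pi t}{T}\,dt \\
&= -2\,\mathrm{Im}\left(\mathcal{L}^{*}_k\left(\nu + \frac{n\pi I}{T};\delta\right)\right).
\end{align*}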
Remark 1 Theorem 1 includes the null limiting distribution as the special case δ = 0.
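The approximation in Theorem 1 is straightforward to program. The sketch below evaluates the Laplace transform along the line ν + nπI/T, sums the M terms, and integrates the resulting density with the trapezoidal rule to obtain a CDF. The function names, the integration grid, and the use of NumPy's principal branches for the square root and the power are assumptions made here; only odd k (integer exponents, as in the paper's tables) is exercised.

```python
import numpy as np

def laplace_transform(s, k, delta):
    """L*_k(s; delta) evaluated with NumPy's principal branches (assumption)."""
    s = np.asarray(s, dtype=complex)
    root = np.sqrt(1.0 - 8.0 * s)
    arg = 0.5 * np.pi * root
    expo = np.exp(0.5 * delta - np.pi * s * delta * np.tan(arg) / root)
    return expo * (2.0 * np.pi * s / np.cos(arg)) ** ((k - 1) / 2.0)

def density(t, k, delta=0.0, nu=0.01, T=50.0, M=1000):
    """Fourier series approximation of the limiting PDF of V_k (Theorem 1)."""
    t = np.asarray(t, dtype=float)[..., None]
    n = np.arange(1, M + 1)
    vals = laplace_transform(nu + 1j * n * np.pi / T, k, delta)
    series = np.sum(vals.real * np.cos(n * np.pi * t / T)
                    - vals.imag * np.sin(n * np.pi * t / T), axis=-1)
    lead = laplace_transform(nu, k, delta).real
    return np.exp(nu * t[..., 0]) / T * (0.5 * lead + series)

def cdf(b, k, delta=0.0, grid=2001, **kwargs):
    """P(V_k <= b) by trapezoidal integration of the approximate density."""
    t = np.linspace(0.0, b, grid)
    f = density(t, k, delta, **kwargs)
    h = t[1] - t[0]
    return h * (np.sum(f) - 0.5 * (f[0] + f[-1]))

null_cdf = cdf(4.093882, k=3)                        # tabulated 5% point, k = 3
power_08 = 1.0 - cdf(4.093882, k=3, delta=4.170450)  # delta reported for power 0.8
print(null_cdf, power_08)
```

Upper-tail probabilities for δ > 0, which are the powers studied in Sect. 3, follow as one minus the CDF at the null critical value; a root finder over δ could then be used to search for the non-central parameter that attains a target power.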
3 Power of the generalized Baumgartner statistic
In this section, we investigate the probability of the generalized Baumgartner statistic under the alternative hypothesis. We assume T = 50, M = 500 and ν = 0.01, a setting similar to that of Ha (2012). In addition, we investigate M = 1000 and 1500. Under the null hypothesis, i.e. δ = 0, we list the estimated critical values of the $V_k$ statistic in Table 1. For k = 3, Murakami et al. (2009) derived the limiting distribution of the generalized Baumgartner statistic as follows:
\[
\Psi_3 := P(V_3 \le b) = 2\sqrt{\frac{2\pi^{3}}{b^{3}}}\sum_{j=0}^{\infty}(-1)^{j}(2j+1)\exp\left(\frac{b}{8} - \frac{(2j+1)^{2}\pi^{2}}{2b}\right).
\]

Table 1 The estimated critical value of the V_k statistic under H0

k   M      Nominal level
           0.100        0.050        0.025        0.010
3   Ψ3     3.399335     4.093882     4.787376     5.703762
    500    3.399330     4.093876     4.787362     5.703702
    1000   3.399335     4.093882     4.787376     5.703762
    1500   3.399335     4.093882     4.787376     5.703762
5   500    6.020761     6.887265     7.723293     8.797510
    1000   6.020761     6.887265     7.723293     8.797510
    1500   6.020761     6.887265     7.723293     8.797510
7   500    8.482154     9.476057     10.419480    11.614688
    1000   8.482154     9.476057     10.419480    11.614688
    1500   8.482154     9.476057     10.419480    11.614688
9   500    10.865561    11.965714    12.999355    14.296979
    1000   10.865561    11.965714    12.999355    14.296979
    1500   10.865561    11.965714    12.999355    14.296979
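The series for Ψ3 converges very quickly and is easy to evaluate directly. The following sketch assumes the leading factor is 2·sqrt(2π³/b³) (the radical is a reconstruction from the flattened formula) and uses a hypothetical function name `psi3`. Evaluated at the tabulated 10% and 5% critical values for k = 3 it returns approximately 0.90 and 0.95, which is consistent with reading the series as the distribution function P(V3 ≤ b).

```python
import math

def psi3(b, terms=100):
    """Series for the limiting distribution of V_3, truncated after `terms` terms."""
    lead = 2.0 * math.sqrt(2.0 * math.pi ** 3 / b ** 3)
    total = 0.0
    for j in range(terms):
        odd = 2 * j + 1
        total += (-1) ** j * odd * math.exp(b / 8.0 - odd ** 2 * math.pi ** 2 / (2.0 * b))
    return lead * total

p90 = psi3(3.399335)   # tabulated 10% critical value for k = 3
p95 = psi3(4.093882)   # tabulated 5% critical value for k = 3
print(p90, p95)
```

Only the first few terms contribute at these values of b, so the truncation length has little practical effect here.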
However, since the expression involves an infinite summation, we approximate the limiting distribution of the $V_3$ statistic by using j = 0, . . . , 100,000. Additionally, we list the critical values obtained from the approximated Ψ3, which are more accurate than those of Murakami et al. (2009), because Murakami et al. (2009) used j = 0, . . . , 3, as did Baumgartner et al. (1998). Secondly, we list the non-central parameters for various powers at the 5% significance level for k = 3, 5, 7 and 9 in Table 2. For example, the value of the non-central parameter is δ = 4.170450 when $P(V_3 \ge 4.093882) = 0.8$. Based on Table 1, we assume T = 50, M = 1000 and ν = 0.01 for these calculations. In addition, we list the computation time required for each calculation.

Table 2 The non-central parameter of the V_k statistic for various powers (α = 0.05; computation time in parentheses)

Power   k = 3                k = 5                k = 7                k = 9
0.1     0.417146 (5.234 s)   0.585471 (5.234 s)   0.712661 (5.219 s)   0.819192 (5.219 s)
0.2     1.040196 (5.219 s)   1.414933 (5.219 s)   1.695895 (5.219 s)   1.930733 (5.188 s)
0.3     1.553606 (5.234 s)   2.079391 (5.219 s)   2.472119 (5.234 s)   2.799967 (5.219 s)
0.4     2.026686 (5.234 s)   2.683696 (5.219 s)   3.173085 (5.250 s)   3.581178 (5.203 s)
0.5     2.492972 (5.250 s)   3.274844 (5.219 s)   3.855841 (5.203 s)   4.339803 (5.203 s)
0.6     2.979184 (5.219 s)   3.888312 (5.219 s)   4.562330 (5.203 s)   5.123146 (5.266 s)
0.7     3.518317 (5.234 s)   4.566431 (5.172 s)   5.341685 (5.250 s)   5.985936 (5.203 s)
0.8     4.170450 (5.219 s)   5.385073 (5.234 s)   6.281194 (5.219 s)   7.024804 (5.188 s)
0.9     5.106115 (5.203 s)   6.558547 (5.219 s)   7.626697 (5.203 s)   8.511310 (5.203 s)

It is important to know how to express the non-central parameter δ in terms of the parameters of the underlying distributions. We therefore derive the non-central parameter of the $V_k$ statistic. Note that the non-central parameter depends on the ranks of each sample. The non-central parameter is given by the following lemma.
Lemma 2 The non-central parameter of the $V_k$ statistic is as follows:
\begin{align*}
\delta &= E(V_k) - k + 1 \\
&= \sum_{p=1}^{k} C_1(p)\Biggl\{ \sum_{q=1}^{n_p} E\left[\frac{R_{pq}^{2}}{q}\right]
- 2\,\frac{N+1}{n_p+1}\sum_{q=1}^{n_p} E\left[R_{pq}\right]
+ \sum_{q=1}^{n_p} E\left[\frac{R_{pq}^{2}}{n_p+1-q}\right] \\
&\qquad\qquad - 2\,\frac{N+1}{n_p+1}\sum_{q=1}^{n_p} E\left[\frac{q R_{pq}}{n_p+1-q}\right]
+ (N+1)^{2}\left(\mathrm{HN}[n_p] - \frac{n_p}{n_p+1}\right)\Biggr\} - k + 1,
\end{align*}
where
\begin{align*}
\sum_{q=1}^{n_p} E\left[R_{pq}\right] &= \sum_{s\neq p}^{k} n_s n_p \int F_s\,dF_p + \frac{1}{2}n_p(n_p+1), \\
\sum_{q=1}^{n_p} E\left[\frac{R_{pq}}{n_p+1-q}\right] &= \sum_{s\neq p}^{k} n_s \int \frac{F_s}{1-F_p}\left(1-F_p^{\,n_p}\right)dF_p + (n_p+1)\mathrm{HN}[n_p] - n_p, \\
\sum_{q=1}^{n_p} E\left[\frac{R_{pq}^{2}}{q}\right] &= \frac{1}{2}n_p(n_p+1) + 2\sum_{s\neq p}^{k} n_s n_p \int F_s\,dF_p
+ \sum_{s\neq p}^{k} n_s \int \frac{F_s}{F_p}\left\{1-(1-F_p)^{n_p}\right\}dF_p \\
&\quad + \sum_{s\neq p}^{k} n_s(n_s-1) \int \frac{F_s^{2}}{F_p}\left\{1-(1-F_p)^{n_p}\right\}dF_p
+ \sum_{s\neq p}^{k}\sum_{w\neq s}^{k} n_s n_w \int \frac{F_s F_w}{F_p}\left\{1-(1-F_p)^{n_p}\right\}dF_p, \\
\sum_{q=1}^{n_p} E\left[\frac{R_{pq}^{2}}{n_p+1-q}\right] &= (n_p+1)^{2}\mathrm{HN}[n_p] - \frac{3}{2}n_p(n_p+1)
+ 2\sum_{s\neq p}^{k} n_s n_p \int \frac{F_s F_p}{1-F_p}\left(1-F_p^{\,n_p-1}\right)dF_p \\
&\quad + 3\sum_{s\neq p}^{k} n_s \int \frac{F_s}{1-F_p}\left(1-F_p^{\,n_p}\right)dF_p
+ \sum_{s\neq p}^{k} n_s(n_s-1) \int \frac{F_s^{2}}{1-F_p}\left(1-F_p^{\,n_p}\right)dF_p \\
&\quad + \sum_{s\neq p}^{k}\sum_{w\neq s}^{k} n_s n_w \int \frac{F_s F_w}{1-F_p}\left(1-F_p^{\,n_p}\right)dF_p,
\end{align*}
\[
C_1(p) = \frac{k-1}{k}\cdot\frac{(n_p+1)(n_p+2)}{n_p(N-n_p)(N+1)}, \qquad
\mathrm{HN}[n_p] = \sum_{q=1}^{n_p}\frac{1}{q}.
\]
Here $\sum_{s\neq p}$ runs over the samples other than the pth, and $\sum_{w\neq s}$ over the samples other than the pth and the sth. Since $q/(n_p+1-q) = (n_p+1)/(n_p+1-q) - 1$, the expectation involving $qR_{pq}$ follows from the first two expectations above.
Proof (Proof of Lemma 2) At first, we rewrite the Vk statistic as follows: 2 N +1 np k 2 R − q pq (n p + 1) (n p + 2) n p +1 k−1 Vk = k n (N − n p )(N + 1) q(n p + 1 − q) p=1 p q=1 ⎧ 2 N +1 np ⎪ k ⎨ R − q pq (n p + 1)(n p + 2) n p +1 k−1 · = ⎪ k n p (N − n p )(N + 1) q p=1 q=1 ⎩ ⎫ 2 ⎪ ⎬ R pq − nNp+1 +1 q + np + 1 − q ⎪ ⎭ =
k
C1 ( p)
p=1
np
R 2pq
q=1 np
q
−2
np np R 2pq N +1 R pq + np + 1 np + 1 − q
N + 1 q R pq −2 np + 1 np + 1 − q q=1
where
123
q=1
q=1
2 + (N + 1) HN[n p ] −
np np + 1
,
The non-null limiting distribution of the generalized Baumgartner... np
1 (n p + 1)(n p + 2) k−1 · , HN[n p ] = . k n p (N − n p )(N + 1) q
C1 ( p) =
q=1
Herein, we derive a first moment of the Vk statistic under the alternative hypothesis. Let f (q) be the probability density function of qth order statistics. The rank can be expressed by an indicator function I (·), thus some moments of rank are as follows: np
E[R pq ]
q=1
=
np
⎡ E⎣
ns k
np
ns k q=1 s= p =1
= =
1 P(X s < X p(q) ) + n p (n p + 1) 2
n p k ns
∞
n p k ns
k
ns n p
s= p
=
k
Fs
np E
Fs
x
ns n p
R pq np + 1 − q =
np q=1
np np − 1
q−1
Fp
1 (1 − F p )n p −q d F p + n p (n p + 1) 2
⎡ ⎤ ns k 1 E⎣ I (X s < X p(q) ) + q ⎦ np + 1 − q
ns k q=1 s= p =1
=
q −1
1 Fs d F p + n p (n p + 1), 2
np
=
1 f s (y) f (q) (x)d yd x + n p (n p + 1) 2
n p! 1 q−1 F p (1 − F p )n p −q d F p + n p (n p + 1) (q − 1)!(n p − q)! 2
q=1
s= p
q=1
q=1 s= p =1 −∞ −∞
q=1 s= p =1
=
I (X s < X p(q) ) + q ⎦
s= p =1
q=1
=
⎤
s= p =1
P(X s < X p(q) ) + (n p + 1)HN[n p ] − n p np + 1 − q
n p k ns q=1 s= p =1
Fs
n p! q−1 F p (1 − F p )n p −q d F p (q − 1)!(n p − q + 1)!
+ (n p + 1)HN[n p ] − n p
123
R. Miyazaki, H. Murakami
=
k
ns
s= p
np Fs n p q−1 F p (1 − F p )n p −q+1 d F p 1 − Fp q −1 q=1
+ (n p + 1)HN[n p ] − n p
k Fs n = ns (1 − F p p )d F p + (n p + 1)HN[n p ] − n p , 1 − Fp s= p
np
E
2 R pq q
q=1
⎡⎛ ⎞2 ⎤ np k ns 1 ⎢⎝ ⎥ E⎣ I (X s < X p(q) ) + q ⎠ ⎦ = q s= p =1
q=1
=
n p k ns 1 n p (n p + 1) + 2 P(X s < X p(q) ) 2 q=1 s= p =1 ⎡ ⎧ ⎫2 ⎤ np ns k ⎬ ⎨ ⎢1 ⎥ + E⎣ I (X s < X p(q) ) ⎦ ⎭ q⎩ s= p =1
q=1
1 = n p (n p + 1) + 2 ns n p 2 k
s= p
+
ns =h
+
np q=1
=
np k ns 1 E I (X s < X p(q) ) Fs d F p + q s= p
q=1
=1
I (X s < X p(q) )I (X sh < X p(q) ) ⎫⎤ ⎧ ns nw k ⎨ k ⎬ 1 E⎣ I (X s < X p(q) ) I (X wr < X p(q) ) ⎦ ⎭ ⎩ q ⎡
s= p w=s
1 n p (n p + 1) + 2 2
r =1
=1
k
np
Fs d F p +
ns n p
s= p
ns k P(X s < X p(q) ) q
q=1 s= p =1
np
+
ns k P(X s < X p(q) , X sh < X p(q) )
q ⎫⎤ ⎧ np ns nw k k ⎨ ⎬ 1 E⎣ I (X s < X p(q) , X wr < X p(q) ) ⎦ + ⎭ ⎩ q q=1 s= p =h
⎡
q=1
=
s= p w=s
1 n p (n p + 1) + 2 2
=1 r =1
k
ns n p
s= p
ns ∞ k np
+
123
q=1 s= p =h −∞
Fs2
Fs d F p +
k s= p
ns
Fs 1 − (1 − F p )n p d F p Fp
n p! q−1 F p (1 − F p )n p −q d F p q!(n p − q)!
The non-null limiting distribution of the generalized Baumgartner... np k ns nw k P(X s < X p(q) , X wr < X p(q) ) q
+
q=1 s= p w=s =1 r =1
1 n p (n p + 1) + 2 ns n p 2 k
=
Fs d F p +
s= p
+
+
n p k ns ∞
Fs2
q=1 s= p =h −∞ np k ns nw k
s= p
k
+
s= p
Fs Fw
q=1 s= p w=s =1 r =1
1 ns n p n p (n p + 1) + 2 2 k
n s (n s − 1)
s= p k k
+
ns nw
s= p w=s
np
E
q=1
=
Fs2 1 − (1− F p )n p d F p Fp
n p! q−1 F p (1 − F p )n p −q d F p q!(n p − q)!
Fs d F p +
s= p
+
n s (n s −1)
Fs 1 − (1 − F p )n p d F p Fp
ns
s= p np k ns nw k
k
k
Fs d F p +
s= p
=
Fs 1 − (1 − F p )n p d F p Fp
x
1 ∞ x f s (y) f w (z) f (q) (x)dzdyd x q −∞ −∞ −∞ q=1 s= p w=s =1 r =1
1 n p (n p + 1)+2 ns n p 2
+
ns
n p! q−1 F p (1 − F p )n p −q d F p q!(n p − q)!
k
=
k
Fs2 Fp
k
ns
s= p
Fs 1 − (1 − F p )n p d F p Fp
{1 − (1 − F p )n p }d F p
Fs Fw {1 − (1 − F p )n p }d F p , Fp
R 2pq np + 1 − q
np q=1
⎡⎛ ⎞2 ⎤ ns k 1 ⎥ ⎢ I (X s < X p(q) ) + q ⎠ ⎦ E ⎣⎝ np + 1 − q s= p =1
3 = (n p + 1)2 HN[n p ] − n p (n p + 1) + 2 ns n p 2 k
s= p
+2
k s= p
np
+
q=1
ns ⎡
⎢ E⎣
Fs F p n p −1 1 − Fp d Fp 1 − Fp
Fs np 1 − Fp d Fp 1 − Fp ⎧ ns k ⎨
1 np + 1 − q ⎩
s= p =1
⎫2 ⎤ ⎬ ⎥ I (X s < X p(q) ) ⎦ ⎭
123
R. Miyazaki, H. Murakami
3 = (n p + 1)2 HN[n p ] − n p (n p + 1) + 2 ns n p 2 k
s= p
+2
k
ns
s= p np + E
+
q=1 ns =h
+
np q=1
Fs np 1 − Fp d Fp 1 − Fp
ns k 1 I (X s < X p(q) ) np + 1 − q s= p
=1
I (X s < X p(q) )I (X sh < X p(q) ) ⎫⎤ ⎧ ns nw k k ⎨ ⎬ 1 E⎣ I (X s < X p(q) ) I (X wr < X p(q) ) ⎦ ⎭ ⎩ np + 1 − q ⎡
s= p w=s
k
ns
s= p np k
r =1
=1
3 = (n p + 1)2 HN[n p ] − n p (n p + 1) + 2 2 +3
Fs F p n p −1 1 − Fp d Fp 1 − Fp
k
ns n p
s= p
Fs F p n p −1 1 − Fp d Fp 1 − Fp
Fs np 1 − Fp d Fp 1 − Fp
ns P(X s < X p(q) , X sh < X p(q) ) np + 1 − q q=1 s= p =h ⎫⎤ ⎧ ⎡ np ns nw k k ⎨ ⎬ 1 E⎣ I (X s < X p(q) , X wr < X p(q) ) ⎦ + ⎭ ⎩ np + 1 − q
+
s= p w=s
q=1
=1 r =1
3 = (n p + 1)2 HN[n p ] − n p (n p + 1) + 2 ns n p 2 k
s= p
+3
k
ns
s= p
+
+
Fs np 1 − Fp d Fp 1 − Fp
n p k ns
Fs2
n p! q−1 F p (1 − F p )n p −q d F p (q − 1)!(n p − q + 1)!
q=1 s= p =h np k ns nw k
q=1 s= p w=s =1 r =1
P(X s < X p(q) , X wr < X p(q) ) np − q + 1
3 = (n p + 1)2 HN[n p ] − n p (n p + 1) + 2 ns n p 2 k
s= p
+3
k s= p
123
ns
Fs F p n p −1 1 − Fp d Fp 1 − Fp
Fs F p n p −1 1 − Fp d Fp 1 − Fp
k Fs2 Fs np np n s (n s − 1) 1 − Fp d Fp + 1 − Fp d Fp 1 − Fp 1 − Fp s= p
The non-null limiting distribution of the generalized Baumgartner... np k ns nw k
+
q=1 s= p w=s =1 r =1
Fs Fw
n p! q−1 F p (1 − F p )n p −q d F p (q − 1)!(n p − q + 1)!
3 = (n p + 1)2 HN[n p ] − n p (n p + 1) + 2 ns n p 2 k
s= p
+3
k
ns
s= p k k
+
Fs F p n p −1 1 − Fp d Fp 1 − Fp
k Fs2 Fs np np (1 − F p )d F p + n s (n s − 1) 1 − Fp d Fp 1 − Fp 1 − Fp s= p
ns nw
s= p w=s
Fs Fw np 1 − Fp d Fp . 1 − Fp
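The algebraic rewriting of V_k used at the start of the proof of Lemma 2 (the decomposition with C_1(p) and the harmonic number HN[n_p]) can be checked numerically on an arbitrary rank configuration. The sketch below is an illustration only; the function names and the randomly generated ranks are assumptions, and the check covers the deterministic identity, not the expectations.

```python
import numpy as np

def vk_direct(rank_lists):
    """V_k from its definition; rank_lists holds the sorted ranks of each sample."""
    k = len(rank_lists)
    N = sum(len(r) for r in rank_lists)
    total = 0.0
    for r in rank_lists:
        n = len(r)
        q = np.arange(1, n + 1)
        num = (r - (N + 1) * q / (n + 1)) ** 2
        den = (q / (n + 1)) * (1 - q / (n + 1)) * (N - n) * (N + 1) / (n + 2)
        total += np.sum(num / den) / n
    return (k - 1) / k * total

def vk_decomposed(rank_lists):
    """V_k via the C_1(p) decomposition used in the proof of Lemma 2."""
    k = len(rank_lists)
    N = sum(len(r) for r in rank_lists)
    total = 0.0
    for r in rank_lists:
        n = len(r)
        q = np.arange(1, n + 1)
        c = (N + 1) / (n + 1)
        harmonic = np.sum(1.0 / q)                      # HN[n_p]
        c1 = (k - 1) / k * (n + 1) * (n + 2) / (n * (N - n) * (N + 1))
        inner = (np.sum(r ** 2 / q) - 2 * c * np.sum(r)
                 + np.sum(r ** 2 / (n + 1 - q)) - 2 * c * np.sum(q * r / (n + 1 - q))
                 + (N + 1) ** 2 * (harmonic - n / (n + 1)))
        total += c1 * inner
    return (k - 1 + 1 - k) + total  # the added term is zero; kept to mirror V_k itself

rng = np.random.default_rng(0)
sizes = [4, 6, 5]
perm = rng.permutation(np.arange(1, sum(sizes) + 1)).astype(float)
splits = np.split(perm, np.cumsum(sizes)[:-1])
ranks = [np.sort(part) for part in splits]
d1, d2 = vk_direct(ranks), vk_decomposed(ranks)
print(d1, d2)
```

Agreement of the two values supports the constant term (N+1)²(HN[n_p] − n_p/(n_p+1)), which collects the sums of q and q²/(n_p+1−q).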
4 Conclusion and discussion
In this paper, we derived the non-null limiting distribution of the generalized Baumgartner statistic by using the Fourier series approximation. The infinite product part involving the non-central parameter in the characteristic function was re-formulated by using a formula for trigonometric functions. More precisely, the part involving the non-central parameter is represented by a tangent function. In addition, we derived the non-central parameter of the generalized Baumgartner statistic based on the first moment of the statistic under the alternative hypothesis. Furthermore, we used the non-central parameter to calculate the asymptotic power of the generalized Baumgartner statistic for various cases. In this paper, we approximated the infinite summation by a sufficiently large finite summation, and we interchanged the integral and the infinite summation. As future work, we have to confirm the uniform convergence of the series in order to obtain a more accurate approximation for evaluating the density and distribution functions.
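Appendix
We show that the characteristic function reduces to Eq. (1). First, we introduce a well-known formula for the partial fraction expansion of the tangent function:
\[
\sum_{k=1}^{\infty}\frac{1}{x^{2}-(2k-1)^{2}} = -\frac{\pi}{4x}\tan\frac{\pi x}{2}. \tag{5}
\]
The characteristic function is rewritten as
\[
\prod_{j=1}^{\infty}\exp\left(\frac{\frac{\delta Iu}{j(j+1)}}{1-\frac{2Iu}{j(j+1)}}\right)\left(1-\frac{2Iu}{j(j+1)}\right)^{-\frac{k-1}{2}}
= \exp\left(\sum_{j=1}^{\infty}\frac{\delta Iu}{j(j+1)-2Iu}\right)\left(\prod_{j=1}^{\infty}\left(1-\frac{2Iu}{j(j+1)}\right)\right)^{-\frac{k-1}{2}}.
\]
Herein we focus on the first factor. Then we have
\begin{align*}
\sum_{j=1}^{\infty}\frac{\delta Iu}{j(j+1)-2Iu}
&= \delta Iu\sum_{j=2}^{\infty}\frac{1}{j(j-1)-2Iu} \\
&= 4\delta Iu\sum_{j=2}^{\infty}\frac{1}{4j^{2}-4j-8Iu+1-1} \\
&= -4\delta Iu\sum_{j=2}^{\infty}\frac{1}{1+8Iu-(2j-1)^{2}} \\
&= -4\delta Iu\left[\sum_{j=2}^{\infty}\frac{1}{1+8Iu-(2j-1)^{2}} + \frac{1}{8Iu} - \frac{1}{8Iu}\right] \\
&= -4\delta Iu\left[\sum_{j=1}^{\infty}\frac{1}{1+8Iu-(2j-1)^{2}} - \frac{1}{8Iu}\right]. \tag{6}
\end{align*}
By applying (5) to (6), we obtain
\[
-4\delta Iu\left[-\frac{\pi}{4\sqrt{1+8Iu}}\tan\left(\frac{\pi}{2}\sqrt{1+8Iu}\right) - \frac{1}{8Iu}\right]
= \frac{\delta}{2} + \frac{\pi\delta Iu\tan\left(\frac{\pi}{2}\sqrt{1+8Iu}\right)}{\sqrt{1+8Iu}}.
\]
Therefore, the characteristic function coincides with (1).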
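The closed form obtained in the Appendix can be compared with a long partial sum of the original series. This is a numerical sanity check under stated assumptions: the truncation length `terms`, the test values of u and δ, and the function names are illustrative, and the partial sum converges only at rate 1/terms.

```python
import numpy as np

def exponent_sum(u, delta, terms=1_000_000):
    """Partial sum of sum_j delta*I*u / (j(j+1) - 2*I*u)."""
    j = np.arange(1, terms + 1, dtype=float)
    z = 1j * u
    return np.sum(delta * z / (j * (j + 1.0) - 2.0 * z))

def exponent_closed(u, delta):
    """delta/2 + pi*delta*I*u*tan(pi/2*sqrt(1+8Iu)) / sqrt(1+8Iu)."""
    z = 1j * u
    root = np.sqrt(1.0 + 8.0 * z)
    return 0.5 * delta + np.pi * delta * z * np.tan(0.5 * np.pi * root) / root

s_val = exponent_sum(0.7, 2.0)
c_val = exponent_closed(0.7, 2.0)
print(s_val, c_val)
```

Because tan(x)/x is an even function of x, the closed form does not depend on which branch of the square root is taken.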
References
Anderson TW, Darling DA (1952) Asymptotic theory of certain "goodness of fit" criteria based on stochastic processes. Ann Math Stat 23:193–212
Baumgartner W, Weiß P, Schindler H (1998) A nonparametric test for the general two-sample problem. Biometrics 54:1129–1135
Bodenham DA, Adams NM (2016) A comparison of efficient approximations for a weighted sum of chi-squared random variables. Stat Comput 26:917–928
Castaño-Martínez A, López-Blázquez F (2005) Distribution of a sum of weighted noncentral chi-square variables. TEST 14:397–415
Ding CG (1992) Algorithm AS 275: computing the non-central χ² distribution function. J R Stat Soc Ser C 41:478–482
Duchesne P, Lafaye De Micheaux P (2010) Computing the distribution of quadratic forms: further comparisons between the Liu-Tang-Zhang approximation and exact methods. Comput Stat Data Anal 54:858–862
Farebrother RW (1987) Algorithm AS231: the distribution of a noncentral χ² variable with nonnegative degrees of freedom. J R Stat Soc Ser C 17:402–405
Gabler S, Wolff C (1987) A quick and easy approximation to the distribution of a sum of weighted chi-square variables. Stat Hefte 28:317–325
Govindarajulu Z (2007) Nonparametric inference. World Scientific Publishing, New Jersey
Ha HT (2012) Fourier series approximation for the generalized Baumgartner statistic. Commun Korean Stat Soc 19:451–457
Ha HT, Provost SB (2013) An accurate approximation to the distribution of a linear combination of noncentral chi-square random variables. REVSTAT-Stat J 11:231–254
Hájek J, Sidák Z, Sen PK (1999) Theory of rank tests, 2nd edn. Academic Press, San Diego
Hirotsu C (1986) Cumulative chi-square statistic as a tool for testing goodness of fit. Biometrika 73:165–173
Kamps U (1990) Characterizations of the exponential distribution by weighted sums of iid random variables. Stat Pap 31:233–237
Mathai AM, Provost SB (1992) Quadratic forms in random variables: theory and applications, statistics: a series of textbooks and monographs (Book 126). CRC Press, Boca Raton
Murakami H (2006) A k-sample rank test based on modified Baumgartner statistic and its power comparison. J Japanese Soc Comput Stat 19:1–13
Murakami H, Kamakura T, Taniguchi M (2009) A saddlepoint approximation to the limiting distribution of a k-sample Baumgartner statistic. J Japanese Stat Soc 39:133–141
Nair VN (1986) On testing against ordered alternatives in analysis of variance models. Biometrika 73:493–499
Penev S, Raykov T (2000) A Wiener germ approximation of the noncentral chi square distribution and of its quantiles. Comput Stat 15:219–228
Pettitt AN (1976) A two-sample Anderson-Darling rank statistic. Biometrika 63:161–168
Potuschak H, Müller WG (2009) More on the distribution of the sum of uniform random variables. Stat Pap 50:177–183
Sadooghi-Alvandi SM, Nematollahi AR, Habibi R (2009) On the distribution of the sum of independent uniform random variables. Stat Pap 50:171–175
Solomon H, Stephens MA (1977) Distribution of a sum of weighted chi-square variables. J Am Stat Assoc 72:881–885
Tan WY, Tiku ML (1999) Sampling distributions in terms of Laguerre polynomials with applications. New Age International Publishers, New Delhi
Tanaka K (1996) Time series analysis: nonstationary and noninvertible distribution theory. Wiley, New York
Tank F, Eryilmaz S (2015) The distributions of sum, minima and maxima of generalized geometric random variables. Stat Pap 56:1191–1203