ON SOME LIMIT THEOREMS OF PROBABILITY DISTRIBUTIONS*
BY KINSAKU TAKANO
(Received June 1, 1954)

Summary
In Part I, Khintchine's uniqueness theorem for the class convergence of probability distributions is proved in a natural way by making use of inverses of distribution functions; its generalization to the multi-dimensional case is also proved; relations between different paired sequences of scaling constants and centering constants in limit problems of probability distributions are given; and a general method to determine scaling constants and centering constants is presented. In Part II, both an analytical derivation of P. Lévy's canonical form of the infinitely divisible multi-dimensional probability distribution and a necessary and sufficient condition for the distributions of sums of asymptotically uniformly negligible independent multi-dimensional random variables to converge to a given infinitely divisible probability distribution are given. The logarithms of non-vanishing characteristic functions are treated rigorously. In Part III, various versions of the multi-dimensional central limit theorem on sums of independent random variables are studied. The results in the last two parts are extensions of known facts in the one-dimensional case to the multi-dimensional case.
* Most of the results in Part I of this paper have been given in the writer's previous papers: On the convergence of classes of distributions, Ann. Inst. Statist. Math., Tokyo, 3, 7-15 (1951); A metrization of class-convergence of distributions, loc. cit., 5, 1-7 (1953); On the many-dimensional distribution functions, loc. cit., 5, 41-58 (1953).

Introduction

Let $G_1$ and $G_2$ be two $p$-variate distribution functions. If there are a positive number $a$ and a $p$-dimensional vector $b$ such that $G_1(x) = G_2(ax + b)$ for all $x \in R_p$, then $G_1$ and $G_2$ are said to belong to the same class. Let us denote by $K[G]$ the class containing a distribution function $G$. In the theory of probability it occurs very often that, for a given sequence of $p$-dimensional random variables $\{S_n\}$, there exist a sequence of positive numbers $\{a_n\}$ and a sequence of vectors $\{b_n\}$ such that the
sequence of the distributions of $S_n/a_n - b_n$ converges to some distribution. Let $F_n$ be the distribution function of $S_n$ and let $G$ be the distribution function of the limiting distribution. Then the distribution function of $S_n/a_n - b_n$ is given by $F_n(a_n x + a_n b_n)$ and we have
$$\lim_{n\to\infty} F_n(a_n x + a_n b_n) = G(x) \tag{1}$$
at every continuity point of $G(x)$. Under these circumstances, for any positive number $a$ and any $p$-dimensional vector $b$ it holds that
$$\lim_{n\to\infty} F_n(a_n a x + a_n a b + a_n b_n) = G(ax + ab)$$
at every point of continuity of $G(ax + ab)$. Thus, in limit problems of probability distributions, limit classes rather than limit distributions appear. When (1) holds the sequence of classes $\{K[F_n]\}$ will be said to converge to the class $K[G]$; $a_1, a_2, \ldots$ will be called scaling constants,* and $b_1, b_2, \ldots$ will be called centering constants or centering vectors. A limit problem of probability distributions is always a limit problem of classes.

* $a_1, a_2, \ldots$ are called normalizing factors (Normierungsfaktoren) by W. Feller [6].

In limit problems of probability distributions we are interested in (i) the uniqueness of the limit class of a convergent sequence of classes, (ii) relations between different paired sequences of scaling constants and centering constants, and (iii) a general method to determine sequences of scaling constants and centering constants. (i) was first proved by A. Khintchine [14] in the one-dimensional case, (ii) is known in the one-dimensional case, and (iii) has been treated under more or less restrictive conditions. In Part I of this paper a simple natural proof of Khintchine's uniqueness theorem is given by making use of inverses of distribution functions, a unified treatment of (ii) is given, (iii) is investigated without restrictive conditions, and furthermore the extensions of these results to the multi-dimensional case are also presented.

In the study of limit distributions of sums of independent random variables, it is natural to impose the condition of asymptotic uniform negligibility, and under this condition the limit distributions are proved to be infinitely divisible. Hence it is useful to investigate the conditions for the sequence of distributions of normalized sums of independent random variables to converge to a given infinitely divisible distribution. Although considerable attention has been paid to this problem in the one-dimensional case, little is known in the multi-dimensional case
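except for P. Lévy's canonical form of the infinitely divisible distribution. According to P. Lévy [16], the logarithm of the characteristic function $\varphi$ of an infinitely divisible $p$-dimensional probability distribution is given by
$$\log \varphi(t) = i a' t - \tfrac{1}{2}\, t' \sigma t + \int \Bigl( e^{i t' x} - 1 - \frac{i t' x}{1 + x' x} \Bigr)\, d\nu \tag{2}$$
where $a$ is a $p$-dimensional vector, $\sigma$ is a non-negative definite matrix of $p$th order, and $\nu$ is a $p$-dimensional measure with
$$\int_{|x| > 1} d\nu + \int_{|x| \le 1} |x|^2\, d\nu < \infty.$$
In (2), $t$, $a$, and $x$ denote column vectors, i.e.,
$$t = \begin{pmatrix} t_1 \\ \vdots \\ t_p \end{pmatrix}, \qquad a = \begin{pmatrix} a_1 \\ \vdots \\ a_p \end{pmatrix}, \qquad x = \begin{pmatrix} x_1 \\ \vdots \\ x_p \end{pmatrix},$$
and $t'$, $a'$, $x'$ their transposes,
$$t' = (t_1, \ldots, t_p), \qquad a' = (a_1, \ldots, a_p), \qquad x' = (x_1, \ldots, x_p),$$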
the usual matrix notation is used, and $|x|$ denotes the Euclidean length of $x$. This canonical form was found from the point of view of the theory of additive processes. Part II of this paper gives an analytical derivation of (2) and a necessary and sufficient condition for the sequence of distributions of sums of asymptotically uniformly negligible independent $p$-dimensional random variables to converge to a given infinitely divisible distribution. Our method follows M. Loève [17]* in the one-dimensional case. Following A. Khintchine [13] let us put
$$\mu(E) = \int_E \frac{x' x}{1 + x' x}\, d\nu,$$
then (2) is rewritten as
$$\log \varphi(t) = i a' t - \tfrac{1}{2}\, t' \sigma t + \int \Bigl( e^{i t' x} - 1 - \frac{i t' x}{1 + x' x} \Bigr) \frac{1 + x' x}{x' x}\, d\mu \tag{3}$$
with a measure $\mu$ of finite total measure.
* Japanese readers may refer to the appendix of Y. Kawada [12], an exposition of M. Loève [17] by the present writer.
Analytically (3) is preferable to (2) as $\mu$ has finite total measure while $\nu$ may have infinite total measure. The point in which the multi-dimensional case differs from the one-dimensional case is that the integrand in the right side of (3) has no determined limit as $x$ tends to 0.

Part III of this paper investigates the various versions of the multi-dimensional central limit theorem, which states that the sum of asymptotically uniformly negligible independent random variables, under appropriate restrictions, is nearly normally distributed. The general convergence theorem in Part II and the knowledge on scaling constants and centering constants in Part I are applied to this problem. The multi-dimensional central limit theorem has been treated by H. Cramér [1], C. G. Esseen [5], and W. Hoeffding & H. Robbins [10] etc., but the literature which deals with the complete generalization of the well-known versions in the one-dimensional case seems to be scanty.* On various versions of the central limit theorem in the one-dimensional case, readers may refer to W. Feller [8] and M. Loève [18].

* After I wrote this paper I came to know of a paper of N. A. Sapogov [19] through the Mathematical Reviews, vol. 12, but I have not yet been able to see it.

Acknowledgement

The writer wishes to express his thanks to Professor Tatsuo Kawata and Professor Kiyonori Kunisawa for their interest and encouragement, which greatly aided the work. Thanks are also due to Professor Kameo Matusita for his invaluable help.

CONTENTS

Summary
Introduction
Acknowledgement

Part I  Class convergences of distributions
1. Distribution functions and their classes
2. Inverse functions of one-variate distribution functions
3. Uniqueness theorem for class convergences: the one-dimensional case
4. Scaling and centering constants: the one-dimensional case
5. Uniqueness theorem, and scaling and centering constants: the multi-dimensional case

Part II  Infinitely divisible distributions
6. Preliminaries
7. Continuous amplitudes of non-vanishing characteristic functions
8. Infinitely divisible distributions
9. Infinitely divisible distributions in the generalized sense
10. Convergence theorem

Part III  Central limit theorem
11. General case
12. Lindeberg's and Liapounov's conditions
13. Generalization of P. Lévy's theorem
14. Feller's criterion and the attraction domains of the normal distributions
15. Reduction to the one-dimensional case

Part I  Class convergences of distributions

1. Distribution functions and their classes
A measurable function $X$, defined on a probability space $\Omega$ and taking values in $R_p$, the $p$-dimensional Euclidean space, will be called a $p$-dimensional random variable $(p = 1, 2, \ldots)$. Let us write $X = (X_1, X_2, \ldots, X_p)$ where $X_j$ is the $j$th component of $X$, $j = 1, 2, \ldots, p$. Then
$$F(x) = F(x_1, x_2, \ldots, x_p) = \Pr\{X_j(\omega) \le x_j,\ j = 1, 2, \ldots, p\} \tag{1.1}$$
is defined for all $x = (x_1, x_2, \ldots, x_p)$ in $R_p$, where $\Pr\{\cdots\}$ means the probability of the $\omega$-set defined by the condition $\{\cdots\}$. The function $F$ defined by (1.1) is called the distribution function of the $p$-dimensional random variable $X = (X_1, X_2, \ldots, X_p)$. In case $p = 1$, $F$ is monotone non-decreasing, continuous to the right and
$$\lim_{x \to -\infty} F(x) = 0, \qquad \lim_{x \to +\infty} F(x) = 1.$$
Any function $F$ satisfying all these conditions will be called a one-variate distribution function. In case $p > 1$, the function $F$ defined by (1.1) is monotone non-decreasing and continuous to the right in each variable and
$$\lim_{x_j \to -\infty} F(x_1, \ldots, x_p) = 0, \qquad j = 1, 2, \ldots, p,$$
$$\lim_{x_1, \ldots, x_p \to +\infty} F(x_1, \ldots, x_p) = 1,$$
$$[F]_x^y = [F]_{x_1, \ldots, x_p}^{y_1, \ldots, y_p} \ge 0 \quad \text{for } x_j \le y_j,\ j = 1, 2, \ldots, p.$$
Here $[F]_x^y$ is defined by
$$[F]_x^y = \sum_{i=1}^{2^p} (-1)^{n(v_i)} F(v_i), \tag{1.2}$$
where $v_i = (v_{i1}, v_{i2}, \ldots, v_{ip})$ $(i = 1, 2, \ldots, 2^p)$, each $v_{ij}$ is either $x_j$ or $y_j$, and $n(v_i)$ denotes the number of lower symbols $x_j$ among the coordinates of $v_i$. $[F]_x^y$ is the $p$th difference of $F$ and the evaluation in terms of $F$ of $\Pr\{x_j < X_j \le y_j,\ j = 1, 2, \ldots, p\}$. A distribution function $F$ defines a $p$-dimensional measure $P$ by
$$P(A) = \int \cdots \int_A dF(x_1, \ldots, x_p),$$
and conversely $P$ defines $F$ by
$$F(x) = P(\{y;\ y_j \le x_j,\ j = 1, 2, \ldots, p\})$$
where $x = (x_1, \ldots, x_p)$, $y = (y_1, \ldots, y_p)$ and $\{y; C\}$ denotes the set of $y$ satisfying the condition $C$. Let $P_1, P_2, \ldots$ be $p$-dimensional measures with distribution functions $F_1, F_2, \ldots$, and let $P$ be another $p$-dimensional measure with distribution function $F$. Then $\lim P_n(A) = P(A)$ for every set of continuity of $P$, if and only if $\lim F_n(x) = F(x)$ at every point of continuity of $F$.
When these equivalent conditions hold the sequence $\{P_n\}$ is said to converge to $P$, $\{F_n\}$ is said to converge to $F$, and these are written as $\lim P_n = P$ and $\lim F_n = F$, respectively. The definition of distance between two $p$-variate distribution functions $G_1$ and $G_2$ that matches this convergence is the following:
$$d(G_1, G_2) = \min\{\varepsilon;\ G_1(x - \varepsilon e) - \varepsilon \le G_2(x) \le G_1(x + \varepsilon e) + \varepsilon \text{ for all } x \in R_p\} \tag{1.3}$$
where $e = (1, 1, \ldots, 1) \in R_p$ and $\min\{\varepsilon; C\}$ denotes the least $\varepsilon$ belonging to the set $\{\varepsilon; C\}$. Under this definition the space of all $p$-dimensional distribution functions is a complete metric space, and $\lim F_n(x) = F(x)$ at all points of continuity of $F$ if and only if $\lim d(F_n, F) = 0$ (see [20]).
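The distance (1.3) can be approximated numerically in the one-dimensional case. The following is a minimal sketch only, not part of the paper: the function names `levy_distance` and `step_at`, the finite grid, and the bisection tolerance are all assumptions made for illustration, and the result is accurate only up to the grid spacing.

```python
def step_at(c):
    """Distribution function of the unit distribution placed at the point c."""
    return lambda x: 1.0 if x >= c else 0.0


def levy_distance(G1, G2, lo=-5.0, hi=5.0, steps=2000, tol=1e-6):
    """Approximate d(G1, G2) of (1.3) for p = 1.

    The condition 'for all x' is only checked on a finite grid (an
    assumption of this sketch), so the value is a grid approximation.
    """
    xs = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    g2 = [G2(x) for x in xs]

    def admissible(eps):
        # G1(x - eps) - eps <= G2(x) <= G1(x + eps) + eps on the grid
        return all(G1(x - eps) - eps <= v <= G1(x + eps) + eps
                   for x, v in zip(xs, g2))

    # eps = 1 is always admissible, and admissibility is monotone in eps,
    # so the least admissible eps can be located by bisection.
    low, high = 0.0, 1.0
    while high - low > tol:
        mid = 0.5 * (low + high)
        if admissible(mid):
            high = mid
        else:
            low = mid
    return high


print(levy_distance(step_at(0.0), step_at(0.3)))
```

For two unit distributions placed at 0 and at $c$ with $0 < c < 1$, the definition (1.3) gives the distance $c$, which the grid approximation reproduces up to the grid spacing.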
If a sequence of distribution functions $\{F_n\}$ converges to a distribution function $F$, then we have $\lim F_n(x - 0) = F(x)$ at every point of continuity of $F$. We shall prove this in the one-dimensional case because of the simplicity of writing. Let $x_0$ be a point of continuity of $F$; then for any given $\varepsilon > 0$, there exists a $\delta > 0$ such that
$$F(x_0) - \varepsilon \le F(x_0 - \delta - 0) \le F(x_0 + \delta - 0) \le F(x_0) + \varepsilon. \tag{1.4}$$
For this $\delta$ there exists an integer $N$ such that $d(F, F_n) \le \delta$ for all $n \ge N$, that is,
$$F(x - \delta) - \delta \le F_n(x) \le F(x + \delta) + \delta, \qquad -\infty < x < \infty,\ n \ge N,$$
hence
$$F(x - \delta - 0) - \delta \le F_n(x - 0) \le F(x + \delta - 0) + \delta, \qquad -\infty < x < \infty,\ n \ge N. \tag{1.5}$$
From (1.4) and (1.5) it follows that
$$F(x_0) - \varepsilon - \delta \le F_n(x_0 - 0) \le F(x_0) + \varepsilon + \delta, \qquad n \ge N.$$
Since $\varepsilon + \delta$ can be chosen arbitrarily small, we have $\lim F_n(x_0 - 0) = F(x_0)$, which completes the proof.

Let $F$ and $G$ be two $p$-dimensional distribution functions. If there exist a positive number $a$ and a $p$-dimensional vector $b$ such that
$$F(ax + b) = G(x) \quad \text{for all } x \in R_p,$$
then we write $F \sim G$. This relation $\sim$ is an equivalence relation: $F \sim F$; if $F \sim G$ then $G \sim F$; if $F \sim G$ and $G \sim H$ then $F \sim H$. Therefore all $p$-variate distribution functions are classified by letting $F$ and $G$ belong to the same class if and only if $F \sim G$. Throughout this paper classes of $p$-variate distribution functions shall be interpreted in this meaning. Two distributions in $R_p$ are said to belong to the same class if and only if the corresponding distribution functions do so. A class of distribution functions and the corresponding class of distributions will be identified if no confusion occurs. A sequence of classes $\{K_n\}$ will be said to converge to a class $K$ if a sequence of distribution functions $\{F_n\}$, each $F_n$ being chosen adequately from $K_n$ for each $n$, converges to some distribution function $F$ belonging to $K$. A $p$-dimensional distribution will be called a unit distribution if the whole probability 1 is placed in a fixed point in $R_p$, otherwise a non-unit distribution. When $p = 1$, 'unit' or 'non-unit' may
be replaced by 'improper' or 'proper'.* A distribution function will be called unit or non-unit, according as the corresponding distribution is unit or non-unit. All unit distribution functions form a class which will be called the unit class. Other classes will be called non-unit classes. Throughout this paper, whenever more than one random variable is involved in a discussion, it will always be assumed, unless the contrary is explicitly stated, that the random variables are all defined on the same probability space $\Omega$.
2. Inverse functions of one-variate distribution functions
Throughout sections 2-4 it will always be assumed that any distribution function is one-variate. Let $F$ be a distribution function and define $f$ by
$$f(y) = \max\{x;\ F(x - 0) \le y\}, \qquad 0 < y < 1. \tag{2.1}$$
Then $f$ is a finite-valued function defined on the open interval $(0, 1)$, $f(y)$ is non-decreasing with $y$, and $f$ is continuous to the right: $f(y) = f(y + 0)$, $0 < y < 1$. To see that the last equality holds it is sufficient to show that $f(y) \ge f(y + 0)$. Now for any positive number $\varepsilon$ such that $y < y + \varepsilon < 1$, we have, by the definition of $f(y + \varepsilon)$, $F(f(y + \varepsilon) - 0) \le y + \varepsilon$, hence, $F(f(y + 0) - 0) \le y + \varepsilon$. Letting $\varepsilon \to 0$ we have $F(f(y + 0) - 0) \le y$, hence, $f(y + 0) \le f(y)$. The function $f$ defined by (2.1) is called the inverse function of the distribution function $F$.

LEMMA 2.1
(2.2) $x \le f(y)$ if and only if $F(x - 0) \le y$,
(2.3) $x \ge f(y - 0)$ if and only if $F(x) \ge y$,
(2.4) $x < f(y - 0)$ if and only if $F(x) < y$,
(2.5) $x > f(y)$ if and only if $F(x - 0) > y$,
(2.6) $f(y - 0) \le x \le f(y)$ if and only if $F(x - 0) \le y \le F(x)$.
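The definition (2.1) and the equivalences (2.2) and (2.3) can be checked on a small discrete distribution. This is a minimal sketch under stated assumptions: the three-point distribution, the test grids `xs` and `ys`, and all function names are hypothetical and chosen only for illustration; exact rational arithmetic is used so that the comparisons are not disturbed by rounding.

```python
from bisect import bisect_left, bisect_right
from fractions import Fraction
from itertools import accumulate

# Hypothetical example: atoms -1, 0, 2 with probabilities 1/4, 1/4, 1/2.
atoms = [Fraction(-1), Fraction(0), Fraction(2)]
probs = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
cums = list(accumulate(probs))  # values of F at the atoms: 1/4, 1/2, 1


def F(x):
    """F(x) = Pr{X <= x}."""
    return sum((p for a, p in zip(atoms, probs) if a <= x), Fraction(0))


def F_left(x):
    """F(x - 0) = Pr{X < x}."""
    return sum((p for a, p in zip(atoms, probs) if a < x), Fraction(0))


def f(y):
    """Inverse function (2.1): the first atom whose cumulative mass exceeds y."""
    return atoms[bisect_right(cums, y)]


def f_left(y):
    """Left limit f(y - 0): the first atom whose cumulative mass is >= y."""
    return atoms[bisect_left(cums, y)]


xs = [Fraction(k, 2) for k in range(-6, 9)]
ys = [Fraction(k, 8) for k in range(1, 8)]

# (2.1) by brute force on the grid, then the equivalences (2.2) and (2.3).
definition_ok = all(f(y) == max(x for x in xs if F_left(x) <= y) for y in ys)
lemma_2_2 = all((x <= f(y)) == (F_left(x) <= y) for x in xs for y in ys)
lemma_2_3 = all((x >= f_left(y)) == (F(x) >= y) for x in xs for y in ys)
print(definition_ok, lemma_2_2, lemma_2_3)
```

The two `bisect` calls differ only at the jump heights of $F$, which is where $f(y - 0)$ and $f(y)$ differ.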
* In the previous paper [20], I used the terms 'improper' and 'proper' also in the multi-dimensional case, in the same meaning as 'unit' and 'non-unit' defined above. But it seems better not to do so, for usually 'improper' or 'proper' is used in the same sense as 'singular' or 'non-singular'; that is, a multi-dimensional distribution is called improper or proper according as there exists, or does not exist, a hyperplane in which the whole probability 1 is placed.
PROOF: (2.2) and (2.5) are immediate consequences of the definition (2.1). To prove (2.3), assume that $x \ge f(y - 0)$. Then for any $\varepsilon > 0$ we have $x + \varepsilon > f(y - \varepsilon)$, hence, $F(x + \varepsilon - 0) > y - \varepsilon$ by (2.5). Let $\varepsilon \to 0$, then $F(x) \ge y$. Conversely $F(x) \ge y$ implies $x \ge f(y - 0)$. Thus (2.3) is proved. (2.4) is equivalent to (2.3). (2.6) follows from (2.2) and (2.3).

From (2.3) and (2.4) we have the following

THEOREM 2.1 A distribution function $F$ is uniquely determined by its inverse function $f$. More explicitly, it holds that
$$F(x) = \max\{y;\ f(y - 0) \le x\}, \qquad -\infty < x < \infty. \tag{2.7}$$
[...] $f(y - 0)$, then $F(x - 0) \ge y$. (2.13) if [...] To prove (2.10), assume that $y$ [...] hence $f(y)$ [...]
$$f(y) \ge a f_1(y) + ab$$
from which it follows that
$$f_1(y) \le f(y)/a - b. \tag{2.15}$$
(2.14) and (2.15) imply
$$f_1(y) = f(y)/a - b.$$

THEOREM 2.3 Let $f_n$ $(n = 1, 2, \ldots)$ and $f$ be the inverses of distribution functions $F_n$ and $F$. Then $\lim_{n\to\infty} f_n(y) = f(y)$ at every continuity point $y$ of $f$ if and only if $\lim_{n\to\infty} F_n(x) = F(x)$ at every continuity point $x$ of $F$.

The 'if' part is stated in Y. Kawada [12], p. 182, without proof, and its special case is proved in P. Lévy [16], §48.

PROOF: To prove the 'if' part, suppose that $\lim F_n(x) = F(x)$ at every continuity point $x$ of $F$ and let $y_0$ be a continuity point of $f$. Write $x_0 = f(y_0)$. As $f$ is continuous at the point $y_0$, for any given $\varepsilon > 0$, we have
$$x_0 - \varepsilon < f(y_0 - 0) = f(y_0 + 0) < x_0 + \varepsilon,$$
which implies, by Lemma 2.1,
$$F(x_0 - \varepsilon) < y_0 < F(x_0 + \varepsilon). \tag{2.16}$$
If $x_0 - \varepsilon$ and $x_0 + \varepsilon$ are both continuity points of $F$, we have from the assumption
$$\lim_{n\to\infty} F_n(x_0 + \varepsilon - 0) = F(x_0 + \varepsilon), \qquad \lim_{n\to\infty} F_n(x_0 - \varepsilon) = F(x_0 - \varepsilon). \tag{2.17}$$
From (2.16) and (2.17) it holds that for sufficiently large $N$,
$$F_n(x_0 - \varepsilon) < y_0 < F_n(x_0 + \varepsilon - 0), \qquad n \ge N,$$
which implies, from Lemma 2.1, that
$$x_0 - \varepsilon < f_n(y_0) < x_0 + \varepsilon, \qquad n \ge N.$$
Since $\varepsilon$ can be chosen arbitrarily small, we have $\lim f_n(y_0) = x_0 = f(y_0)$, which completes the proof of the 'if' part. The 'only if' part is proved in the same way.

THEOREM 2.4 If a distribution function $F$ is strictly increasing in an open interval $a < x < b$, then its inverse $f$ is continuous in the open interval $F(a - 0)$
F(a+O). Since for any ~<0, F(a--O)
3. Uniqueness theorem for class convergences: the one-dimensional case
Throughout this and the next sections we shall use the following notations: $F$ and $G$ with or without a subscript denote distribution functions; $f$ and $g$ denote their inverses; if $F$ and $G$ possess subscripts, the same subscripts will be used for their inverses $f$ and $g$; $U$ denotes the distribution function of the unit distribution which places the whole probability 1 in the origin; $a$ with or without a subscript denotes a positive number; $b$ with or without a subscript denotes a real number. If not otherwise stated, limits will be considered for $n \to \infty$. First we shall note the special role of the unit class.

THEOREM 3.1 Any sequence of classes converges to the unit class. More explicitly, for any sequence of distribution functions $\{F_n\}$ there exists a sequence $\{a_n\}$ such that $\lim F_n(a_n\,\cdot) = U$.

This was first proved by A. Khintchine [14] and the proof is easy. Now A. Khintchine's [14] uniqueness theorem for class convergences can be stated as follows.

THEOREM 3.2 Assume that $\lim F_n = F$, $\lim F_n(a_n(\cdot + b_n)) = G$, and that both $F$ and $G$ are non-unit. Then the limits
$$\lim a_n = a > 0, \qquad \lim b_n = b$$
exist and it holds that $G = F(a(\cdot + b))$. Hence, $F$ and $G$ belong to the same class.

PROOF: Let $f_n$, $f$, $g$ be the inverses of $F_n$, $F$, $G$, respectively. By Theorem 2.2 the inverse of $F_n(a_n(\cdot + b_n))$ is given by $f_n/a_n - b_n$. From the hypothesis and Theorem 2.3 it follows that
$$\lim f_n(y) = f(y), \tag{3.1}$$
$$\lim \{f_n(y)/a_n - b_n\} = g(y) \tag{3.2}$$
at every continuity point of $f$ and $g$, respectively. By the assumption,
$F$ and $G$ are non-unit, therefore neither $f$ nor $g$ is constant. Hence, we can choose $y_1$ and $y_2$ such that $1 > y_1 > y_2 > 0$, $f(y_1) > f(y_2)$, $g(y_1) > g(y_2)$, and that both $y_1$ and $y_2$ are common continuity points of $f$ and $g$, so that both (3.1) and (3.2) hold for both $y = y_1$ and $y = y_2$. By taking differences we have
$$\lim \{f_n(y_1) - f_n(y_2)\} = f(y_1) - f(y_2) > 0,$$
$$\lim \{f_n(y_1) - f_n(y_2)\}/a_n = g(y_1) - g(y_2) > 0.$$
By taking the ratio we have
$$\lim a_n = \frac{f(y_1) - f(y_2)}{g(y_1) - g(y_2)} = a \text{ (say)} > 0. \tag{3.3}$$
Forming (3.1)/(3.3) $-$ (3.2), we get $\lim b_n = f(y)/a - g(y) = b$ (say). This holds for every common continuity point $y$ of $f$ and $g$. Since $f$ and $g$ are both continuous to the right, we have $f(y)/a - g(y) = b$ for all $y$, $0 < y < 1$. Therefore we have $g = f/a - b$, and this together with Theorem 2.1 and Theorem 2.2 implies $G = F(a(\cdot + b))$.

THEOREM 3.3 Assume that $\lim F_n = F$. (It makes no difference whether $F$ is non-unit or unit.) Then
(i) if $\lim a_n = a > 0$, $\lim F_n(a_n\,\cdot) = F(a\,\cdot)$;
(ii) if $\lim a_n = +\infty$, $\lim F_n(a_n\,\cdot) = U$;
(iii) if $\lim b_n = b$, $\lim F_n(\cdot + b_n) = F(\cdot + b)$.
This is known (for instance, see H. Cramér [2], p. 254), and is easily proved, for instance, by making use of the inverses of distribution functions.

According to Theorems 3.1 and 3.2, if a sequence of distribution functions $\{F_n\}$ converges to a non-unit distribution function $F$, and if for some sequences $\{a_n\}$ and $\{b_n\}$ the sequence $\{F_n(a_n(\cdot + b_n))\}$ converges to a distribution function, then the limit distribution function must be $F(a(\cdot + b))$ or $U(\cdot + b)$ for some $a$ and $b$. With respect to these circumstances we have the following two theorems.

THEOREM 3.4 Assume that $\lim F_n = F$ and that $F$ is non-unit. Then
$$\lim F_n(a_n(\cdot + b_n)) = F(a(\cdot + b)), \tag{3.4}$$
if and only if
$$\lim a_n = a, \qquad \lim b_n = b. \tag{3.5}$$

This was first proved by B. Gnedenko [9].
PROOF: Since the 'if' part is an immediate consequence of Theorem 3.3, it is sufficient to prove the 'only if' part. Assume (3.4). According to Theorem 3.2 the limits $\lim a_n = a'$, $\lim b_n = b'$ exist and we have $F(a(\cdot + b)) = F(a'(\cdot + b'))$, from which it follows that
$$f(y)/a - b = f(y)/a' - b', \qquad 0 < y < 1.$$
Since $F$ is non-unit, $f$ can take at least two different values; therefore, it must hold that $a' = a$ and $b' = b$.

THEOREM 3.5 Assume that $\lim F_n = F$ and that $F$ is non-unit. Then $\lim F_n(a_n(\cdot + b_n)) = U(\cdot + b)$ if and only if
$$\lim a_n = +\infty, \qquad \lim b_n = b.$$

The 'if' part follows from Theorem 3.3; the 'only if' part is proved by taking the inverses of distribution functions.

The normal distribution with mean $m$ and variance $v$ is denoted by $N(m, v)$. It is convenient to denote by $N(m, 0)$ the unit distribution which has the whole probability 1 placed in the point $m$. As an application of Theorem 3.2 the following fact is proved.

If a sequence of normal distributions $N(m_n, v_n)$ tends to a distribution $L$, then the limits
$$\lim m_n = m, \qquad \lim v_n = v \tag{3.6}$$
exist and
$$L = N(m, v) \tag{3.7}$$
(K. Itô [11], p. 187).

PROOF: Let $G$ denote the distribution function of the normal distribution $N(0, 1)$; then the distribution function of $N(m_n, v_n)$ is given by $G((\cdot - m_n)/\sqrt{v_n})$. Let $F$ denote the distribution function of the limit distribution $L$. Moreover, let us write $G_n = G$, $n = 1, 2, \ldots$. Then we have
$$\lim G_n = G, \qquad \lim G_n((\cdot - m_n)/\sqrt{v_n}) = F.$$
If $F$ is non-unit, from Theorem 3.2 the limits (3.6) exist and we have $F = G((\cdot - m)/\sqrt{v})$, from which (3.7) follows. If $F$ is unit, by Theorem 3.5 there exist the limits (3.6) with $v = 0$ and we have $F = U(\cdot - m)$, from which (3.7) follows.
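The ratio argument in the proof of Theorem 3.2 can be illustrated with normal distributions like those just discussed. The sketch below is an illustration only: it uses the exact relation $g = f/a - b$ between the inverses of $F$ and $G = F(a(\cdot + b))$ rather than a limiting sequence, and the helper name `recover_constants`, the values $a = 2$, $b = 0.5$, and the levels $y_1 = 0.8$, $y_2 = 0.3$ are assumptions chosen for the example.

```python
from statistics import NormalDist


def recover_constants(f, g, y1, y2):
    """Recover a and b from two inverse functions related by g = f/a - b.

    Follows (3.3): a is the ratio of the differences of f and of g at two
    levels y1 > y2, and then b = f(y)/a - g(y) at any level y.
    """
    a = (f(y1) - f(y2)) / (g(y1) - g(y2))
    b = f(y1) / a - g(y1)
    return a, b


# F is the distribution function of N(0, 1).
F = NormalDist(mu=0.0, sigma=1.0)

# G(x) = F(a(x + b)) is the distribution function of X/a - b,
# i.e. of the normal law with mean -b and standard deviation 1/a.
a_true, b_true = 2.0, 0.5
G = NormalDist(mu=-b_true, sigma=1.0 / a_true)

a_hat, b_hat = recover_constants(F.inv_cdf, G.inv_cdf, 0.8, 0.3)
print(a_hat, b_hat)
```

Any two levels at which $f$ takes different values would serve equally well, which is the point of choosing $y_1$ and $y_2$ with $f(y_1) > f(y_2)$ in the proof.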
4. Scaling and centering constants: the one-dimensional case
Now we want to determine the sequences of scaling and centering constants. For this purpose, we shall discuss dispersions and centres of distributions, which will play important roles as scaling and centering constants, respectively. We shall begin with the dispersions of distributions. Let $F$ be a distribution function and let $\varphi$ be its characteristic function. Then the mean concentration function $\Psi_F$, introduced by K. Kunisawa [15], of $F$ is defined by
$$\Psi_F(l) = l \int_0^\infty e^{-lt}\, |\varphi(t)|^2\, dt, \qquad 0 < l < \infty. \tag{4.1}$$
It is easily shown that
$$\Psi_F(l) = \int_{-\infty}^{\infty} \frac{l^2}{l^2 + x^2}\, d\tilde{F}(x), \qquad 0 < l < \infty, \tag{4.2}$$
where $\tilde{F} = F * \{1 - F(-\,\cdot)\}$ is the symmetrization of $F$. From (4.2) it is seen that $\Psi_F$ is a non-decreasing and continuous function defined on the open interval $(0, \infty)$ and we have
$$0 < \Psi_F(l) \le 1, \qquad \Psi_F(\infty) = 1, \qquad \Psi_F(+0) = \tilde{F}(+0) - \tilde{F}(-0) = \sum_j p_j^2 = \Sigma_F \text{ (say)}, \tag{4.3}$$
where the $p_j$'s are the jumps of $F$ at all its points of discontinuity. Obviously $\Sigma_F = 1$ or $< 1$ according as $F$ is unit or non-unit. Let us put
$$\Psi_F(0) = \Psi_F(+0), \qquad \Psi_F(l) = 0 \text{ for } l < 0.$$
Then $\Psi_F$ is a distribution function. The inverse function $D_F$ of $\Psi_F$ is called the dispersion function of $F$. The value of $D_F$ at a point $\alpha$ will be called the $\alpha$-dispersion of $F$. We shall need the following properties of the dispersions.

LEMMA 4.1 For any distribution function $F$, its dispersion function $D_F$ is non-negative and continuous. If $F$ is unit,
$$D_F(\alpha) = 0, \qquad 0 < \alpha < 1. \tag{4.6}$$
If $F$ is non-unit,
$$D_F(\alpha) > 0 \text{ if and only if } \alpha > \Sigma_F, \tag{4.7}$$
where $\Sigma_F$ is defined by (4.3). Moreover $\Sigma_F$ is invariant when $F$ runs over the same class.

PROOF: If $F$ is unit, $\Psi_F(l) = 1$ for $l \ge 0$ and $= 0$ for $l < 0$, from which (4.6) follows. Next, assume that $F$ is non-unit. In this case $\Psi_F$ is strictly increasing in the open interval $(0, \infty)$; therefore, $D_F$ must be continuous in $(0, 1)$ by the Corollary to Theorem 2.4. (4.7) follows from (2.4). The last part of the lemma is clear.

LEMMA 4.2 $D_{F(a(\cdot + b))}(\alpha) = a^{-1} D_F(\alpha)$.

LEMMA 4.3 $D_{F_1 * F_2}(\alpha) \ge D_{F_j}(\alpha)$, $j = 1, 2$, where $F_1 * F_2$ denotes the convolution of $F_1$ and $F_2$.

LEMMA 4.4 If $\lim F_n = F$,
$$\lim D_{F_n}(\alpha) = D_F(\alpha), \qquad 0 < \alpha < 1. \tag{4.8}$$

These properties of the dispersions are deduced from the following corresponding properties of the mean concentration functions:
$$\Psi_{F(a(\cdot + b))}(l) = \Psi_F(al), \qquad \Psi_{F_1 * F_2}(l) \le \Psi_{F_j}(l),\ j = 1, 2, \qquad \lim \Psi_{F_n}(l) = \Psi_F(l),\ 0 < l < \infty.$$
Furthermore, we can prove that if $F_2$ is non-unit then $\Psi_{F_1 * F_2}(l) < \Psi_{F_1}(l)$ for all $0 < l < \infty$, and $D_{F_1 * F_2}(\alpha) > D_{F_1}(\alpha)$ for all $\alpha > \Sigma_{F_1 * F_2}$. Note that (4.8) holds for every point in the interval $0 < \alpha < 1$, as $D_F$ has no point of discontinuity. This is the reason why we use Kunisawa's dispersions instead of P. Lévy's dispersions, the inverses of maximal concentration functions.

We must now turn to centres of distributions. For any distribution function $F$, the real number $c$ defined by
$$\int_{-\infty}^{\infty} \arctan(x - c)\, dF(x) = 0 \tag{4.9}$$
will be called the centre of $F$ and will be denoted by $c = c(F)$. Any distribution function $F$ with $c(F) = 0$ is said to be centered. As is easily proved, centres have the following properties.

LEMMA 4.5 $c(F(\cdot + b)) = c(F) - b$.

LEMMA 4.6 If $\lim F_n = F$, $\lim c(F_n) = c(F)$.
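For a distribution with finitely many atoms, the $\alpha$-dispersion and the centre can be computed directly from (4.2) and (4.9), which gives a quick numerical check of Lemmas 4.2 and 4.5. The sketch below is illustrative only: the function names, the bisection routine, and the example atoms, probabilities, and constants are assumptions, and the atoms are assumed distinct with positive probabilities.

```python
import math


def mean_concentration(atoms, probs, l):
    """Psi_F(l) from (4.2); the symmetrization has atoms x - z with mass p*q."""
    total = 0.0
    for x, p in zip(atoms, probs):
        for z, q in zip(atoms, probs):
            d = x - z
            # the mass of the symmetrization at 0 contributes fully,
            # in accordance with Psi_F(0) = Psi_F(+0)
            total += p * q if d == 0 else p * q * l * l / (l * l + d * d)
    return total


def bisect_root(h, lo, hi, iters=200):
    """Root of a monotone function h having a sign change on [lo, hi]."""
    increasing = h(hi) > h(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (h(mid) > 0) == increasing:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def dispersion(atoms, probs, alpha):
    """alpha-dispersion D_F(alpha); it is 0 unless alpha > Sigma_F, cf. (4.7)."""
    if mean_concentration(atoms, probs, 0.0) >= alpha:
        return 0.0
    hi = 1.0
    while mean_concentration(atoms, probs, hi) < alpha:
        hi *= 2.0
    return bisect_root(lambda l: mean_concentration(atoms, probs, l) - alpha, 0.0, hi)


def centre(atoms, probs):
    """Centre c(F): the root c of sum p * arctan(x - c) = 0, cf. (4.9)."""
    lo, hi = min(atoms), max(atoms)
    if lo == hi:
        return float(lo)
    return bisect_root(
        lambda c: sum(p * math.atan(x - c) for x, p in zip(atoms, probs)), lo, hi)


# Hypothetical example; F(a(. + b)) is the distribution function of X/a - b.
atoms, probs, alpha = [0.0, 1.0, 3.0], [0.2, 0.5, 0.3], 0.6
a, b = 2.0, 0.7
rescaled = [x / a - b for x in atoms]
shifted = [x - b for x in atoms]
print(dispersion(atoms, probs, alpha), dispersion(rescaled, probs, alpha))
print(centre(atoms, probs), centre(shifted, probs))
```

The first printed pair should differ by the factor $a$, as in Lemma 4.2, and the second pair by the shift $b$, as in Lemma 4.5.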
(See J. L. Doob [4], p. 408.)

As was noted in Lemma 4.1, the $\Sigma_F$ defined by (4.3) is determined by the class $K$ containing $F$, so that it can be denoted by $\Sigma_K$. Clearly $\Sigma_K = 1$ or $< 1$ according as $K$ is unit or non-unit. In the sequel we shall denote by $\alpha$ a constant such that $0 < \alpha < 1$. Let $F$ be a distribution function with $\Sigma_F$ [...] distribution function, belonging to $K$, with $\alpha$-dispersion 1 exists and is uniquely determined.

Now we can determine scaling constants and centering constants in limit problems of distributions.

THEOREM 4.1 Let $K_n$ $(n = 0, 1, 2, \ldots)$ be classes with $\Sigma_{K_n} < \alpha$. For each $n$ let $F_n$ be the centered distribution function, belonging to $K_n$, with $\alpha$-dispersion 1. Then $\lim K_n = K_0$ if and only if $\lim F_n = F_0$.

PROOF: It is sufficient to prove the 'only if' part. Assume that $\lim K_n = K_0$. Then there exist $G_n$'s such that $G_n \in K_n$ $(n = 0, 1, 2, \ldots)$ and $\lim G_n = G_0$. Put $D_n = D_{G_n}(\alpha)$. Then $D_n > 0$ $(n = 0, 1, 2, \ldots)$ and $\lim D_n = D_0$ by Lemma 4.4. According to Theorem 3.3 we have $\lim G_n(D_n\,\cdot) = G_0(D_0\,\cdot)$. Put $c_n = c(G_n(D_n\,\cdot))$. Then $\lim c_n = c_0$ by Lemma 4.6. Hence $\lim G_n(D_n(\cdot + c_n)) = G_0(D_0(\cdot + c_0))$ by Theorem 3.3. By Lemma 4.7 we have $G_n(D_n(\cdot + c_n)) = F_n$. Therefore $\lim F_n = F_0$.

We shall mean by the $\alpha$-dispersion and the centre of a random variable $X$ the $\alpha$-dispersion and the centre of the distribution function of $X$, respectively, and we shall denote them by $D_X(\alpha)$ and $c(X)$. The dispersion and the centre of a distribution are defined as the corresponding ones of the distribution function of the distribution.

COROLLARY 1 Let $\{X_n;\ n = 1, 2, \ldots\}$ be a sequence of random variables with positive $\alpha$-dispersions. Assume that for some sequences $\{a_n\}$ and $\{b_n\}$ the distribution of $X_n/a_n - b_n$ converges to a distribution $L$ with positive $\alpha$-dispersion. Then the distribution of $X_n/D_n - c_n$ converges to the centered distribution, belonging to the same class as $L$, with $\alpha$-dispersion 1, where $D_n = D_{X_n}(\alpha)$ and $c_n = c(X_n/D_n)$.

COROLLARY 2 Let $\{X_n;\ n = 1, 2, \ldots\}$ be a sequence of random variables
with positive $\alpha$-dispersions. If for some sequence $\{a_n\}$ the distribution function of $X_n/a_n$ converges to a distribution function $F$ with positive $\alpha$-dispersion $D$, then the distribution functions of $X_n/D_n$ converge to $F(D\,\cdot)$.

Let $f$ be the inverse of a distribution function $F$. Then any number $m$ satisfying $F(m - 0) \le \tfrac{1}{2} \le F(m + 0)$, or equivalently $f(\tfrac{1}{2} - 0) \le m \le f(\tfrac{1}{2} + 0)$, is called a median of $F$. Let $\{F_n;\ n = 0, 1, 2, \ldots\}$ be a sequence of distribution functions and let $m_n$ be any median of $F_n$ for each $n$. If $\lim F_n = F_0$ and if the median of $F_0$ is uniquely determined, then we have $\lim m_n = m_0$ by Theorem 2.3. Let us note that if a distribution function belonging to a class $K$ has a uniquely determined median, that is, if its inverse is continuous at the point $\tfrac{1}{2}$, then every distribution function belonging to the class $K$ has this property. As a result of these considerations we have the following

THEOREM 4.2 Let the hypotheses of Corollary 1 to Theorem 4.1 hold. Moreover, assume that the median of $L$ is uniquely determined, and let $m_n$ be any median of $X_n$ for each $n$. Then the distribution of $(X_n - m_n)/D_n$ converges to the distribution, belonging to the same class as $L$, with median 0 and $\alpha$-dispersion 1.

5. Uniqueness theorem, and scaling and centering constants: the multi-dimensional case

Now let us generalize the results of the preceding two sections to the multi-dimensional case. Let $F$, $F(x) = F(x_1, \ldots, x_p)$, be a $p$-dimensional distribution function and let $F_1, F_2, \ldots, F_p$ be its one-dimensional marginal distribution functions defined by
$$F_1(\xi) = \lim_{x_2, \ldots, x_p \to \infty} F(\xi, x_2, \ldots, x_p), \qquad F_2(\xi) = \lim_{x_1, x_3, \ldots, x_p \to \infty} F(x_1, \xi, x_3, \ldots, x_p), \quad \ldots,$$
$$F_p(\xi) = \lim_{x_1, x_2, \ldots, x_{p-1} \to \infty} F(x_1, x_2, \ldots, x_{p-1}, \xi), \qquad -\infty < \xi < \infty.$$
We shall call the convolution of the marginal distribution functions $F^* = F_1 * F_2 * \cdots * F_p$ the trace distribution function (or briefly the trace) of the $p$-dimensional distribution function $F$. Then we have:

(i) A $p$-dimensional distribution function $F$ is non-unit if and only if its trace $F^*$ is non-unit.
(ii) If the trace of a $p$-dimensional distribution function $F$ is $F^*$, the trace of $F(a(\cdot + b))$ is $F^*(a(\cdot + b_1 + \cdots + b_p))$, where $a > 0$ and $b = (b_1, \ldots, b_p)$.
(iii) Let $F_n$ $(n = 0, 1, 2, \ldots)$ be $p$-dimensional distribution functions and let $F_n^*$ be the corresponding traces. Then $\lim F_n = F_0$ implies $\lim F_n^* = F_0^*$.
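Properties (i) and (ii) of the trace can be checked by direct computation for a distribution with finitely many atoms. The following sketch is an illustration under stated assumptions: a discrete law is represented as a dictionary from points to probabilities, and the function names and the example law are hypothetical. Exact rational arithmetic keeps the dictionary keys comparable.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def marginals(joint):
    """One-dimensional marginal laws of a discrete p-dimensional law."""
    p = len(next(iter(joint)))
    margs = [defaultdict(Fraction) for _ in range(p)]
    for point, prob in joint.items():
        for j, coord in enumerate(point):
            margs[j][coord] += prob
    return [dict(m) for m in margs]


def trace(joint):
    """Trace F* = F_1 * ... * F_p: the convolution of the marginal laws."""
    result = {Fraction(0): Fraction(1)}
    for marg in marginals(joint):
        nxt = defaultdict(Fraction)
        for (s, ps), (x, px) in product(result.items(), marg.items()):
            nxt[s + x] += ps * px
        result = dict(nxt)
    return result


def rescale(joint, a, b):
    """Law of X/a - b, whose distribution function is F(a(. + b))."""
    out = defaultdict(Fraction)
    for point, prob in joint.items():
        out[tuple(x / a - bj for x, bj in zip(point, b))] += prob
    return dict(out)


# Hypothetical two-dimensional example.
joint = {(0, 0): Fraction(1, 2), (1, 2): Fraction(1, 4), (1, 0): Fraction(1, 4)}
a, b = Fraction(2), (Fraction(1), Fraction(3))

# (ii): the trace of F(a(. + b)) against F*(a(. + b_1 + ... + b_p)).
lhs = trace(rescale(joint, a, b))
rhs = {s / a - sum(b): pr for s, pr in trace(joint).items()}
print(lhs == rhs)

# (i): a law is unit exactly when its trace has a single atom.
print(len(trace(joint)) > 1, len(trace({(5, 7): Fraction(1)})) == 1)
```

Here the law of $X/a - b$ is used because its distribution function is $F(a(\cdot + b))$, so that (ii) reduces to the statement that the sum of the rescaled components is the rescaled sum shifted by $b_1 + \cdots + b_p$.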
The last (iii) follows from the fact that if $F_n$ converges to $F_0$ any marginal distribution function of $F_n$ converges to the corresponding marginal distribution function of $F_0$ (see, for instance, [20], Lemma 3) and from the continuity of the convolution.

By the dispersion function of a $p$-dimensional distribution function $F$, we mean the dispersion function of the trace $F^*$ of $F$. We denote by $D_F$ the dispersion function of $F$ as in the one-dimensional case, so that $D_F = D_{F^*}$. Let $F$ be a $p$-dimensional distribution function, $F_1, \ldots, F_p$ its one-dimensional marginal distribution functions and $c_1, \ldots, c_p$ the centres of $F_1, \ldots, F_p$, respectively. The vector $c = (c_1, \ldots, c_p)$ will be called the centre of $F$. If the centre of $F$ is 0, $F$ is said to be centered. We denote by $c(F)$ the centre of $F$. We have the following

LEMMA 5.1 For any $p$-dimensional distribution function $F$, its dispersion function $D_F$ is non-negative and continuous. If $F$ is unit, $D_F(\alpha) = 0$, $0 < \alpha < 1$. If $F$ is non-unit, $D_F(\alpha) > 0$ if and only if $\alpha > \Sigma_F$, where $\Sigma_F$ is defined by $\Sigma_F = \Sigma_{F^*}$, $F^*$ being the trace of $F$. Moreover $\Sigma_F$ is invariant when $F$ runs over the same class.

In the remainder of this section, it will always be assumed, unless the contrary is explicitly stated, that $F$ and $G$ with or without a subscript denote $p$-dimensional distribution functions; $a$ with or without a subscript denotes a positive number; $b$ with or without a subscript denotes a $p$-dimensional vector; $U$ denotes the distribution function of the unit distribution which has the whole probability 1 placed in the origin, i.e.,
$$U(x_1, \ldots, x_p) = \begin{cases} 1, & \text{if } x_j \ge 0 \text{ for all } j, \\ 0, & \text{if } x_j < 0 \text{ for some } j. \end{cases}$$
We have the following lemmas, extensions of those in the one-dimensional case.

LEMMA 5.2 $D_{F(a(\cdot + b))}(\alpha) = a^{-1} D_F(\alpha)$.

LEMMA 5.3 $D_{F_1 * F_2}(\alpha) \ge D_{F_j}(\alpha)$, $j = 1, 2$.
ON SOME LIMIT THEOREMS OF PROBABILITY DISTRIBUTIONS
L m ~ A 5.4
/f
limFn=F,
55
liraD~,,(a)=DAa) at evary ~oin$ a in
0
LEMMA 5.5 $c(F(\cdot+b))=c(F)-b$.

LEMMA 5.6 If $\lim F_n=F$, then $\lim c(F_n)=c(F)$.

By making use of these lemmas we can generalize the results of sections 3 and 4 to the multi-dimensional case. Before doing so, we notice the following

THEOREM 5.1 Any sequence of classes converges to the unit class. More explicitly, for any sequence of distribution functions $\{F_n\}$ there exists a sequence $\{a_n\}$ such that $\lim F_n(a_n\,\cdot)=U$.

This is easily proved (see [20], Theorem 4).

THEOREM 5.2 Assume that $\lim F_n=F$. (It makes no difference whether $F$ is non-unit or unit.) Then:
(i) if $\lim a_n=a>0$, $\lim F_n(a_n\,\cdot)=F(a\,\cdot)$;
(ii) if $\lim a_n=+\infty$, $\lim F_n(a_n\,\cdot)=U$;
(iii) if $\lim b_n=b$, $\lim F_n(\cdot+b_n)=F(\cdot+b)$.

PROOF: To prove (i) and (iii), we shall prove that if $\lim a_n=a>0$ and $\lim b_n=b$ then $\lim F_n(a_n\,\cdot+b_n)=F(a\,\cdot+b)$. Assume that $\lim a_n=a>0$ and $\lim b_n=b$. Let $x$ be a fixed vector. Then $\lim(a_nx+b_n)=ax+b$. Hence for any positive number $\varepsilon$ there exists a number $N$ such that for all $n\ge N$

$ax+b-\varepsilon e\le a_nx+b_n\le ax+b+\varepsilon e$,

so that $F_n(ax+b-\varepsilon e)\le F_n(a_nx+b_n)\le F_n(ax+b+\varepsilon e)$ for all $n\ge N$. Therefore, if $ax+b\pm\varepsilon e$ are both continuity points of $F$, it holds that

$F(ax+b-\varepsilon e)\le\liminf F_n(a_nx+b_n)\le\limsup F_n(a_nx+b_n)\le F(ax+b+\varepsilon e)$.

As $\varepsilon$ can be chosen arbitrarily small, we have $\lim F_n(a_nx+b_n)=F(ax+b)$ if $x$ is a continuity point of $F(a\,\cdot+b)$.

To prove (ii), assume that $\lim a_n=+\infty$. Fix an $x=(x_1,\dots,x_p)$.

Case (1): $x_j>0$ for all $j=1,2,\dots,p$. For any positive number $\alpha$ there exists a number $N$ such that $a_nx>\alpha e$ for all $n\ge N$. Then $F_n(a_nx)\ge F_n(\alpha e)$ for $n\ge N$. If $\alpha e$ is a continuity point of $F$, it holds that $\liminf_{n\to\infty}F_n(a_nx)\ge F(\alpha e)$. Letting $\alpha\to\infty$ we have $\liminf_{n\to\infty}F_n(a_nx)\ge 1$, from which we have $\lim F_n(a_nx)=1$.

Case (2): $x_j<0$ for some $j$. For any positive number $\alpha$, there exists an $N$ such that $a_nx_j<-\alpha$ for all $n\ge N$. Denote by $F_{n(j)}$ and $F_{(j)}$ the marginal distribution functions of $F_n$ and $F$, with respect to the $j$th component, respectively. Then we have $F_n(a_nx)\le F_{n(j)}(-\alpha)$ for $n\ge N$. If $-\alpha$ is a continuity point of $F_{(j)}$ it holds that $\limsup F_n(a_nx)\le F_{(j)}(-\alpha)$. Letting $\alpha\to\infty$, we have $\limsup F_n(a_nx)\le 0$, which implies $\lim F_n(a_nx)=0$.

THEOREM 5.3 Assume that
(5.1) $\lim F_n=F$,
(5.2) $\lim F_n(a_n(\cdot+b_n))=G$
and that both $F$ and $G$ are non-unit. Then the limits $\lim a_n=a>0$, $\lim b_n=b$ exist and $G=F(a(\cdot+b))$.

PROOF: Since $F$ and $G$ are non-unit, $\Sigma_F<1$ and $\Sigma_G<1$. Take an $\alpha$ such that $\max(\Sigma_F,\Sigma_G)<\alpha<1$, and put $D_n=D_{F_n}(\alpha)$, $D=D_F(\alpha)$, $D'=D_G(\alpha)$. Then $D_{F_n(a_n(\cdot+b_n))}(\alpha)=D_n/a_n$. By Lemma 5.4, (5.1) and (5.2) imply $\lim D_n=D$, $\lim D_n/a_n=D'$, from which it follows that
(5.3) $\lim a_n=D/D'=a$ (say) $>0$.
From (5.1) and (5.3) it follows that
(5.4) $\lim F_n(a_n\,\cdot)=F(a\,\cdot)$,
by Theorem 5.2. Put $c(F_n(a_n\,\cdot))=c_n$, $c(F(a\,\cdot))=c$, and $c(G)=d$. Then (5.4) and (5.2) imply $\lim c_n=c$, $\lim(c_n-b_n)=d$, by Lemma 5.5 and Lemma 5.6. From the last two equations we have
(5.5) $\lim b_n=c-d=b$ (say).
From (5.4) and (5.5) it holds that
(5.6) $\lim F_n(a_n(\cdot+b_n))=F(a(\cdot+b))$
by Theorem 5.2. Comparing (5.2) and (5.6) it is seen that
$G=F(a(\cdot+b))$.

THEOREM 5.4 Assume that $\lim F_n=F$ and that $F$ is non-unit. Then $\lim F_n(a_n(\cdot+b_n))=F(a(\cdot+b))$ if and only if $\lim a_n=a$ and $\lim b_n=b$.

PROOF: The proof runs in the same way as in Theorem 3.4. It is sufficient to prove that if $F(a(\cdot+b))=F(a'(\cdot+b'))$ for a non-unit $F$ then $a=a'$ and $b=b'$. Assume that $F(a(\cdot+b))=F(a'(\cdot+b'))$ and that $F$ is non-unit. Take an $\alpha$ such that $D_F(\alpha)>0$, and put $D=D_F(\alpha)$. Then from $F(a(\cdot+b))=F(a'(\cdot+b'))$ it follows that $D/a=D/a'$ which implies $a=a'$. Put $c=c(F(a\,\cdot))$. Then $F(a(\cdot+b))=F(a(\cdot+b'))$ implies $c-b=c-b'$, hence $b=b'$.

THEOREM 5.5 Assume that $\lim F_n=F$ and that $F$ is non-unit. Then $\lim F_n(a_n(\cdot+b_n))=U(\cdot+b)$ if and only if $\lim a_n=+\infty$ and $\lim b_n=b$.

PROOF: The 'if' part follows from Theorem 5.2. To prove the 'only if' part, assume that $\lim F_n(a_n(\cdot+b_n))=U(\cdot+b)$. Take an $\alpha$ such that $D_F(\alpha)>0$ and put $D=D_F(\alpha)$, $D_n=D_{F_n}(\alpha)$. Then we have $\lim D_n=D$ and $\lim D_n/a_n=0$, hence $\lim a_n=\infty$ and $\lim F_n(a_n\,\cdot)=U$. Put $c_n=c(F_n(a_n\,\cdot))$, then we have $\lim c_n=0$ and $\lim(c_n-b_n)=-b$, hence $\lim b_n=b$.

THEOREM 5.6 Let $K_n$ ($n=0,1,2,\dots$) be classes with $\Sigma_{K_n}$ …
Infinitely divisible distributions
6. Preliminaries
A measure, i.e., a non-negative completely additive set function, which is defined on all $p$-dimensional Borel sets will be called a $p$-dimensional measure. Let $\{\mu_n\}$ be a sequence of $p$-dimensional measures with $\mu_n(R_p)<\infty$ and let $\mu$ be another one with $\mu(R_p)<\infty$. If
(6.1) $\lim \mu_n(E)=\mu(E)$
for every set $E$ of continuity of $\mu$, $\{\mu_n\}$ will be said to converge to $\mu$ and it is written as $\lim\mu_n=\mu$. Note that $\lim\mu_n=\mu$ implies $\lim\mu_n(R_p)=\mu(R_p)$, for $R_p$ is a set of continuity of any bounded measure.

If $\mu$ is a $p$-dimensional measure with $\mu(R_p)<\infty$, its Fourier-Stieltjes transform is defined by
(6.2) $\varphi(t)=\int_{R_p}e^{it'x}\,d\mu$.
The Fourier-Stieltjes transform of a $p$-dimensional distribution will be called a characteristic function of the distribution.

The following fundamental properties of Fourier-Stieltjes transforms of $p$-dimensional measures will be used. Let $\mu$ with or without a subscript denote a $p$-dimensional measure with $\mu(R_p)<\infty$ and let $\varphi$ be its Fourier-Stieltjes transform. If $\mu$ possesses a subscript, $\varphi$ will have the same subscript.
(i) $\varphi$ is continuous for all $t$ and $|\varphi(t)|\le\mu(R_p)$.
(ii) $\mu$ is uniquely determined by $\varphi$.
(iii) If $\lim\mu_n=\mu$ and if $f$ is a bounded continuous function defined on $R_p$, we have $\lim\int_{R_p}f\,d\mu_n=\int_{R_p}f\,d\mu$.
(iv) $\lim\mu_n=\mu$ if and only if $\lim\varphi_n=\varphi$. And in this case $\lim\varphi_n(t)=\varphi(t)$ uniformly in every bounded $t$ set.
(v) If $\lim\varphi_n(t)=k(t)$ exists for all $t$ and $k(t)$ is continuous at the origin then $\mu_n\to$ some $\mu$, and $k=\varphi$. (For the one-dimensional case see, for instance, H. Cramér [1], p. 121, additional note, or M. Loève [17], section I, Lemma A.)
(iv) and (v) are called P. Lévy's continuity theorem. We shall give a proof of (v) at the end of this section.
(vi) Let $\varphi_1$ and $\varphi_2$ be the characteristic functions of distribution functions $F_1$ and $F_2$. Then the characteristic function of the convolution $F_1*F_2$ is given by $\varphi_1\varphi_2$.

Throughout Parts II and III, unless the contrary is explicitly stated,
the following notations will be used: $F$, with or without subscripts, denotes a $p$-dimensional distribution function; $U$ denotes the distribution function of the unit distribution which places the whole probability 1 in the origin; for a point $x=(x_1,\dots,x_p)$ in $R_p$, $|x|$ denotes its Euclidean norm, i.e.,
(6.3) $|x|=(x_1^2+x_2^2+\dots+x_p^2)^{1/2}$,
and $\|x\|$ denotes the greatest of the absolute values of its components, i.e.,
(6.4) $\|x\|=\max(|x_1|,|x_2|,\dots,|x_p|)$.

Let $\{F_{nl};\ l=1,2,\dots,l_n,\ n=1,2,\dots\}$ be a sequence of distribution functions. If
(6.5) $\lim_{n\to\infty}\max_{1\le l\le l_n}d(F_{nl},U)=0$,
where $d(F,U)$ is defined by (1.8), then $F_{nl}$, $l=1,2,\dots,l_n$, will be said to converge to $U$ uniformly in $l$ ($1\le l\le l_n$) as $n\to\infty$. It is easily seen that (6.5) holds if and only if for each $\varepsilon>0$
(6.6)
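$\lim_{n\to\infty}\max_{1\le l\le l_n}\int_{\|x\|\ge\varepsilon}dF_{nl}(x)=0$,
and this is equivalent to the condition that for each $\varepsilon>0$
(6.7) $\lim_{n\to\infty}\max_{1\le l\le l_n}\int_{|x|\ge\varepsilon}dF_{nl}(x)=0$.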
Lastly we shall give a proof of (v), following P. Lévy [16], pp. 49-50, in the one-dimensional case, for the sake of completeness.

Let $H(x)=H(x_1,\dots,x_p)$ be a real-valued function defined on $R_p$. If $H(x_1,\dots,x_p)$ is monotone non-decreasing and continuous to the right in each variable, and if $[H]_x^y\ge 0$ (see (1.2)) for every $x=(x_1,\dots,x_p)$ and every $y=(y_1,\dots,y_p)$ such that $x_j\le y_j$, $j=1,2,\dots,p$, then $H$ will be called positively monotonic. We need the following

LEMMA 6.1 Any sequence of distribution functions $\{F_n\}$ has a subsequence $\{F_{n(j)}\}$ which converges to a positively monotonic function $H(x)$ at every continuity point of the latter.

PROOF: By the well-known diagonal method we can choose a subsequence $\{F_n'\}$ such that for each rational point $r\in R_p$, $\lim F_n'(r)=H_0(r)$ exists. $H_0(r)$ is defined only for rational points $r$ and it is obvious that $0\le H_0(r)\le 1$ and $H_0(r)=H_0(r_1,r_2,\dots,r_p)$ is monotone non-decreasing in each variable. Using this $H_0$ define a function $H$ by

$H(x)=\inf_{r>x}H_0(r)$,

where $(r_1,\dots,r_p)>(x_1,\dots,x_p)$ denotes that $r\in R_p$, $r_j>x_j$ for all $j$. Then $H(x)=H(x_1,\dots,x_p)$ is monotone non-decreasing in each variable and $0\le$ … $>x$ and $H_0(s)$ …

(6.8) $H(x)=\lim_{r\downarrow x}H_0(r)$

and $H(x)$ is continuous to the right. Let $x=(x_1,\dots,x_p)$ and $y=(y_1,\dots,y_p)$ be two points of $R_p$ such that $x_j\le y_j$ for all $j$. Choose $\varepsilon=(\varepsilon_1,\dots,\varepsilon_p)$ and $\delta=(\delta_1,\dots,\delta_p)$ such that $x+\varepsilon$ and $y+\delta$ are rational points and $0<\varepsilon<\delta$. Then $[F_n']_{x+\varepsilon}^{y+\delta}\ge 0$. Letting $n\to\infty$, we have $[H_0]_{x+\varepsilon}^{y+\delta}\ge 0$. Further, letting $\varepsilon\downarrow 0$, $\delta\downarrow 0$, we have $[H]_x^y\ge 0$ by (6.8). Thus $H$ is positively monotonic.

It remains to ascertain that $F_n'(x)$ converges to $H(x)$ at every continuity point $x$ of the latter. To prove this, fix a point $x$. Choose $\varepsilon=(\varepsilon_1,\dots,\varepsilon_p)$ and $\delta=(\delta_1,\dots,\delta_p)$ such that $\varepsilon_j>0$ and $\delta_j>0$ for all $j$ and that both $x-\varepsilon$ and $x+\delta$ are rational points. Since $x-\varepsilon<x<x+\delta$,

$F_n'(x-\varepsilon)\le F_n'(x)\le F_n'(x+\delta)$.

Letting $n\to\infty$, we have

$H_0(x-\varepsilon)\le\liminf F_n'(x)\le\limsup F_n'(x)\le H_0(x+\delta)$.

Letting $\varepsilon\downarrow 0$ and letting $\delta\downarrow 0$, we have

$H(x-0)\le\liminf F_n'(x)\le\limsup F_n'(x)\le H(x)$,

which implies that if $x$ is a continuity point of $H$, then $\lim F_n'(x)=H(x)$.

PROOF OF (v): If $k(0)=0$, then $k\equiv 0$, $\mu_n\to 0$, hence (v) holds. If $k(0)\ne 0$, then we can assume that all $\mu_n$'s are probability measures without loss of generality. Since $k(t)$ is continuous at the origin and $k(0)=\lim\varphi_n(0)=1$, for each positive number $\varepsilon$ there exists a $\delta>0$ such that
Since $\lim_{n\to\infty}\int_{\|t\|\le\delta}\varphi_n(t)\,dt=\int_{\|t\|\le\delta}k(t)\,dt$, there exists an $N=N(\varepsilon)$ such that
~.(t)dt > l - g ,
(6.9)
for all n>N.
Now
fI l t l l ~ *
I1~ 11-.~9
Rap
f
u~
.'"-,z, 1
tl t t[_.~6
~ f d~,nf ,,"*a,+f d~,.f e,"-at ilxll
lit l i : ~
.<_
II=[1 :> /
_. _(gs), II~ll
II f l l . ~ 6
r Ilzll>l
from which it follows t h a t
1(2~ f
(6.10)
<_f d..+,
9~(t) dt II ~ Ilage
il z l l < t
~
"
From (6.9) and (6.10) it follows that for each $\varepsilon>0$
(6.11) $\int_{\|x\|<l}d\mu_n>1-\varepsilon$
for all $n\ge N=N(\varepsilon)$ and for all $l\ge L=L(\varepsilon)=8/(\varepsilon\delta)$.

Now let $F_1,F_2,\dots$ be the distribution functions defined by the probability measures $\mu_1,\mu_2,\dots$. By Lemma 6.1 there exists a subsequence $\{F_{n(j)}\}$ which converges to a positively monotonic function $H(x)$ at every continuity point of the latter. We can prove, from (6.11), that $H$ is a distribution function. Since $H$ is positively monotonic and $0\le H(x)\le 1$ it remains to prove that
(6.12) $\liminf_{x_1,\dots,x_p\to\infty}H(x_1,\dots,x_p)\ge 1$,
(6.13) $\limsup_{x_j\to-\infty}H(x_1,\dots,x_p)\le 0$, $j=1,2,\dots,p$.
Take a point $(y_1,\dots,y_p)$ such that $\min y_j>L$. Then there exists an $l$ such that $\min y_j>l>L$ and that the point $(l,\dots,l)$ is a continuity point of $H$. Since

$F_n(l,\dots,l)\ge\int_{\|x\|<l}d\mu_n\ge 1-\varepsilon$, $n\ge N$,

letting $n\to\infty$ through the sequence $\{n(j)\}$, we have $H(l,\dots,l)\ge 1-\varepsilon$, hence

$H(y_1,\dots,y_p)\ge 1-\varepsilon$, if $\min y_j>L$,

which implies (6.12). Next, fix $y_1,\dots,y_{j-1},y_{j+1},\dots,y_p$ arbitrarily and let $y_j<-L$. Then there exists a continuity point $(z_1,\dots,z_p)$ of $H$ such that $y_k\le z_k$, $k=1,2,\dots,p$, and $z_j<-L$. Then

$F_n(z_1,\dots,z_p)\le\int_{\|x\|\ge L}d\mu_n\le\varepsilon$, $n\ge N$,

and letting $n\to\infty$ through the sequence $\{n(j)\}$ we have $H(z_1,\dots,z_p)\le\varepsilon$, hence $H(y_1,\dots,y_p)\le\varepsilon$, if $y_j<-L$, which implies (6.13). Thus $H$ must be a distribution function.

Write $H=F$, and let $\varphi$ be the characteristic function of $F$. Then from $\lim F_{n(j)}=F$ it follows that $\lim\varphi_{n(j)}=\varphi$ by (iv). Hence $k=\varphi$ and, since $F$ is uniquely determined by $\varphi$, any convergent subsequence of $\{F_n\}$ must converge to $F$, hence, $\{F_n\}$ itself must converge to $F$.
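Properties (i) and (vi) of Fourier-Stieltjes transforms listed in this section can be checked numerically for measures carried by finitely many points. The following is a minimal Python sketch and is not part of the original paper: the helper names `fourier_stieltjes` and `convolve`, the dictionary representation of a measure, and the two example measures are all illustrative assumptions.

```python
import cmath
from itertools import product


def fourier_stieltjes(measure, t):
    """Fourier-Stieltjes transform (6.2) of a finitely supported measure.

    `measure` maps points of R_p (tuples) to non-negative masses (an assumed
    representation); `t` is a tuple of the same dimension.
    """
    return sum(
        mass * cmath.exp(1j * sum(tj * xj for tj, xj in zip(t, x)))
        for x, mass in measure.items()
    )


def convolve(mu1, mu2):
    """Convolution of two finitely supported measures on R_p."""
    out = {}
    for (x, m1), (y, m2) in product(mu1.items(), mu2.items()):
        z = tuple(a + b for a, b in zip(x, y))
        out[z] = out.get(z, 0.0) + m1 * m2
    return out


# Two hypothetical probability measures on R_2.
mu1 = {(0.0, 0.0): 0.5, (1.0, 2.0): 0.5}
mu2 = {(0.0, 1.0): 0.25, (-1.0, 0.5): 0.75}
t = (0.3, -0.7)

phi1 = fourier_stieltjes(mu1, t)
phi2 = fourier_stieltjes(mu2, t)
phi12 = fourier_stieltjes(convolve(mu1, mu2), t)
```

Under these assumptions `phi12` agrees with `phi1 * phi2` up to rounding, as stated in (vi), and `abs(phi1)` does not exceed the total mass 1, as stated in (i).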
7. Continuous amplitudes of non-vanishing characteristic functions

Let $z$ be a complex number different from 0. Any real number $\theta$ such that $z=|z|e^{i\theta}$ is called an amplitude of $z$ and is denoted by $\theta=\operatorname{amp}z$. $\operatorname{amp}z$ is determined up to a multiple of $2\pi$. If $\theta=\operatorname{amp}z$ and $-\pi<\theta\le\pi$ then $\theta$ is called the principal amplitude of $z$ and is denoted by $\operatorname{Amp}z$. Let $\varphi$ be a characteristic function. A function $\theta$ will be called a continuous amplitude of $\varphi$, if
(7.1) $\theta(t)=\operatorname{amp}\varphi(t)$ for all $t\in R_p$, $\theta$ is continuous in $R_p$ and $\theta(0)=0$.

LEMMA 7.1 Any non-vanishing characteristic function $\varphi$ has a continuous amplitude, which is uniquely determined by $\varphi$.

PROOF: (i) First, we wish to prove that, for each $T>0$, there exists a function $\theta$, $\{\theta(t);\ |t|\le T\}$, satisfying (7.1) in the domain $\{t;\ |t|\le T\}$. Since any continuous function defined on a compact space attains its minimum value there, we have $\eta=\min_{|t|\le T}|\varphi(t)|>0$. As $\varphi$ is uniformly continuous in $\{t;\ |t|\le T\}$, to this $\eta$ corresponds a $\delta>0$ such that

$|\varphi(t)-\varphi(t')|<\eta$, for $|t-t'|\le\delta$, $|t|\le T$, $|t'|\le T$.

Hence
(7.2) $|\varphi(t)/\varphi(t')-1|<1$ for $|t-t'|\le\delta$, $|t|\le T$, $|t'|\le T$.

Fix a $t'$ such that $|t'|\le T$. Then the function $\varphi(t)/\varphi(t')$, of $t$, is continuous and takes values in the right half plane in the domain $|t-t'|\le\delta$, $|t|\le T$, hence, the function of $t$
(7.3) $\operatorname{Amp}(\varphi(t)/\varphi(t'))$ ($t'$ fixed)
is continuous in $|t-t'|\le\delta$, $|t|\le T$, and vanishes at $t=t'$. Furthermore,
(7.4) $|\operatorname{Amp}(\varphi(t)/\varphi(t'))|<\pi/2$, $|t-t'|\le\delta$, $|t|\le T$.

Now let $t_0$ be any fixed unit vector, i.e., $|t_0|=1$. We shall define $\theta(t_0\tau)$ in the interval $0\le\tau\le T$, as follows:

$\theta(t_0\tau)=\sum_{k=1}^{n}\operatorname{Amp}\dfrac{\varphi(k\delta t_0)}{\varphi((k-1)\delta t_0)}+\operatorname{Amp}\dfrac{\varphi(\tau t_0)}{\varphi(n\delta t_0)}$ for $n\delta\le\tau<(n+1)\delta$, $n=0,1,2,\dots$.

If $n=0$, the first term in the right side vanishes. Thus, $\theta(t)$ is defined for all $t$ with $|t|\le T$ and it is seen that
(7.5) $\theta(t)=\operatorname{amp}\varphi(t)$, for all $t$ with $|t|\le T$,
(7.6) $\theta(\tau_2t)-\theta(\tau_1t)=\operatorname{Amp}(\varphi(\tau_2t)/\varphi(\tau_1t))$, if $|\tau_1t-\tau_2t|\le\delta$,
where $t$ is a vector and $\tau_1$, $\tau_2$ are real numbers. (7.5) is obvious from the definition of $\theta(t)$ and $\varphi(0)=1$. To show (7.6), we can assume that $t$ is a unit vector without loss of generality. Moreover, we can assume that $|\tau_2|\ge|\tau_1|$ and $\tau_2\ge 0$ as (7.4) is true and $t$ can be replaced by $-t$ if necessary. Then the following three cases can occur:
Case (a) $0\le n\delta\le\tau_1\le\tau_2<(n+1)\delta$, for some $n$;
Case (b) $0\le n\delta\le\tau_1<(n+1)\delta\le\tau_2<(n+2)\delta$, for some $n$;
Case (c) $-\delta<\tau_1\le 0\le\tau_2<\delta$.
For instance, in case (b), we have

$\theta(\tau_2t)-\theta(\tau_1t)=\operatorname{Amp}\dfrac{\varphi((n+1)\delta t)}{\varphi(n\delta t)}+\operatorname{Amp}\dfrac{\varphi(\tau_2t)}{\varphi((n+1)\delta t)}-\operatorname{Amp}\dfrac{\varphi(\tau_1t)}{\varphi(n\delta t)}\equiv\operatorname{Amp}\dfrac{\varphi(\tau_2t)}{\varphi(\tau_1t)}\pmod{2\pi}$.

But by (7.4)

|the second side $-$ the third side| $<4\cdot\dfrac{\pi}{2}=2\pi$.

Therefore '$\equiv$' becomes '$=$', and (7.6) holds. In the other cases it will also be proved similarly.

Now we can show that for any $t_1$ and $t_2$ such that $|t_1|\le T$ and $|t_2|\le T$,
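(7.7) $\theta(t_1)-\theta(t_2)=\operatorname{Amp}(\varphi(t_1)/\varphi(t_2))$, if $|t_1-t_2|\le\delta$.

Fix $t_1$ and $t_2$ and take a positive integer $n$ such that $|t_1|\le n\delta$, $|t_2|\le n\delta$. It is shown by induction that
(7.8) $\theta\!\left(\dfrac{kt_1}{n}\right)-\theta\!\left(\dfrac{kt_2}{n}\right)=\operatorname{Amp}\dfrac{\varphi(kt_1/n)}{\varphi(kt_2/n)}$ for $k=0,1,2,\dots,n$,
which becomes (7.7) if $k=n$. In case $k=0$ it is clear. Assume that (7.8) holds for some $k<n$. Then we have

$\theta\!\left(\dfrac{(k+1)t_1}{n}\right)-\theta\!\left(\dfrac{(k+1)t_2}{n}\right)=\left[\theta\!\left(\dfrac{(k+1)t_1}{n}\right)-\theta\!\left(\dfrac{kt_1}{n}\right)\right]+\left[\theta\!\left(\dfrac{kt_1}{n}\right)-\theta\!\left(\dfrac{kt_2}{n}\right)\right]+\left[\theta\!\left(\dfrac{kt_2}{n}\right)-\theta\!\left(\dfrac{(k+1)t_2}{n}\right)\right]$

$=\operatorname{Amp}\dfrac{\varphi((k+1)t_1/n)}{\varphi(kt_1/n)}+\operatorname{Amp}\dfrac{\varphi(kt_1/n)}{\varphi(kt_2/n)}+\operatorname{Amp}\dfrac{\varphi(kt_2/n)}{\varphi((k+1)t_2/n)}\equiv\operatorname{Amp}\dfrac{\varphi((k+1)t_1/n)}{\varphi((k+1)t_2/n)}\pmod{2\pi}$.

By making use of (7.4), '$\equiv$' becomes '$=$' and we have (7.8) with $k$ replaced by $k+1$. Hence (7.7) holds. From (7.3) and (7.7) it is seen that $\theta$ is continuous in $\{t;\ |t|\le T\}$, which together with (7.5) completes the proof of this step.

(ii) We shall show that if both $\theta_1$ and $\theta_2$ are continuous amplitudes in $\{t;\ |t|\le T\}$, then $\theta_1=\theta_2$. Since $\theta_1(t)\equiv\theta_2(t)\pmod{2\pi}$ for all $t$, $\theta_1-\theta_2$ can take only the values $0,\pm 2\pi,\pm 4\pi,\dots$. On the other hand, $\theta_1-\theta_2$ is continuous. Therefore, $\theta_1-\theta_2$ must be a constant. Hence for all $t$, $\theta_1(t)-\theta_2(t)=\theta_1(0)-\theta_2(0)=0$, i.e., $\theta_1=\theta_2$.

(iii) Denote by $\theta_T$ the continuous amplitude in $\{t;\ |t|\le T\}$ defined in (i). Then from (ii) we have $\theta_T=\theta_{T'}$, for $|t|\le$ …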
(7.9) $\log\varphi=\log|\varphi|+i\theta$.
By definition $\psi=\log\varphi$ is characterized by the following conditions: $\psi(t)$ is a continuous function of $t$, vanishes at the origin $t=0$, and $\varphi(t)=\exp\{\psi(t)\}$. The continuous logarithm of a non-vanishing characteristic function $\varphi$ is uniquely determined by $\varphi$. Furthermore, for each positive integer $n$, the $n$th root of a non-vanishing characteristic function $\varphi$ is defined by
(7.10) $\varphi^{1/n}=\exp\{(\log\varphi)/n\}$,
where $\log\varphi$ is the continuous logarithm of $\varphi$. If a non-vanishing characteristic function $\varphi_1$ is equal to the $n$th power of another characteristic function $\varphi_2$, i.e., if
(7.11) $\varphi_1=\varphi_2^n$
then $\varphi_2=\varphi_1^{1/n}$, for (7.11) is rewritten as $e^{\log\varphi_1}=e^{n\log\varphi_2}$ which implies $\log\varphi_1=n\log\varphi_2$ by the uniqueness of the continuous logarithm, hence, $\varphi_2=e^{\log\varphi_2}=e^{(\log\varphi_1)/n}=\varphi_1^{1/n}$. In the sequel the continuous logarithm of a non-vanishing characteristic function $\varphi$ is called simply the logarithm of $\varphi$. We shall need in the sequel the following

LEMMA 7.2 If a sequence of non-vanishing characteristic functions $\{\varphi_n(t)\}$ converges to a non-vanishing characteristic function $\varphi_0(t)$ at every point $t\in R_p$, then $\{\log\varphi_n(t)\}$ converges to $\log\varphi_0(t)$ uniformly in every bounded $t$ set.

PROOF: Let $T$ be a fixed positive number. By the well-known theorem we have
(7.12) $\lim\varphi_n(t)=\varphi_0(t)$ uniformly in $|t|\le T$,
from which it follows that
(7.13) $\lim|\varphi_n(t)|=|\varphi_0(t)|$ uniformly in $|t|\le T$,
hence, $\lim_{n\to\infty}\min_{|t|\le T}|\varphi_n(t)|=\min_{|t|\le T}|\varphi_0(t)|$. Since $\min_{|t|\le T}|\varphi_n(t)|>0$ for all $n$, there exists an $\eta>0$ such that
(7.14) $\min_{|t|\le T}|\varphi_n(t)|>\eta$ for all $n=0,1,2,\dots$.
From (7.13) and (7.14) it follows that
(7.15) $\lim\log|\varphi_n(t)|=\log|\varphi_0(t)|$ uniformly in $|t|\le T$.
Now, since $\{\varphi_n(t)\}$ is equicontinuous in $\{t;\ |t|\le T\}$, there exists a $\delta>0$ such that
(7.16) $|\varphi_n(t)-\varphi_n(t')|<\eta$ for $|t-t'|\le\delta$, $n=0,1,2,\dots$.
From (7.14) and (7.16) it follows that
(7.17) $|\varphi_n(t)/\varphi_n(t')-1|<1$ for $|t-t'|\le\delta$, $n=0,1,2,\dots$.
Therefore for any fixed $t'$ the function $\operatorname{Amp}(\varphi_n(t)/\varphi_n(t'))$, of $t$, is continuous in $|t-t'|\le\delta$ and vanishes at $t=t'$ for $n=0,1,2,\dots$. It is just so with the function $\theta_n(t)-\theta_n(t')$ of $t$, where $\theta_n$ is the continuous amplitude of $\varphi_n$ for each $n$. But

$\theta_n(t)-\theta_n(t')=\operatorname{amp}\varphi_n(t)-\operatorname{amp}\varphi_n(t')\equiv\operatorname{Amp}(\varphi_n(t)/\varphi_n(t'))\pmod{2\pi}$.

Therefore it must hold that
(7.18) $\theta_n(t)-\theta_n(t')=\operatorname{Amp}(\varphi_n(t)/\varphi_n(t'))$ for $|t-t'|\le\delta$, $n=0,1,2,\dots$.
From (7.12) and (7.14) it follows that

$\lim\{\varphi_n(t)/\varphi_n(t')\}=\varphi_0(t)/\varphi_0(t')$ uniformly in $|t|\le T$, $|t'|\le T$.

And if $|t-t'|\le\delta$, the value $\varphi_n(t)/\varphi_n(t')$ lies by (7.14) and (7.17) in the domain $\{z;\ |z-1|<1,\ |z|\ge\eta\}$. From these two facts it follows that for any fixed $t'$
(7.19) $\lim\operatorname{Amp}\{\varphi_n(t)/\varphi_n(t')\}=\operatorname{Amp}\{\varphi_0(t)/\varphi_0(t')\}$ uniformly in $|t-t'|\le\delta$.
From (7.18) and (7.19) we have, for any fixed $t'$,
(7.20) $\lim(\theta_n(t)-\theta_n(t'))=\theta_0(t)-\theta_0(t')$ uniformly in $|t-t'|\le\delta$.

Let the open sphere $S(t,\delta/2)$ with centre $t$ and radius $\delta/2$ correspond to each point $t$ in $|t|\le T$. According to the Heine-Borel theorem, the compact set $|t|\le T$ is covered by the sum of a finite number, say $m$, of such spheres. Denote the finite set of centres of those spheres by $M$. Let us assume that $M$ contains the origin. Then from (7.20) for each $\varepsilon>0$ there exists an $N=N(\varepsilon)$ such that
(7.21) $|(\theta_n(t)-\theta_n(t'))-(\theta_0(t)-\theta_0(t'))|<\varepsilon/m$ for $n\ge N$, $|t-t'|\le\delta$, $t'\in M$.
From this fact we can prove that
(7.22) $|\theta_n(t)-\theta_0(t)|<\varepsilon$ for $n\ge N=N(\varepsilon)$, $|t|\le T$.

Fix a $t$, $|t|\le T$. The segment joining the origin 0 and the point $t$ is covered by the sum of open spheres with radius $\delta/2$ and with centres $0=t_0,t_1,t_2,\dots,t_k$, such that

$t_j\in M$, $j=0,1,2,\dots,k$; $S(t_{j-1},\delta/2)\cap S(t_j,\delta/2)\ne\emptyset$, $j=1,2,\dots,k$; $t\in S(t_k,\delta/2)$.

Then, since

$|t_j-t_{j-1}|<\delta$, $j=1,2,\dots,k$, $|t-t_k|<\delta$,

we have, from (7.21), for $n\ge N$
(7.23) $\Delta_j=|(\theta_n(t_j)-\theta_n(t_{j-1}))-(\theta_0(t_j)-\theta_0(t_{j-1}))|<\varepsilon/m$, $j=1,2,\dots,k$,
$\Delta=|(\theta_n(t)-\theta_n(t_k))-(\theta_0(t)-\theta_0(t_k))|<\varepsilon/m$.
As $\theta_n(t_0)=\theta_0(t_0)=0$, from (7.23) we have

$|\theta_n(t)-\theta_0(t)|\le\sum_{j=1}^{k}\Delta_j+\Delta<\dfrac{k+1}{m}\,\varepsilon\le\varepsilon$, for $n\ge N$.

Since $N$ does not depend on $t$, (7.22) holds, hence
(7.24) $\lim\theta_n(t)=\theta_0(t)$ uniformly in $|t|\le T$.
(7.15) and (7.24) complete the proof, as $\log\varphi_n(t)=\log|\varphi_n(t)|+i\theta_n(t)$ for $n=0,1,2,\dots$.
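The construction of the continuous amplitude in step (i) of the proof of Lemma 7.1 (summing principal amplitudes of successive ratios along a ray) can be mimicked numerically. The following Python sketch is an illustration only and is not taken from the paper: it treats the one-dimensional case, the function names `continuous_amplitude` and `normal_cf` are hypothetical, and the grid is assumed fine enough that every ratio of consecutive values lies in the right half plane, as in (7.2).

```python
import cmath
import math


def continuous_amplitude(phi, tau_max, n_steps):
    """Continuous amplitude of phi on a grid of [0, tau_max], with theta(0) = 0.

    Sums the principal amplitudes Amp(phi(tau_k) / phi(tau_{k-1})) of
    successive ratios; assumes phi never vanishes and the grid is fine enough.
    """
    taus = [tau_max * k / n_steps for k in range(n_steps + 1)]
    thetas = [0.0]
    for prev, cur in zip(taus, taus[1:]):
        # cmath.phase returns the principal amplitude in [-pi, pi].
        thetas.append(thetas[-1] + cmath.phase(phi(cur) / phi(prev)))
    return taus, thetas


def normal_cf(t, mean=5.0):
    """Characteristic function of a normal law with the given mean, variance 1."""
    return cmath.exp(1j * mean * t - 0.5 * t * t)


taus, thetas = continuous_amplitude(normal_cf, tau_max=3.0, n_steps=30)
theta_end = thetas[-1]                       # continuous amplitude at t = 3
principal_end = cmath.phase(normal_cf(3.0))  # principal amplitude at t = 3
# Continuous logarithm in the sense of (7.9): log|phi| + i * theta.
log_cf_end = math.log(abs(normal_cf(3.0))) + 1j * theta_end
```

For this example the continuous amplitude at $t=3$ is $5\cdot 3=15$, whereas the principal amplitude is $15-4\pi$; the two differ by a multiple of $2\pi$, which is why the principal amplitude alone cannot serve as the imaginary part of the continuous logarithm.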
8. Infinitely divisible distributions
A distribution is called infinitely divisible if, for each positive integer $n$, its characteristic function $\varphi$ is the $n$th power of a characteristic function $\varphi_n$, $\varphi=\varphi_n^n$.

LEMMA 8.1 The characteristic function of an infinitely divisible distribution cannot take the value 0.

PROOF: Let $\varphi$ be the characteristic function. Then to each $n$ corresponds a characteristic function $\varphi_n$ such that $\varphi=\varphi_n^n$, from which it follows that $|\varphi_n|^2=|\varphi|^{2/n}$. Hence

$\lim|\varphi_n(t)|^2=1$ if $\varphi(t)\ne 0$, and $=0$ if $\varphi(t)=0$.

Put $\psi=\lim|\varphi_n|^2$. As $|\varphi_n|^2$ is a characteristic function and $\psi(t)=1$ in a neighborhood of the origin, by the continuity theorem $\psi$ is a characteristic function. Thus $\psi$ is continuous, hence $\psi(t)=1$, $\varphi(t)\ne 0$ for every $t$.

In this case $\varphi_n=\varphi^{1/n}=e^{(\log\varphi)/n}$ from which it follows that $\lim\varphi_n(t)=1$ at every $t$. Hence, the distribution function with the characteristic function $\varphi_n$ tends to the unit distribution function $U$.

In the remainder of this part the following notations will be used: $a$ with or without a subscript denotes a vector in $R_p$; $\sigma$ with or without a subscript denotes a non-negative definite matrix of $p$th order, the $(j,k)$th elements, i.e., elements in the $j$th row and $k$th column, of $\sigma$ and $\sigma_n$ are denoted by $\sigma_{jk}$ and $\sigma_{jk}^{(n)}$, respectively, so that $\sigma=(\sigma_{jk})$ and $\sigma_n=(\sigma_{jk}^{(n)})$; and $\mu$ with or without a subscript denotes a $p$-dimensional measure with $\mu(R_p)<\infty$ and $\mu(\{0\})=0$.

LEMMA 8.2 The function $\psi$ defined by
(8.1) $\psi(t)=ia't-\dfrac12 t'\sigma t+\int_{R_p}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}\,d\mu$
is the logarithm of the characteristic function of an infinitely divisible distribution.

PROOF: First we shall show that $e^{I(t)}$ is a characteristic function, where $I(t)$ denotes the integral in (8.1). It is easily seen that the integrand in the right side of (8.1),

$\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}$,

is bounded and continuous with respect to $(x,t)$ in the domain $|x|>0$, $|t|\le T$ for each positive number $T$. Let us take an $\varepsilon$ such that $0<\varepsilon<1$ and let us consider the integral

$I_\varepsilon(t)=\int_{|x|>\varepsilon}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}\,d\mu$.

Divide the integration domain into disjoint subintervals $\Delta_k$ ($k=1,2,\dots,n$), choose a point $x_{(k)}$ from each $\Delta_k$, and make an approximation sum

$S=\sum_{k=1}^{n}\left(e^{it'x_{(k)}}-1-\dfrac{it'x_{(k)}}{1+x_{(k)}'x_{(k)}}\right)\dfrac{1+x_{(k)}'x_{(k)}}{x_{(k)}'x_{(k)}}\,\mu(\Delta_k)$.

Then we have

$S=\sum_{k=1}^{n}\alpha_k(e^{it'x_{(k)}}-1)+i\beta't$,

where

$\alpha_k=\dfrac{1+x_{(k)}'x_{(k)}}{x_{(k)}'x_{(k)}}\,\mu(\Delta_k)\ge 0$, $\beta=-\sum_{k=1}^{n}\dfrac{x_{(k)}}{x_{(k)}'x_{(k)}}\,\mu(\Delta_k)$.

It is easily verified that $\alpha_k(e^{it'x_{(k)}}-1)$ is the logarithm of the characteristic function of a $p$-dimensional random variable $x_{(k)}Y$, where $Y$ is a real random variable whose distribution is Poissonian with mean $\alpha_k$, hence, $e^S$ is also a $p$-dimensional characteristic function. Letting $\max_k$ (diameter of $\Delta_k$) tend to 0, $S$ converges to a continuous function $I_\varepsilon(t)$. Therefore, by the continuity theorem, $e^{I_\varepsilon(t)}$ is a characteristic function. By the same theorem, as

$\lim_{\varepsilon\to 0}I_\varepsilon(t)=\int_{|x|>0}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}\,d\mu=I(t)$,

$e^{I(t)}$ is a characteristic function. On the other hand $e^{ia't-\frac12 t'\sigma t}$ is a normal characteristic function. Therefore $e^{\psi(t)}$ is a characteristic function. Then, for each positive integer $n$, since $\psi(t)/n$ is written in the form (8.1) with $a$, $\sigma$, $\mu$ replaced by $a/n$, $\sigma/n$, $\mu/n$, $e^{\psi(t)/n}$ is also a characteristic function, hence $e^{\psi(t)}$ is the characteristic function of an infinitely divisible distribution. Furthermore, $\psi(t)$ is the logarithm of $e^{\psi(t)}$, as it is continuous and vanishes at $t=0$.

LEMMA 8.3 In (8.1) $a$, $\sigma$, and $\mu$ are uniquely determined by $\psi$.

PROOF:
Form
(8.2) $\varphi(t)=\psi(t)-\dfrac{1}{2^p}\int_{\|s-t\|\le 1}\psi(s)\,ds=\dfrac16\sum_{j=1}^{p}\sigma_{jj}+\int_{R_p}e^{it'x}\left(1-\prod_{j=1}^{p}\dfrac{\sin x_j}{x_j}\right)\dfrac{1+x'x}{x'x}\,d\mu$,
where it is supposed that $(\sin\xi)/\xi=1$ for $\xi=0$. Write
(8.3) $\nu(E)=\int_E\left(1-\prod_{j=1}^{p}\dfrac{\sin x_j}{x_j}\right)\dfrac{1+x'x}{x'x}\,d\mu+\dfrac16\sum_{j=1}^{p}\sigma_{jj}\,\varepsilon(E)$, $E\in B_p$,
where $\varepsilon(E)=1$ if $0\in E$, and $\varepsilon(E)=0$ if $0\notin E$. Then $\nu$ is a $p$-dimensional measure with $\nu(R_p)<\infty$ and it holds that
(8.4) $\varphi(t)=\int_{R_p}e^{it'x}\,d\nu$.
Write
(8.5) $g(x)=\left(1-\prod_{j=1}^{p}\dfrac{\sin x_j}{x_j}\right)\dfrac{1+x'x}{x'x}$ for $x\ne 0$, $g(0)=\dfrac16$.
Note that $g$ is continuous everywhere and there exist two constants $c_1$ and $c_2$ such that
(8.6) $0<c_1\le g(x)\le c_2<\infty$,
since $\lim_{|x|\to\infty}g(x)=1$, and
(8.7) $\lim_{x\to 0}g(x)=1/6$.
(See the end of this section.) From (8.4) $\nu$ is uniquely determined by $\varphi$, hence, by $\psi$. Write

$\nu_0(E)=\nu(E-\{0\})$, $E\in B_p$,

which is uniquely determined by $\nu$. Then (8.3) becomes

$\nu_0(E)=\int_E\left(1-\prod_{j=1}^{p}\dfrac{\sin x_j}{x_j}\right)\dfrac{1+x'x}{x'x}\,d\mu$, $E\in B_p$,

from which we see that $\mu$ is represented by

$\mu(E)=\int_E\left\{\left(1-\prod_{j=1}^{p}\dfrac{\sin x_j}{x_j}\right)\dfrac{1+x'x}{x'x}\right\}^{-1}d\nu_0$, $E\in B_p$,

and $\mu$ is uniquely determined by $\nu_0$. After all $\mu$ is uniquely determined by $\psi$. Hence it is just so with

$ia't-\dfrac12 t'\sigma t=\psi(t)-\int_{R_p}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}\,d\mu$.

By taking its real and imaginary parts it is seen that $a$ and $\sigma$ are uniquely determined by $\psi$.

LEMMA 8.4 If the function
(8.8) $\psi_n(t)=ia_n't-\dfrac12 t'\sigma_nt+\int_{R_p}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}\,d\mu_n$
converges to a function $l(t)$, continuous at the origin, at every point $t\in R_p$, then there exist $a$, $\sigma$, and $\mu$ such that
(8.9) $\lim a_n=a$, $\lim(\sigma_n+\tau_n)=\sigma+\tau$, $\lim\mu_n(E)=\mu(E)$
for every continuity set $E$ of $\mu$ whose closure $\bar E$ does not contain the origin, and $l$ coincides with $\psi$ determined by
(8.10) $\psi(t)=ia't-\dfrac12 t'\sigma t+\int_{R_p}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}\,d\mu$,
where $\tau_n=(\tau_{jk}^{(n)})$ and $\tau=(\tau_{jk})$ are matrices with the elements
(8.11) $\tau_{jk}^{(n)}=\int_{R_p}\dfrac{x_jx_k}{x'x}\,d\mu_n$, $\tau_{jk}=\int_{R_p}\dfrac{x_jx_k}{x'x}\,d\mu$.
Conversely if (8.9) holds, then $\psi_n$ converges to $\psi$.

PROOF: Assume the hypotheses of the direct part. Then $e^{l}=\lim e^{\psi_n}$ is, by the continuity theorem, a characteristic function. Let $l_1(t)$ be the logarithm of the characteristic function …
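(8.13) $\varphi_n(t)=\psi_n(t)-\dfrac{1}{2^p}\int_{\|s-t\|\le 1}\psi_n(s)\,ds$.
Then we have
(8.14) $\varphi_n(t)=\int_{R_p}e^{it'x}\,d\nu_n$
with
(8.15) $\nu_n(E)=\int_E\left(1-\prod_{j=1}^{p}\dfrac{\sin x_j}{x_j}\right)\dfrac{1+x'x}{x'x}\,d\mu_n+\dfrac16\sum_{j=1}^{p}\sigma_{jj}^{(n)}\,\varepsilon(E)$.
Let us write
(8.16) $\varphi(t)=l(t)-\dfrac{1}{2^p}\int_{\|s-t\|\le 1}l(s)\,ds$.
From (8.12), (8.13), and (8.16) it follows that
(8.17) $\lim\varphi_n(t)=\varphi(t)$, for all $t\in R_p$.
As $\varphi$ is continuous, by the continuity theorem there exists a $p$-dimensional measure $\nu$ such that
(8.18) $\lim\nu_n=\nu$.
Now put
(8.19) $\nu_{n0}(E)=\nu_n(E-\{0\})=\int_E\left(1-\prod_{j=1}^{p}\dfrac{\sin x_j}{x_j}\right)\dfrac{1+x'x}{x'x}\,d\mu_n$,
then from (8.18) and (8.19) it follows that
(8.20) $\lim\nu_{n0}(E)=\nu(E)$
for every continuity set $E$ of $\nu$ which does not contain the origin. Owing to (8.19) $\mu_n$ is represented by
(8.21) $\mu_n(E)=\int_E\left\{\left(1-\prod_{j=1}^{p}\dfrac{\sin x_j}{x_j}\right)\dfrac{1+x'x}{x'x}\right\}^{-1}d\nu_{n0}$, $E\in B_p$.
Hence by (8.20) we have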
(8.22) $\lim\mu_n(E)=\mu_0(E)$,
for every continuity set $E$, not containing 0, of $\nu$, where

$\mu_0(E)=\int_E\left\{\left(1-\prod_{j=1}^{p}\dfrac{\sin x_j}{x_j}\right)\dfrac{1+x'x}{x'x}\right\}^{-1}d\nu$, $E\in B_p$.

Since any continuity set of $\mu_0$ is a continuity set of $\nu$, (8.22) holds for every continuity set $E$, not containing 0, of $\mu_0$. Let us put

$\mu(E)=\mu_0(E-\{0\})$, $E\in B_p$.

Then we have
(8.23) $\lim\mu_n(E)=\mu(E)$
for every continuity set $E$ of $\mu$, whose closure $\bar E$ does not contain the origin.

We can show that $\{\mu_n(R_p);\ n=1,2,\dots\}$ is bounded. From (8.6), (8.21), and (8.19) we have

$\mu_n(R_p)\le c_1^{-1}\nu_{n0}(R_p)\le c_1^{-1}\nu_n(R_p)$,

and (8.18) implies that $\lim\nu_n(R_p)=\nu(R_p)$. Hence $\{\mu_n(R_p)\}$ is bounded.

Now $\psi_n$ can be written as
(8.24) $\psi_n(t)=ia_n't-\dfrac12 t'(\sigma_n+\tau_n)t+\int_{R_p}h(x,t)\,d\mu_n$,
where

$h(x,t)=\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}+\dfrac{(t'x)^2}{2(1+x'x)}\right)\dfrac{1+x'x}{x'x}$, $x\ne 0$, $h(0,t)=0$.

It is easily seen that for each $t_0\in R_p$, $\lim_{x\to 0,\ t\to t_0}h(x,t)=0$, and $h(x,t)$ is bounded and continuous in $x\in R_p$, $|t|\le T$ for each $T>0$. Hence, by Lemma 8.5 (below) we have
(8.25) $\lim\int_{R_p}h(x,t)\,d\mu_n=\int_{R_p}h(x,t)\,d\mu$.
From (8.24), (8.25) and (8.12) it is seen that the limit

$\lim\left\{ia_n't-\dfrac12 t'(\sigma_n+\tau_n)t\right\}$

exists and hence also the limits
(8.26) $\lim a_n=a$ (say), $\lim(\sigma_n+\tau_n)=\sigma^*$ (say)
exist.
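After all we have

$l(t)=ia't-\dfrac12 t'\sigma^*t+\int_{R_p}h(x,t)\,d\mu$,

which can be written as
(8.27) $l(t)=ia't-\dfrac12 t'\sigma t+\int_{R_p}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}\,d\mu$,
where
(8.28) $\sigma=\sigma^*-\tau$,
$\tau$ being defined by (8.11).

Next we must show that $\sigma$ is non-negative definite. Now, $\psi_n(t)$ can be written, for each $\varepsilon>0$, as

$\psi_n(t)=ia_n't+\int_{|x|>\varepsilon}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}\,d\mu_n+\int_{|x|\le\varepsilon}h(x,t)\,d\mu_n-\dfrac12\left\{t'\sigma_nt+\int_{|x|\le\varepsilon}\dfrac{(t'x)^2}{x'x}\,d\mu_n\right\}$.

Assume that the set $\{x;\ |x|>\varepsilon\}$ is a continuity set of $\mu$. Since each term, except the last in the right side, converges as $n$ tends to $\infty$ (cf. Lemma 8.5), so is also the case with the latter and we have

$l(t)=ia't+\int_{|x|>\varepsilon}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}\,d\mu+\int_{|x|\le\varepsilon}h(x,t)\,d\mu-\dfrac12\lim_{n\to\infty}\left\{t'\sigma_nt+\int_{|x|\le\varepsilon}\dfrac{(t'x)^2}{x'x}\,d\mu_n\right\}$.

Let $\varepsilon$ tend to 0. Then the third term in the right side converges to 0 and we have

$l(t)=ia't+\int_{|x|>0}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}\,d\mu-\dfrac12\lim_{\varepsilon\to 0}\lim_{n\to\infty}\left\{t'\sigma_nt+\int_{|x|\le\varepsilon}\dfrac{(t'x)^2}{x'x}\,d\mu_n\right\}$.

Comparing this with (8.27) we have

$t'\sigma t=\lim_{\varepsilon\to 0}\lim_{n\to\infty}\left\{t'\sigma_nt+\int_{|x|\le\varepsilon}\dfrac{(t'x)^2}{x'x}\,d\mu_n\right\}$,

from which it is seen that $\sigma$ is non-negative definite. This completes the proof of the direct part.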
To prove the converse part, let us rewrite (8.8) and (8.10) as

$\psi_n(t)=ia_n't-\dfrac12 t'(\sigma_n+\tau_n)t+\int_{R_p}h(x,t)\,d\mu_n$,

$\psi(t)=ia't-\dfrac12 t'(\sigma+\tau)t+\int_{R_p}h(x,t)\,d\mu$.

We can show the boundedness of the sequence $\{\mu_n(R_p),\ n=1,2,\dots\}$, which together with (8.9) and Lemma 8.5 implies that $\lim\psi_n=\psi$ and completes the proof. Now notice that
(8.29) $\mu_n(R_p)=\int_{R_p}\sum_{j=1}^{p}\dfrac{x_j^2}{x'x}\,d\mu_n=\sum_{j=1}^{p}\tau_{jj}^{(n)}$.
From the hypothesis (8.9) we have $\lim\sum_j(\sigma_{jj}^{(n)}+\tau_{jj}^{(n)})=\sum_j(\sigma_{jj}+\tau_{jj})$, hence, there exists a constant $K$, independent of $n$, such that
(8.30) $\sum_{j=1}^{p}(\sigma_{jj}^{(n)}+\tau_{jj}^{(n)})<K$, $n=1,2,\dots$.
Since $\sigma_{jj}^{(n)}\ge 0$, (8.29) and (8.30) imply that $\mu_n(R_p)<K$, $n=1,2,\dots$.

LEMMA 8.5 Assume that

$\mu_n(\{0\})=0$, $\mu_n(R_p)\le K$ ($n=0,1,2,\dots$), $\lim\mu_n(E)=\mu_0(E)$

for every continuity set $E$ of $\mu_0$ whose closure $\bar E$ does not contain the origin. If $h(x)$ is bounded and continuous in the whole space, and if
(8.31) $\lim_{x\to 0}h(x)=0$
then
(8.32) $\lim\int_{R_p}h(x)\,d\mu_n=\int_{R_p}h(x)\,d\mu_0$.
Moreover, if $\{x;\ |x|<\varepsilon\}$ is a continuity set of $\mu_0$,
(8.33) $\lim\int_{|x|<\varepsilon}h(x)\,d\mu_n=\int_{|x|<\varepsilon}h(x)\,d\mu_0$.

PROOF: For any $\delta>0$ we have

$\int_{R_p}h(x)\,d\mu_n-\int_{R_p}h(x)\,d\mu_0=\left[\int_{|x|>\delta}h(x)\,d\mu_n-\int_{|x|>\delta}h(x)\,d\mu_0\right]+\int_{|x|\le\delta}h(x)\,d\mu_n-\int_{|x|\le\delta}h(x)\,d\mu_0$.

If the set $\{x;\ |x|>\delta\}$ is a continuity set of $\mu_0$, the first term in the right side tends to zero, hence,

$\limsup_{n\to\infty}\left|\int_{R_p}h(x)\,d\mu_n-\int_{R_p}h(x)\,d\mu_0\right|\le 2K\sup_{|x|\le\delta}|h(x)|$.

This together with (8.31) implies (8.32) as $\delta$ may be chosen arbitrarily small. (8.33) follows from (8.32) and the fact that

$\lim\int_{|x|>\varepsilon}h(x)\,d\mu_n=\int_{|x|>\varepsilon}h(x)\,d\mu_0$.
THEOREM 8.1 Let $\varphi(t)$ be the characteristic function of any infinitely divisible distribution. Then $\varphi(t)$ cannot take the value 0, hence, has the continuous logarithm $\log\varphi(t)$, and it is uniquely represented in the form (8.1). Conversely, any function $\psi(t)$ defined by (8.1) is the logarithm of the characteristic function of some infinitely divisible distribution.

PROOF: Let $\varphi(t)$ be the characteristic function of an infinitely divisible distribution. It is shown in Lemma 8.1 that $\varphi(t)$ cannot take the value 0. Hence the continuous logarithm, $\log\varphi(t)$, exists. Then we have

$\log\varphi(t)=\lim_{n\to\infty}n\,(e^{(\log\varphi(t))/n}-1)$.

Since $e^{(\log\varphi(t))/n}$ is a characteristic function, it is written as

$e^{(\log\varphi(t))/n}=\int_{R_p}e^{it'x}\,dP_n$,

where $P_n$ is a $p$-dimensional distribution. Hence we have

$n\,(e^{(\log\varphi(t))/n}-1)=n\int_{R_p}(e^{it'x}-1)\,dP_n=it'\,n\int_{R_p}\dfrac{x}{1+x'x}\,dP_n+n\int_{R_p}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)dP_n$

$=ia_n't+\int_{R_p}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}\,d\mu_n$,

where

$a_n=n\int_{R_p}\dfrac{x}{1+x'x}\,dP_n$, $\mu_n(E)=n\int_E\dfrac{x'x}{1+x'x}\,dP_n$, $E\in B_p$.

Consequently we have $\log\varphi(t)=\lim\psi_n(t)$ with

$\psi_n(t)=ia_n't+\int_{R_p}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}\,d\mu_n$.

According to Lemma 8.4, $\log\varphi(t)$ can be represented in the form (8.1). The uniqueness of the representation is proved in Lemma 8.3. The converse is shown in Lemma 8.2.

THEOREM 8.2 If a sequence of infinitely divisible distributions converges to some distribution, the limit distribution is also infinitely divisible.

PROOF: Let $\{\varphi_n\}$ be the corresponding sequence of the characteristic functions and let $\varphi_0$ be the characteristic function of the limiting distribution. Then $\lim\varphi_n(t)=\varphi_0(t)$ for all $t$, hence, for each positive integer $k$, $\lim|\varphi_n(t)|^{2/k}=|\varphi_0(t)|^{2/k}$ for all $t$. Since $\varphi_n$ is infinitely divisible, $|\varphi_n|^{2/k}$ is also a characteristic function. And $|\varphi_0|^{2/k}$ is continuous. Hence, by the continuity theorem, $|\varphi_0|^{2/k}$ must be a characteristic function. As this holds for every $k=1,2,\dots$, $\varphi_0$ cannot take the value 0 (see the proof of Lemma 8.1). According to Lemma 7.2

$\lim\log\varphi_n(t)=\log\varphi_0(t)$, $t\in R_p$,

from which we see that, for any positive integer $k$, $\varphi_n^{1/k}(t)=e^{(\log\varphi_n(t))/k}$ converges to $\varphi_0^{1/k}(t)=e^{(\log\varphi_0(t))/k}$. By the continuity theorem $\varphi_0^{1/k}(t)$ is also a characteristic function. As this holds for every $k=1,2,\dots$, $\varphi_0$ must be infinitely divisible.

THEOREM 8.3 Let $L_n$, $n=0,1,2,\dots$, be infinitely divisible distributions defined by

$\psi_n(t)=ia_n't-\dfrac12 t'\sigma_nt+\int_{R_p}\left(e^{it'x}-1-\dfrac{it'x}{1+x'x}\right)\dfrac{1+x'x}{x'x}\,d\mu_n$.

Then $L_n$ converges to $L_0$ if and only if the following three conditions hold:

$\lim a_n=a_0$, $\lim(\sigma_n+\tau_n)=\sigma_0+\tau_0$, $\lim\mu_n(E)=\mu_0(E)$

for every continuity set $E$ of $\mu_0$ whose closure $\bar E$ does not contain the origin, where $\tau_n$ and $\tau_0$ are defined by (8.11).

This follows from Lemmas 7.2, 8.3, and 8.4.

Now we shall verify (8.7) by induction. When $p=1$, it is easily verified. Assume that (8.7) holds for some $p$. Then for any given $\varepsilon$ such that $0<\varepsilon<1/6$, there exists a $\delta>0$ such that
6
\
J-'
xa /
-~ 1
<
+e,'
if
0<~x}<8,
(8.35)
1
<(1
~--e
sina.+,.~]l+~+_____~,< 16--+~, '~,.,~, ~.,
if
0
Now assume that io+1
p
and w r i t e 10 I
Then both (8.34) and (8.35) hold and can be rewritten as (8.36)
i_(1+~
(8.37)
1--
s~, < l ~ i s i n,, ~
.
<1--/----
/I ~
'~ e,
e l - - ,
\6
+e
l+a~+~ <
From (8.36)
x10+,
<1--
--a
I+
~+,
then N
p+l
+----J' .1 < II s i n a~
,
/\l+e~
\6
<1--(i--A( \6
/\l+s~
+ ~+,
. +
-
On the other hand we have
s~,+~+, " 1 + s , + ~ <+ _1
e~, z,+_________/_<(1.+8) _~"~ e~,+~+~ l+s~+~+, l+e~ ~ 1 +:r10+, ~
l+e~
1+~;§
2 l+s~,+x~+l
Hence we have
8p§ < 1I 1 +810+, l xj w h e r e s10+,=s~+~+,=~:]a}.
<1--
--e --
--e
Assume that $\varepsilon$ is so small that
1
Then we have 1--
1 +2e
s~+~< 1 + e10+1
from which it follows that
H ~
a.~
<1--
1 + s:o+~
1 + .810+~
, p+l
(8.88) 1
Thus, for any given $\varepsilon$, $0<\varepsilon<1/6$, there exists a $\delta>0$ such that (8.38) holds if $\sum_j x_j^2<\delta$. Therefore (8.7) holds with $p$ replaced by $p+1$.
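The limit relation $\log\varphi(t)=\lim_n n\,(e^{(\log\varphi(t))/n}-1)$ used in the proof of Theorem 8.1 can be illustrated numerically. The sketch below assumes, purely as an example not treated in the text, the one-dimensional Poisson law, whose $n$-th root is again Poisson with parameter $\lambda/n$, so that $e^{(\log\varphi(t))/n}$ is available in closed form. The function names and the value $\lambda=2$ are hypothetical choices.

```python
import cmath


def poisson_log_cf(t: float, lam: float) -> complex:
    """Continuous logarithm of the Poisson(lam) characteristic function."""
    return lam * (cmath.exp(1j * t) - 1.0)


def poisson_cf(t: float, lam: float) -> complex:
    """Characteristic function of the Poisson(lam) law."""
    return cmath.exp(poisson_log_cf(t, lam))


def scaled_root_defect(t: float, lam: float, n: int) -> complex:
    """n * (phi^{1/n}(t) - 1), where phi^{1/n} is the Poisson(lam/n) c.f."""
    return n * (poisson_cf(t, lam / n) - 1.0)


if __name__ == "__main__":
    t, lam = 1.3, 2.0  # hypothetical test point and parameter
    target = poisson_log_cf(t, lam)
    for n in (10, 1_000, 1_000_000):
        err = abs(scaled_root_defect(t, lam, n) - target)
        print(n, err)  # the error shrinks roughly like 1/n
```

The error is of order $|\log\varphi(t)|^2/(2n)$, which is why the printed values decrease as $n$ grows.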
9. Infinitely divisible distributions in the generalized sense
A distribution is called infinitely divisible in the generalized sense (following J. L. Doob [4], p. 129) if, for each $\eta>0$, its distribution function $F$ can be written as a convolution of distribution functions $F_1,F_2,\dots,F_n$,
(9.1)   $F=F_1*F_2*\cdots*F_n,$
with
(9.2)   $\int_{|x|\ge\eta}dF_j(x)\le\eta,\qquad j=1,2,\dots,n.$
If $F$ is an infinitely divisible distribution function, for each $n$, $F$ can be written as a convolution $F=G_n*G_n*\cdots*G_n$ ($n$ times) with $G_n$ tending to the unit distribution function $U$. Thus any infinitely divisible distribution is infinitely divisible in the generalized sense. In this section it is shown that the converse is also true. We shall begin with the following lemmas.
LEMMA 9.1 Let $\{F_{nl};\ l=1,2,\dots,l_n,\ n=1,2,\dots\}$ be a sequence of one-dimensional distribution functions. Write
$$a_{nl}=\int_{|x|<1}x\,dF_{nl}(x).$$
If, for each $\varepsilon>0$,
$$\lim_n\max_l\int_{|x|\ge\varepsilon}dF_{nl}(x)=0,$$
then we have
$$\lim_n\max_l\int_{|x|<1}(x-a_{nl})^2\,dF_{nl}(x)=0.$$
Moreover, if the convolution $F_{n1}*F_{n2}*\cdots*F_{nl_n}$ converges to a distribution function and if $\varepsilon_1$ is a positive constant, then there exists a constant $c$, independent of $n$, such that
$$\sum_l\Big\{\int_{|x|\ge\varepsilon_1}dF_{nl}(x)+\int_{|x|<1}(x-a_{nl})^2\,dF_{nl}(x)\Big\}\le c,\qquad n=1,2,\dots;$$
$c$ depends on the given sequence $\{F_{nl}\}$ and $\varepsilon_1$ (see M. Loève [17], section II, 4, and section IV, 2*).
LEMMA 9.2 Let $\{F_{nl};\ l=1,2,\dots,l_n,\ n=1,2,\dots\}$ be a sequence of $p$-dimensional distribution functions. Assume that for each $\varepsilon>0$
(9.3)   $\lim_n\max_l\int_{|x|\ge\varepsilon}dF_{nl}(x)=0,$
and that the convolution $F_{n1}*F_{n2}*\cdots*F_{nl_n}$ converges to a distribution function. Let $\varphi_{nl}$ be the characteristic function of $F_{nl}$, and put
$$a_{nl}=\int_{\|x\|<1}x\,dF_{nl}(x),\qquad \gamma_{nl}(t)=\varphi_{nl}(t)\,e^{-it'a_{nl}}-1.$$
Then we have
(9.4)   $\lim_n\sum_l|\gamma_{nl}(t)|^2=0.$
PROOF: Let us denote the one-dimensional marginal distribution functions of $F_{nl}$ by $F_{nlj}$ ($j=1,2,\dots,p$), and let us put
$$b_{nlj}=\int_{|x|<1}x\,dF_{nlj}(x),\qquad b_{nl}=(b_{nl1},b_{nl2},\dots,b_{nlp}),\qquad a_{nl}=(a_{nl1},a_{nl2},\dots,a_{nlp}).$$
Then we have
$$|b_{nlj}-a_{nlj}|=\Big|\int_{|x_j|<1}x_j\,dF_{nl}(x)-\int_{\|x\|<1}x_j\,dF_{nl}(x)\Big|=\Big|\int_{|x_j|<1,\ \|x\|\ge1}x_j\,dF_{nl}(x)\Big|\le\int_{\|x\|\ge1}dF_{nl}(x),$$
and therefore
(9.5)   $|b_{nl}-a_{nl}|^2=\sum_j|b_{nlj}-a_{nlj}|^2\le p\int_{\|x\|\ge1}dF_{nl}(x).$
Japanese readers may also refer to Y. Kawada [12], Appendix H, lemma 6 and III, lemma 8.
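Now we have
$$|\gamma_{nl}(t)|=\Big|\int_{R_p}(e^{it'(x-a_{nl})}-1)\,dF_{nl}(x)\Big|
\le 2\int_{\|x\|\ge1}dF_{nl}(x)+\Big|\int_{\|x\|<1}t'(x-a_{nl})\,dF_{nl}(x)\Big|+\frac12\int_{\|x\|<1}|t'(x-a_{nl})|^2\,dF_{nl}(x)
=2I_1+I_2+I_3\quad\text{(say)},$$
where we used the well-known formula
$$|e^{i\theta}-1-i\theta|\le\theta^2/2,\qquad -\infty<\theta<\infty.$$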
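The formula $|e^{i\theta}-1-i\theta|\le\theta^2/2$ can be sanity-checked numerically. The following Python sketch evaluates both sides on a grid; the grid, the tolerance and the function name are hypothetical choices made only for this check, which is of course not a proof.

```python
import cmath


def taylor_remainder_gap(theta: float) -> float:
    """Return theta^2/2 - |exp(i*theta) - 1 - i*theta| (non-negative if the bound holds)."""
    remainder = abs(cmath.exp(1j * theta) - 1.0 - 1j * theta)
    return theta * theta / 2.0 - remainder


def bound_holds_on_grid(lo: int = -2000, hi: int = 2000, step: float = 0.01) -> bool:
    """Check the bound at theta = k*step for lo <= k <= hi, up to rounding error."""
    tol = 1e-12  # allowance for floating-point rounding
    return all(taylor_remainder_gap(k * step) >= -tol for k in range(lo, hi + 1))


if __name__ == "__main__":
    print(bound_holds_on_grid())
```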
But
$$I_2=\Big|t'a_{nl}-t'a_{nl}\int_{\|x\|<1}dF_{nl}(x)\Big|=\Big|t'a_{nl}\int_{\|x\|\ge1}dF_{nl}(x)\Big|\le|t|\,|a_{nl}|\,I_1\le p^{1/2}|t|\,I_1,$$
$$I_3\le\frac12|t|^2\int_{\|x\|<1}|x-a_{nl}|^2\,dF_{nl}(x)
\le|t|^2\Big(\int_{\|x\|<1}|x-b_{nl}|^2\,dF_{nl}(x)+|b_{nl}-a_{nl}|^2\Big)
\le|t|^2\Big(\int_{\|x\|<1}|x-b_{nl}|^2\,dF_{nl}(x)+pI_1\Big),$$
by (9.5). Hence, we have
(9.6)   $|\gamma_{nl}(t)|\le(2+p^{1/2}|t|+p|t|^2)\,I_1+|t|^2\int_{\|x\|<1}|x-b_{nl}|^2\,dF_{nl}(x).$
Now, by the inequality
$$\int_{\|x\|\ge1}dF_{nl}(x)\le\int_{|x|\ge1}dF_{nl}(x)$$
and the hypothesis (9.3) we have
(9.7)   $\lim_n\max_l I_1=\lim_n\max_l\int_{\|x\|\ge1}dF_{nl}(x)=0.$
According to Lemma 9.1
$$\lim_n\max_l\int_{|x|<1}(x-b_{nlj})^2\,dF_{nlj}(x)=0,\qquad j=1,2,\dots,p.$$
From this and the inequality
$$\int_{\|x\|<1}|x-b_{nl}|^2\,dF_{nl}(x)\le\sum_j\int_{|x|<1}(x-b_{nlj})^2\,dF_{nlj}(x)$$
we have
(9.8)   $\lim_n\max_l\int_{\|x\|<1}|x-b_{nl}|^2\,dF_{nl}(x)=0.$
From (9.6), (9.7), and (9.8) it follows that
(9.9)   $\lim_n\max_l|\gamma_{nl}(t)|=0.$
On the other hand, for each $j$ ($=1,2,\dots,p$), the sequence $\{F_{nlj};\ l=1,2,\dots,l_n,\ n=1,2,\dots\}$ of one-dimensional distribution functions satisfies the hypotheses of Lemma 9.1 and therefore has a constant $c_j$, the existence of which is mentioned in Lemma 9.1 with $\varepsilon_1=1$. Put $c=\max_j c_j$. Then
$$\sum_l\int_{\|x\|\ge1}dF_{nl}(x)\le\sum_l\sum_j\int_{|x|\ge1}dF_{nlj}(x)\le pc,$$
$$\sum_l\int_{\|x\|<1}|x-b_{nl}|^2\,dF_{nl}(x)\le\sum_l\sum_j\int_{|x|<1}(x-b_{nlj})^2\,dF_{nlj}(x)\le pc.$$
These together with (9.6) imply that
(9.10)   $\sum_l|\gamma_{nl}(t)|\le\{2+p^{1/2}|t|+(p+1)|t|^2\}\,pc=K(t)$   (say).
Hence,
(9.11)   $\sum_l|\gamma_{nl}(t)|^2\le\max_l|\gamma_{nl}(t)|\cdot\sum_l|\gamma_{nl}(t)|\le K(t)\max_l|\gamma_{nl}(t)|.$
(9.9) and (9.11) imply (9.4), q.e.d.
LEMMA 9.3 Assume the hypotheses in Lemma 9.2. Let $V$ be a neighborhood of the origin, and put
$$\bar a_{nl}=\int_Vx\,dF_{nl}(x),\qquad \bar\gamma_{nl}(t)=\varphi_{nl}(t)\,e^{-it'\bar a_{nl}}-1.$$
Then it holds that
(9.12)   $\lim_n\sum_l|\bar\gamma_{nl}(t)|^2=0.$
By a neighborhood of the origin we mean a bounded Borel set which contains a sphere with the origin as its centre.
PROOF: As $V$ is a neighborhood of the origin there exist two positive numbers $\varepsilon_1$ and $\varepsilon_2$ such that
$$S(0,\varepsilon_1)\subset V\subset S(0,\varepsilon_2),$$
where $S(0,\varepsilon)$ denotes the open sphere with centre 0 and radius $\varepsilon$, $\{x;\ |x|<\varepsilon\}$. We may assume that $\varepsilon_1\le1$. Write
$$a_{nl}=(a_{nl1},a_{nl2},\dots,a_{nlp}),\qquad \bar a_{nl}=(\bar a_{nl1},\bar a_{nl2},\dots,\bar a_{nlp}).$$
Then for each $j=1,2,\dots,p$, we have
(9.13)   $|a_{nlj}-\bar a_{nlj}|=\Big|\int_{\|x\|<1}x_j\,dF_{nl}(x)-\int_Vx_j\,dF_{nl}(x)\Big|=\Big|\int_{V^c,\ \|x\|<1}x_j\,dF_{nl}(x)-\int_{V,\ \|x\|\ge1}x_j\,dF_{nl}(x)\Big|\le(1+\varepsilon_2)\int_{|x|\ge\varepsilon_1}dF_{nl}(x),$
where $V^c$ denotes the complement of $V$. This together with (9.3) implies that
(9.14)   $\lim_n\max_l|a_{nlj}-\bar a_{nlj}|=0,\qquad j=1,2,\dots,p.$
By Lemma 9.1 there exists a constant $c$ such that
$$\sum_l\int_{|x|\ge\varepsilon_1}dF_{nl}(x)\le c,\qquad\text{for all } n.$$
Hence from (9.13) it follows that
(9.15)   $\sum_l|a_{nlj}-\bar a_{nlj}|\le(1+\varepsilon_2)\,c,\qquad\text{for all } n \text{ and } j.$
From (9.14) and (9.15) it follows that
$$\lim_n\sum_l|a_{nlj}-\bar a_{nlj}|^2=0,\qquad j=1,2,\dots,p.$$
Add these from $j=1$ to $j=p$. Then
(9.16)   $\lim_n\sum_l|a_{nl}-\bar a_{nl}|^2=0.$
On the other hand
$$|\gamma_{nl}(t)-\bar\gamma_{nl}(t)|=|\varphi_{nl}(t)(e^{-it'a_{nl}}-e^{-it'\bar a_{nl}})|\le|e^{-it'(a_{nl}-\bar a_{nl})}-1|\le|t'(a_{nl}-\bar a_{nl})|\le|t|\cdot|a_{nl}-\bar a_{nl}|.$$
This together with (9.16) implies that $\lim_n\sum_l|\gamma_{nl}(t)-\bar\gamma_{nl}(t)|^2=0$, and this together with (9.4) implies (9.12). Thus the lemma is proved.
THEOREM 9.1 Let $\{F_{nl};\ l=1,2,\dots,l_n,\ n=1,2,\dots\}$ be a sequence of distribution functions. If $F_{nl}$ converge to the unit distribution function $U$ uniformly in $l$ as $n$ tends to $\infty$, and the convolution $F_{n1}*F_{n2}*\cdots*F_{nl_n}$ converges to a distribution function $F$, then $F$ must be infinitely divisible.
The proof runs in the same way as in the one-dimensional case, but we shall give a proof, for completeness, following M. Loève [17].
PROOF: Let $\varphi_{nl}$ be the characteristic function of $F_{nl}$, let $V$ be a neighborhood of the origin, and put
(9.17)   $a_{nl}=\int_Vx\,dF_{nl}(x),\qquad \gamma_{nl}(t)=\varphi_{nl}(t)\,e^{-it'a_{nl}}-1,\qquad \log\varphi^*_{nl}(t)=ia_{nl}'t+\gamma_{nl}(t).$
Since $\log\varphi^*_{nl}(t)$ is written as
$$\log\varphi^*_{nl}(t)=ia_{nl}'t+\int_{R_p}(e^{it'x}-1)\,dF_{nl}(x+a_{nl}),$$
$\varphi^*_{nl}(t)$ is the characteristic function of an infinitely divisible distribution (see the proof of Lemma 8.2). Now from (9.17) it follows that
$$|\varphi_{nl}(t)-\varphi^*_{nl}(t)|=|1+\gamma_{nl}(t)-e^{\gamma_{nl}(t)}|\le\tfrac12\,e^{|\gamma_{nl}(t)|}\,|\gamma_{nl}(t)|^2\le5\,|\gamma_{nl}(t)|^2,$$
as $|\gamma_{nl}(t)|\le2$. Hence
(9.18)   $|\varphi_{nl}(t)-\varphi^*_{nl}(t)|\le5\,|\gamma_{nl}(t)|^2.$
Put
$$\varphi^*_n(t)=\prod_l\varphi^*_{nl}(t),\qquad \varphi_n(t)=\prod_l\varphi_{nl}(t).$$
Since the convolution of any finite sequence of infinitely divisible distributions is also infinitely divisible, $\varphi^*_n$ must be the characteristic function of an infinitely divisible distribution. On the other hand
(9.19)   $|\varphi_n(t)-\varphi^*_n(t)|\le\sum_l|\varphi_{nl}(t)-\varphi^*_{nl}(t)|,$
by using the fact, easily proved by induction, that $|a_l|\le1$ and $|b_l|\le1$, $l=1,2,\dots,l_n$, imply $|\prod_la_l-\prod_lb_l|\le\sum_l|a_l-b_l|$. From (9.18) and (9.19) it follows that
$$|\varphi_n(t)-\varphi^*_n(t)|\le5\sum_l|\gamma_{nl}(t)|^2.$$
According to Lemma 9.3 $\lim_n\sum_l|\gamma_{nl}(t)|^2=0$. Therefore
(9.20)   $\lim_n|\varphi_n(t)-\varphi^*_n(t)|=0.$
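The estimates (9.18) to (9.20) can be illustrated on a simple one-dimensional triangular array, assumed here for illustration only: $X_{nl}$, $l=1,\dots,n$, take the value 1 with probability $\lambda/n$ and 0 otherwise. With $V=\{x;\ |x|<1\}$ the centering constants vanish, so $\gamma_{nl}(t)=\varphi_{nl}(t)-1$ and the accompanying law $\varphi^*_n$ is Poisson with parameter $\lambda$. The function names and parameter values in the Python sketch below are hypothetical.

```python
import cmath


def bernoulli_cf(t: float, q: float) -> complex:
    """Characteristic function of a variable equal to 1 with probability q, else 0."""
    return 1.0 - q + q * cmath.exp(1j * t)


def accompanying_gap(t: float, lam: float, n: int):
    """Return (|phi_n - phi_n^*|, 5 * sum_l |gamma_nl|^2) for the Bernoulli(lam/n) array.

    Assumes V = {|x| < 1}: the only mass inside V sits at 0, so a_nl = 0.
    """
    q = lam / n
    gamma = bernoulli_cf(t, q) - 1.0       # gamma_nl(t), identical for every l
    phi_n = bernoulli_cf(t, q) ** n        # c.f. of the convolution (binomial law)
    phi_star = cmath.exp(n * gamma)        # product of exp(gamma_nl): Poisson(lam) c.f.
    bound = 5.0 * n * abs(gamma) ** 2      # right side obtained from (9.18) and (9.19)
    return abs(phi_n - phi_star), bound


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, accompanying_gap(0.7, 1.5, n))
```

In this example the bound equals $5\lambda^2|e^{it}-1|^2/n$, so both printed quantities tend to 0, in line with (9.20).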
Denote by $\varphi(t)$ the characteristic function of $F$. Then from the hypothesis it holds that
(9.21)   $\lim_n\varphi_n(t)=\varphi(t).$
(9.20) and (9.21) imply that $\lim_n\varphi^*_n(t)=\varphi(t)$. Therefore, according to Theorem 8.2, $F$ must be infinitely divisible.
COROLLARY An infinitely divisible distribution in the generalized sense is infinitely divisible.

10. Convergence theorem

THEOREM 10.1 Assume that $F_{nl}$, $l=1,2,\dots,l_n$, converge to $U$ uniformly in $1\le l\le l_n$ as $n\to\infty$. Then the convolution $F_{n1}*F_{n2}*\cdots*F_{nl_n}$ converges to the infinitely divisible distribution function defined by
(10.1)   $\psi(t)=ia't-\tfrac12t'\sigma t+\int_{R_p}\Big(e^{it'x}-1-\frac{it'x}{1+x'x}\Big)\frac{1+x'x}{x'x}\,d\mu,$
if and only if the following three conditions hold:
(10.2)   $\lim_n\sum_l\Big\{a_{nl}+\int_{R_p}\frac{x}{1+x'x}\,dF_{nl}(x+a_{nl})\Big\}=a,$
(10.3)   $\lim_n\sum_l\int_{R_p}\frac{x_jx_k}{1+x'x}\,dF_{nl}(x+a_{nl})=\sigma_{jk}+\int_{R_p}\frac{x_jx_k}{x'x}\,d\mu,\qquad j,k=1,2,\dots,p,$
(10.4)   $\lim_n\sum_l\int_E\frac{x'x}{1+x'x}\,dF_{nl}(x+a_{nl})=\mu(E),$
for every continuity set $E$, with closure not containing the origin, of $\mu$, where
$$a_{nl}=\int_Vx\,dF_{nl}(x),$$
and $V$ is an arbitrarily fixed neighborhood of the origin.
PROOF: The notations adopted in the proof of Theorem 9.1 are used. By (9.20) $\lim_n\varphi_n(t)=e^{\psi(t)}$ if and only if $\lim_n\varphi^*_n(t)=e^{\psi(t)}$. Now we have
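$$\log\varphi^*_n(t)=\sum_l\Big\{ia_{nl}'t+\int_{R_p}(e^{it'x}-1)\,dF_{nl}(x+a_{nl})\Big\}
=\sum_l\Big\{ia_{nl}'t+\int_{R_p}\frac{it'x}{1+x'x}\,dF_{nl}(x+a_{nl})+\int_{R_p}\Big(e^{it'x}-1-\frac{it'x}{1+x'x}\Big)dF_{nl}(x+a_{nl})\Big\},$$
hence,
$$\log\varphi^*_n(t)=it'\sum_l\Big\{a_{nl}+\int_{R_p}\frac{x}{1+x'x}\,dF_{nl}(x+a_{nl})\Big\}+\int_{R_p}\Big(e^{it'x}-1-\frac{it'x}{1+x'x}\Big)\frac{1+x'x}{x'x}\,d\mu_n,$$
where
$$\mu_n(E)=\sum_l\int_E\frac{x'x}{1+x'x}\,dF_{nl}(x+a_{nl}),\qquad E\in B_p.$$
From Theorem 8.3 $\varphi^*_n(t)$ converges to $e^{\psi(t)}$ if and only if (10.2)-(10.4) simultaneously hold. Thus, the proof is completed.
In the above proof, let $\{b_n\}$ be any sequence of vectors. Then we have
$$|\varphi_n(t)\,e^{-it'b_n}-\varphi^*_n(t)\,e^{-it'b_n}|=|\varphi_n(t)-\varphi^*_n(t)|,$$
and
(10.5)   $\log\{\varphi^*_n(t)\,e^{-it'b_n}\}=it'\Big[\sum_l\Big\{a_{nl}+\int_{R_p}\frac{x}{1+x'x}\,dF_{nl}(x+a_{nl})\Big\}-b_n\Big]+\int_{R_p}\Big(e^{it'x}-1-\frac{it'x}{1+x'x}\Big)\frac{1+x'x}{x'x}\,d\mu_n.$
Therefore we have the following
COROLLARY Assume that $F_{nl}$, $l=1,2,\dots,l_n$, converge to $U$ uniformly in $1\le l\le l_n$ as $n\to\infty$, and let $\{b_n\}$ be a sequence of vectors. Then the convolution $F_{n1}*F_{n2}*\cdots*F_{nl_n}*U(\cdot+b_n)$ converges to the distribution function defined by (10.1) if and only if
(10.6)   $\lim_n\Big[\sum_l\Big\{a_{nl}+\int_{R_p}\frac{x}{1+x'x}\,dF_{nl}(x+a_{nl})\Big\}-b_n\Big]=a,$
(10.3) and (10.4) hold simultaneously.
Let us notice that, in Theorem 10.1, if the limiting distribution is non-unit, the sequence $\{l_n\}$ must tend to $\infty$. To prove this, assume that $\{l_n\}$ has a bounded subsequence $\{l_{n(j)}\}$, $l_{n(j)}<L$, $j=1,2,\dots$, and put
$$G_n=F_{n1}*F_{n2}*\cdots*F_{nl_n}.$$
Then, for each $\varepsilon>0$, it holds that
$$\int_{|x|\ge\varepsilon}dG_{n(j)}(x)\le\sum_l\int_{|x|\ge\varepsilon/L}dF_{n(j)l}(x)\le L\max_l\int_{|x|\ge\varepsilon/L}dF_{n(j)l}(x)\to0.$$
Hence $\lim_jG_{n(j)}=U$; this contradicts the hypothesis.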
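The tail inequality used in the last step, namely that a sum of fewer than $L$ terms can exceed $\varepsilon$ in absolute value only if some term exceeds $\varepsilon/L$, can be checked exactly on small discrete distributions. The Python sketch below uses rational arithmetic; the three distributions in `EXAMPLE` and all function names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import product


def convolve(dists):
    """Exact distribution of the sum of independent variables given as {value: prob} dicts."""
    total = {Fraction(0): Fraction(1)}
    for d in dists:
        new = {}
        for (s, p), (x, q) in product(total.items(), d.items()):
            new[s + x] = new.get(s + x, Fraction(0)) + p * q
        total = new
    return total


def tail(dist, eps):
    """Probability mass of {|x| >= eps}."""
    return sum((p for x, p in dist.items() if abs(x) >= eps), Fraction(0))


# Hypothetical example: three independent one-dimensional summands.
EXAMPLE = [
    {Fraction(0): Fraction(9, 10), Fraction(1): Fraction(1, 10)},
    {Fraction(0): Fraction(4, 5), Fraction(-2): Fraction(1, 5)},
    {Fraction(0): Fraction(1, 2), Fraction(1, 4): Fraction(1, 2)},
]

if __name__ == "__main__":
    eps, L = Fraction(1), len(EXAMPLE)
    lhs = tail(convolve(EXAMPLE), eps)
    rhs = sum(tail(d, eps / L) for d in EXAMPLE)
    print(lhs, rhs, lhs <= rhs)
```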
Part III
Central limit theorem

11. General case

The normal distribution with mean vector $m$ and covariance matrix $\sigma=(\sigma_{jk})$ (matrix of the second order central moments $\sigma_{jk}$), i.e., the distribution with characteristic function $\exp(im't-\tfrac12t'\sigma t)$, is denoted by $N(m,\sigma)$. Unit distributions, though often denoted by $N(m,0)$, are not called normal in this paper. The most general version of the $p$-dimensional central limit theorem is given by the following theorem, from which various versions of the central limit theorem can be deduced. In Theorem 11.1 the same notations as in the preceding three sections are used.
THEOREM 11.1 Assume that
(11.1)   $\lim_n\max_{1\le l\le l_n}\int_{|x|\ge\varepsilon}dF_{nl}(x)=0,\qquad\text{for each }\varepsilon>0.$
Then the distribution defined by the convolution $F_{n1}*F_{n2}*\cdots*F_{nl_n}*U(\cdot+b_n)$ converges to a normal distribution $N(0,\sigma)$, if and only if the following three conditions hold:
(11.2)   $\lim_n\sum_l\int_{|x|\ge\varepsilon}dF_{nl}(x)=0,\qquad\text{for each }\varepsilon>0,$
(11.3)   $\lim_n\Big(\sum_l\int_Vx\,dF_{nl}(x)-b_n\Big)=0,$
(11.4)   $\lim_n\sum_l\Big\{\int_Vx_jx_k\,dF_{nl}(x)-\int_Vx_j\,dF_{nl}(x)\int_Vx_k\,dF_{nl}(x)\Big\}=\sigma_{jk},\qquad j,k=1,2,\dots,p,$
where $V$ is an arbitrarily fixed neighborhood of the origin.
Let $X_{n1},X_{n2},\dots,X_{nl_n}$ be independent random variables with distribution functions $F_{n1},F_{n2},\dots,F_{nl_n}$ for each $n=1,2,\dots$. Then (11.1) means that $X_{nl}$, $l=1,2,\dots,l_n$, converge to 0 in probability uniformly in $1\le l\le l_n$ as $n\to\infty$, i.e., that each term is asymptotically individually negligible. Under (11.1), (11.2) is equivalent to the condition that $\max_l|X_{nl}|$ converges to 0 in probability as $n\to\infty$, i.e., that the greatest term is asymptotically negligible (P. Lévy [16], §34).
Let us put
$$X'_{nl}(\omega)=\begin{cases}X_{nl}(\omega),&\text{if } X_{nl}(\omega)\in V,\\ 0,&\text{if } X_{nl}(\omega)\notin V.\end{cases}$$
Then (11.2) implies that the difference between the distribution functions of $X_{n1}+\cdots+X_{nl_n}-b_n$ and $X'_{n1}+\cdots+X'_{nl_n}-b_n$ is asymptotically negligible (see section 13), (11.3) means that the mean vector of $X'_{n1}+\cdots+X'_{nl_n}-b_n$ converges to the mean vector of the limit distribution, and (11.4) represents that the covariance matrix of $X'_{n1}+\cdots+X'_{nl_n}-b_n$ converges to the covariance matrix of the limit distribution.
Let $V_1$ and $V_2$ be two neighborhoods of the origin. Then under (11.2), (11.3) and (11.4) with $V=V_1$ are equivalent to (11.3) and (11.4) with $V=V_2$, respectively (see Lemma 15.1).
PROOF OF THEOREM 11.1 According to Corollary to Theorem 10.1 the distribution defined by $F_{n1}*F_{n2}*\cdots*F_{nl_n}*U(\cdot+b_n)$ converges to $N(0,\sigma)$ if and only if the following three conditions hold:
(11.5)   $\lim_n\sum_l\int_E\frac{x'x}{1+x'x}\,dF_{nl}(x+a_{nl})=0,\qquad\text{if }\bar E\not\ni0,$
(11.6)   $\lim_n\Big[\sum_l\Big(a_{nl}+\int_{R_p}\frac{x}{1+x'x}\,dF_{nl}(x+a_{nl})\Big)-b_n\Big]=0,$
(11.7)   $\lim_n\sum_l\int_{R_p}\frac{x_jx_k}{1+x'x}\,dF_{nl}(x+a_{nl})=\sigma_{jk},\qquad j,k=1,2,\dots,p,$
where
(11.8)   $a_{nl}=\int_Vx\,dF_{nl}(x).$
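Now (11.5) can be replaced by
$$\lim_n\sum_l\int_{|x|\ge\varepsilon}\frac{x'x}{1+x'x}\,dF_{nl}(x+a_{nl})=0,\qquad\text{for each }\varepsilon>0,$$
hence, by
(11.5')   $\lim_n\sum_l\int_{|x|\ge\varepsilon}dF_{nl}(x+a_{nl})=0,\qquad\text{for each }\varepsilon>0.$
And this implies that
$$\lim_n\sum_l\int_{V^c}dF_{nl}(x+a_{nl})=0,$$
from which it follows that
$$\lim_n\sum_l\int_{V^c}\frac{x}{1+x'x}\,dF_{nl}(x+a_{nl})=0,\qquad
\lim_n\sum_l\int_{V^c}\frac{x_jx_k}{1+x'x}\,dF_{nl}(x+a_{nl})=0,\qquad j,k=1,2,\dots,p.$$
Therefore, under (11.5'), (11.6) and (11.7) can, respectively, be replaced by
(11.6')   $\lim_n\Big[\sum_l\Big(a_{nl}+\int_V\frac{x}{1+x'x}\,dF_{nl}(x+a_{nl})\Big)-b_n\Big]=0,$
(11.7')   $\lim_n\sum_l\int_V\frac{x_jx_k}{1+x'x}\,dF_{nl}(x+a_{nl})=\sigma_{jk},\qquad j,k=1,2,\dots,p.$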
Thus, (11.5)-(11.7) are equivalent to (11.5')-(11.7'). Next we shall show that under (11.5'), (11.7') is equivalent to the following condition:
(11.7'')   $\lim_n\sum_l\int_Vx_jx_k\,dF_{nl}(x+a_{nl})=\sigma_{jk},\qquad j,k=1,\dots,p.$
To prove this, since
$$x_jx_k-\frac{x_jx_k}{1+x'x}=\frac{x_jx_k\,x'x}{1+x'x},$$
it is sufficient to show that each of (11.7') and (11.7''), together with (11.5'), implies
(11.9)   $\lim_n\sum_l\int_V\frac{x_jx_k\,x'x}{1+x'x}\,dF_{nl}(x+a_{nl})=0.$
Notice that there exist two positive numbers $\varepsilon_1<\varepsilon_2$ such that $S(0,\varepsilon_1)\subset V\subset S(0,\varepsilon_2)$. Assume (11.5') and (11.7'). Then dividing $\int_V$ into $\int_{V,\,|x|<\eta}+\int_{V,\,|x|\ge\eta}$, we have
$$\limsup_n\sum_l\Big|\int_V\frac{x_jx_k\,x'x}{1+x'x}\,dF_{nl}(x+a_{nl})\Big|
\le\limsup_n\sum_l\Big[\eta^2\int_V\frac{x'x}{1+x'x}\,dF_{nl}(x+a_{nl})+\varepsilon_2^2\int_{|x|\ge\eta}dF_{nl}(x+a_{nl})\Big]
=\eta^2\sum_j\sigma_{jj},$$
from which (11.9) is deduced as $\eta$ may be chosen arbitrarily small. Thus (11.5') and (11.7') imply (11.9). This holds even if (11.7') is replaced by (11.7''). Therefore under (11.5'), (11.7') is equivalent to (11.7''). It is similarly proved that under (11.5') and (11.7''), (11.6') is equivalent to the following condition:
(11.6'')   $\lim_n\Big[\sum_l\Big(a_{nl}+\int_Vx\,dF_{nl}(x+a_{nl})\Big)-b_n\Big]=0.$
By now it has been shown that (11.5), (11.6), and (11.7) are equivalent to (11.5'), (11.6'') and (11.7''). On the other hand, (11.3) is rewritten by (11.8) as
(11.3')   $\lim_n\Big(\sum_la_{nl}-b_n\Big)=0.$
And under (11.2), (11.4) is equivalent to
(11.4')   $\lim_n\sum_l\int_V(x_j-a_{nlj})(x_k-a_{nlk})\,dF_{nl}(x)=\sigma_{jk},\qquad j,k=1,2,\dots,p,$
where $a_{nl}=(a_{nl1},a_{nl2},\dots,a_{nlp})$. It is left to show that (11.5'), (11.6''), and (11.7'') are equivalent to (11.2), (11.3') and (11.4'). Now, from (11.1) and (11.8) it is easily proved that
(11.10)   $\lim_n\max_l|a_{nl}|=0.$
By (11.10), (11.5') is equivalent to (11.2). This follows from the fact that for each $\varepsilon>0$ there exists an $N=N(\varepsilon)$ such that $\max_l|a_{nl}|<\varepsilon/2$ for $n\ge N$, and hence
$$\int_{|x|\ge3\varepsilon/2}dF_{nl}(x)\le\int_{|x-a_{nl}|\ge\varepsilon}dF_{nl}(x)\le\int_{|x|\ge\varepsilon/2}dF_{nl}(x),\qquad\text{for } n\ge N,$$
where the second side is equal to $\int_{|x|\ge\varepsilon}dF_{nl}(x+a_{nl})$.
Under (11.5') or, equivalently, under (11.2), (11.6'') is equivalent to (11.3'). To prove this we shall deduce
(11.11)   $\lim_n\sum_l\int_Vx\,dF_{nl}(x+a_{nl})=0$
from (11.2):
'V+~|
f
Ct"+a.m),','r
V+an~
V
V
f
trt',fTr+%lf
a..,,f
v"a
<_::fI:=l~e 1 <
4::f
f
I~l~el/2
dF,,(x),
l~l>~i
n >__N(E,),
I~l~ib-'
where $V+a=\{x;\ x=y+a,\ y\in V\}$, hence,
i
,,_>N(:,), g
S - - - , , 2 , . . ., ,,,
I,~l~el/2
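from which (11.11) follows by (11.2). Under (11.5') or, equivalently, under (11.2), (11.7'') is equivalent to (11.4'). To prove this we shall deduce
(11.12)   $\lim_n\sum_l\Big|\int_{V-a_{nl}}x_jx_k\,dF_{nl}(x+a_{nl})-\int_Vx_jx_k\,dF_{nl}(x+a_{nl})\Big|=0$
from (11.2) and (11.5'):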
I~,f .~,~,dFn,(=+a.,)-~f~.,dF,,,(.+a.,) I P--~ill
P
CV-am),~
V,~(V-an/)o
l~l~$1
Izlaei
hence, (11.12) holds. Thus the proof is completed.
In the sequel, unless the contrary is explicitly stated, the following notations will be used: $X_{n1},X_{n2},\dots,X_{nl_n}$ denote independent $p$-dimensional random variables with distribution functions $F_{n1},F_{n2},\dots,F_{nl_n}$ for each $n=1,2,\dots$;
$$S_n=X_{n1}+X_{n2}+\cdots+X_{nl_n};$$
$a_n$ denotes a positive number and $b_n$ denotes a $p$-dimensional vector for each $n=1,2,\dots$; $X$, with or without a subscript, denotes a $p$-dimensional random variable and $F$ denotes a $p$-dimensional distribution function.
COROLLARY 1 The distribution of $(S_n-b_n)/a_n$ converges to a normal distribution $N(0,\sigma)$ and $X_{nl}/a_n$ converges to 0 in probability uniformly in $l$, $1\le l\le l_n$, if and only if the following three conditions hold:
(11.13)   $\lim_n\sum_l\int_{|x|\ge\varepsilon a_n}dF_{nl}(x)=0,\qquad\text{for each }\varepsilon>0,$
(11.14)   $\lim_n\frac1{a_n}\Big(\sum_l\int_{a_nV}x\,dF_{nl}(x)-b_n\Big)=0,$
(11.15)   $\lim_n\frac1{a_n^2}\sum_l\Big(\int_{a_nV}x_jx_k\,dF_{nl}(x)-\int_{a_nV}x_j\,dF_{nl}(x)\int_{a_nV}x_k\,dF_{nl}(x)\Big)=\sigma_{jk},\qquad j,k=1,2,\dots,p,$
where $V$ is any neighborhood of the origin and $a_nV=\{x;\ x=a_ny,\ y\in V\}$.
This is a generalization of W. Feller [6]'s Satz 1. Usually $a_nV$ is taken as $|x|<$ … $0$ for all $n$, and put $D_n=D_{S_n}(\alpha)$. Then $\{X_{nl}\}$ obeys the central limit theorem if and only if
(11.16)   $\lim_n\sum_l\int_{\|x\|\ge\varepsilon D_n}dF_{nl}(x)=0,\qquad\text{for each }\varepsilon>0,$
and the limits
(11.17)   $\lim_n\frac1{D_n^2}\sum_l\Big(\int_{\|x\|<D_n}x_jx_k\,dF_{nl}(x)-\int_{\|x\|<D_n}x_j\,dF_{nl}(x)\int_{\|x\|<D_n}x_k\,dF_{nl}(x)\Big)=\sigma_{jk}$ (say), $\qquad j,k=1,2,\dots,p,$
exist. In this case the distribution of
(11.18)   $\frac1{D_n}\Big(S_n-\sum_l\int_{\|x\|<D_n}x\,dF_{nl}(x)\Big)$
converges to the normal distribution $N(0,\sigma)$ with mean vector 0 and second order central moments $\sigma_{jk}$ defined by (11.17).
PROOF: If (11.16) and (11.17) hold, (11.13)-(11.15) hold with
$$a_n=D_n,\qquad b_n=\sum_l\int_{\|x\|<D_n}x\,dF_{nl}(x),\qquad V=\{x;\ \|x\|<1\},$$
hence, the distribution of (11.18) converges to $N(0,\sigma)$ by Corollary 1; and $X_{nl}/D_n$ converges to 0 in probability uniformly in $1\le l\le l_n$ by (11.16). Conversely assume that, for some sequences $\{a_n\}$ and $\{b_n\}$, the distribution of $(X_{n1}+\cdots+X_{nl_n}-b_n)/a_n$ converges to a normal distribution
and $X_{nl}/a_n$ converges to 0 in probability uniformly in $l$. Then, since $D_n/a_n$ converges to the $\alpha$-dispersion $D$ of the limit distribution by Lemma 4.4 and since $0$ …
12. Lindeberg's and Liapounov's conditions
Let $X$ be a random variable with $E(|X|^2)<\infty$*. Then $E(|X-EX|^2)$ will be called the variance of $X$ and will be denoted by $v(X)$. Then we have
$$v(X)=E(|X-EX|^2)=E(|X|^2)-|EX|^2.$$
Let $F$ be a distribution function with $\int_{R_p}|x|^2\,dF(x)<\infty$ and put $m=\int_{R_p}x\,dF(x)$. Then
$$\int_{R_p}|x-m|^2\,dF(x)$$
will be called the variance of the distribution defined by $F$. The variance of a multi-dimensional distribution is equal to the sum of the variances of its one-dimensional marginal distributions. A normal distribution $N(0,\sigma)$ with mean vector 0 and variance 1, $\sum_j\sigma_{jj}=1$, will be called a normalized normal distribution.
THEOREM 12.1 Suppose that each of $X_{nl}$ has the vanishing mean vector and the finite variance $v_{nl}$ and put $s_n=(v_{n1}+v_{n2}+\cdots+v_{nl_n})^{1/2}$. Let $N(0,\sigma)$ be a normalized normal distribution. Then the distribution of $(X_{n1}+\cdots+X_{nl_n})/s_n$ converges to the $N(0,\sigma)$ and $X_{nl}/s_n$ converges to 0 in probability uniformly in $1\le l\le l_n$ as $n\to\infty$, if and only if for each $\varepsilon>0$
(12.1)   $\lim_n\frac1{s_n^2}\sum_l\int_{|x|<\varepsilon s_n}x_jx_k\,dF_{nl}(x)=\sigma_{jk},\qquad j,k=1,2,\dots,p.$
The condition (12.1) is a generalization of Lindeberg's condition, well known in the one-dimensional case. The theorem can be proved by making use of characteristic functions in the same way as in the one-dimensional case, but here it is proved as a consequence of the results of the preceding section.
* $E$ denotes 'mean value of' or 'mean vector of'.
PROOF: According to Corollary 1 to Theorem 11.1 it is sufficient to prove that (12.1) is equivalent to the set of the following three conditions:
(12.2)   $\lim_n\sum_l\int_{|x|\ge\varepsilon s_n}dF_{nl}(x)=0,\qquad\text{for each }\varepsilon>0,$
(12.3)   $\lim_n\frac1{s_n}\sum_l\int_{|x|<s_n}x\,dF_{nl}(x)=0,$
(12.4)   $\lim_n\frac1{s_n^2}\sum_l\Big(\int_{|x|<s_n}x_jx_k\,dF_{nl}(x)-\int_{|x|<s_n}x_j\,dF_{nl}(x)\int_{|x|<s_n}x_k\,dF_{nl}(x)\Big)=\sigma_{jk},\qquad j,k=1,2,\dots,p.$
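Assume (12.1). Add (12.1) with $j=k$ from $j=1$ to $j=p$. Then
$$\lim_n\frac1{s_n^2}\sum_l\int_{|x|<\varepsilon s_n}|x|^2\,dF_{nl}(x)=\sum_j\sigma_{jj}=1,\qquad\text{for each }\varepsilon>0,$$
therefore, since from the definition of $s_n$
(12.5)   $\frac1{s_n^2}\sum_l\int_{R_p}|x|^2\,dF_{nl}(x)=1,$
it holds that
(12.6)   $\lim_n\frac1{s_n^2}\sum_l\int_{|x|\ge\varepsilon s_n}|x|^2\,dF_{nl}(x)=0,\qquad\text{for each }\varepsilon>0.$
(12.2) follows from (12.6) and the following inequality:
$$\sum_l\int_{|x|\ge\varepsilon s_n}dF_{nl}(x)\le\frac1{\varepsilon^2s_n^2}\sum_l\int_{|x|\ge\varepsilon s_n}|x|^2\,dF_{nl}(x).$$
(12.3) follows from (12.6) and the following inequality:
$$\frac1{s_n}\sum_l\Big|\int_{|x|<s_n}x\,dF_{nl}(x)\Big|\le\frac1{s_n^2}\sum_l\int_{|x|\ge s_n}|x|^2\,dF_{nl}(x),$$
which follows from
$$\Big|\int_{|x|<s_n}x\,dF_{nl}(x)\Big|=\Big|\int_{|x|\ge s_n}x\,dF_{nl}(x)\Big|\le\int_{|x|\ge s_n}|x|\,dF_{nl}(x)\le\frac1{s_n}\int_{|x|\ge s_n}|x|^2\,dF_{nl}(x),$$
where we used that the mean vector of $F_{nl}$ is 0. (12.4) follows from (12.1) with $\varepsilon=1$ and the fact that
$$\frac1{s_n^2}\sum_l\Big|\int_{|x|<s_n}x_j\,dF_{nl}(x)\int_{|x|<s_n}x_k\,dF_{nl}(x)\Big|\le\frac1{s_n^2}\sum_l\Big|\int_{|x|<s_n}x\,dF_{nl}(x)\Big|^2,$$
the right side of which tends to 0 by (12.6) and the preceding inequality.
Conversely assume that (12.2)-(12.4) hold. From (12.4) we have
(12.7)   $\lim_n\Big(\frac1{s_n^2}\sum_l\int_{|x|<s_n}|x|^2\,dF_{nl}(x)-\frac1{s_n^2}\sum_l\Big|\int_{|x|<s_n}x\,dF_{nl}(x)\Big|^2\Big)=\sum_j\sigma_{jj}=1.$
Form the difference (12.5)-(12.7). Then
$$\lim_n\Big(\frac1{s_n^2}\sum_l\int_{|x|\ge s_n}|x|^2\,dF_{nl}(x)+\frac1{s_n^2}\sum_l\Big|\int_{|x|<s_n}x\,dF_{nl}(x)\Big|^2\Big)=0,$$
from which it follows that
(12.8)   $\lim_n\frac1{s_n^2}\sum_l\Big|\int_{|x|<s_n}x\,dF_{nl}(x)\Big|^2=0.$
On the other hand
(12.9)   $\Big|\frac1{s_n^2}\sum_l\int_{|x|<s_n}x_j\,dF_{nl}(x)\int_{|x|<s_n}x_k\,dF_{nl}(x)\Big|\le\frac1{s_n^2}\sum_l\Big|\int_{|x|<s_n}x_j\,dF_{nl}(x)\Big|\,\Big|\int_{|x|<s_n}x_k\,dF_{nl}(x)\Big|\le\frac1{s_n^2}\sum_l\Big|\int_{|x|<s_n}x\,dF_{nl}(x)\Big|^2,$
hence, by (12.8), the left side of (12.9) tends to 0, and (12.4) becomes
(12.10)   $\lim_n\frac1{s_n^2}\sum_l\int_{|x|<s_n}x_jx_k\,dF_{nl}(x)=\sigma_{jk}.$
Moreover
(12.11)   $\Big|\frac1{s_n^2}\sum_l\int_{|x|<\varepsilon s_n}x_jx_k\,dF_{nl}(x)-\frac1{s_n^2}\sum_l\int_{|x|<s_n}x_jx_k\,dF_{nl}(x)\Big|\le\frac1{s_n^2}\sum_l\int_{|x|<s_n,\ |x|\ge\varepsilon s_n}|x|^2\,dF_{nl}(x)+\frac1{s_n^2}\sum_l\int_{|x|<\varepsilon s_n,\ |x|\ge s_n}|x|^2\,dF_{nl}(x)\le\sum_l\int_{|x|\ge\varepsilon s_n}dF_{nl}(x)+\varepsilon^2\sum_l\int_{|x|\ge s_n}dF_{nl}(x),$
hence, by (12.2), the left side of (12.11) tends to 0, and this together with (12.10) implies (12.1).
Next we shall consider the case when $X_{n1},X_{n2},\dots,X_{nl_n}$ are uniformly bounded.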
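Condition (12.1) of Theorem 12.1 can be evaluated in closed form for a simple two-dimensional array, assumed here only as an illustration: $l_n=n$ and each $X_{nl}$ is uniform on the four points $(\pm1,\pm1)$, so that the mean vector is 0, $v_{nl}=2$ and $s_n^2=2n$. The Python sketch below computes the left side of (12.1) before passing to the limit; the function name and the parameter values are hypothetical.

```python
from itertools import product
from math import hypot, sqrt


def lindeberg_matrix(n: int, eps: float):
    """(1/s_n^2) * sum_l of the integral of x_j x_k over {|x| < eps*s_n}, for the +-1 array.

    Assumes l_n = n identical summands, each uniform on the four points (+-1, +-1).
    """
    points = list(product((-1.0, 1.0), repeat=2))
    s2 = 2.0 * n                      # s_n^2 = sum of the variances E|X|^2 = 2
    radius = eps * sqrt(s2)           # truncation radius eps * s_n
    m = [[0.0, 0.0], [0.0, 0.0]]
    for x in points:
        if hypot(x[0], x[1]) < radius:    # point lies inside the truncation region
            for j in range(2):
                for k in range(2):
                    m[j][k] += n * 0.25 * x[j] * x[k]
    return [[v / s2 for v in row] for row in m]


if __name__ == "__main__":
    print(lindeberg_matrix(1, 0.5))     # truncation removes all mass for small n
    print(lindeberg_matrix(100, 0.2))   # for large n the truncation is inactive
```

As soon as $\varepsilon s_n>\sqrt2$ the truncation no longer removes any mass and the matrix equals $\sigma=\tfrac12 I$, whose trace is 1, so the limit law in this example is a normalized normal distribution.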
COROLLARY 1 Let $M_n=\max_l\sup_\omega|X_{nl}(\omega)|$. Suppose that $EX_{nl}=0$ for all $n$, $l$ and that
(12.12)   $\lim_n\frac{M_n}{s_n}=0.$
Let $N(0,\sigma)$ be a normalized normal distribution. Then the distribution of $(X_{n1}+\cdots+X_{nl_n})/s_n$ converges to the $N(0,\sigma)$ if and only if
(12.13)   $\lim_n\frac1{s_n^2}\sum_l\int_{R_p}x_jx_k\,dF_{nl}(x)=\sigma_{jk},\qquad j,k=1,2,\dots,p.$
PROOF: From (12.12) $X_{nl}/s_n$, $l=1,2,\dots,l_n$, converge to 0 uniformly in $l$ with probability 1, hence, in probability. Next, for each $\varepsilon>0$ there exists an $N$ such that $M_n<\varepsilon s_n$ for $n\ge N$, hence
$$\int_{|x|<\varepsilon s_n}x_jx_k\,dF_{nl}(x)=\int_{R_p}x_jx_k\,dF_{nl}(x),\qquad\text{for } n\ge N.$$
Therefore, (12.1) becomes equivalent to (12.13) and the corollary follows from the theorem.
COROLLARY 2 Suppose that $EX_{nl}=0$, and for some $\delta>0$,
$$\int_{R_p}|x|^{2+\delta}\,dF_{nl}(x)<\infty$$
for all $n$ and $l$, and that
(12.14)   $\lim_n\frac1{s_n^{2+\delta}}\sum_l\int_{R_p}|x|^{2+\delta}\,dF_{nl}(x)=0.$
Let $N(0,\sigma)$ be a normalized normal distribution. Then the distribution of $(X_{n1}+\cdots+X_{nl_n})/s_n$ converges to the $N(0,\sigma)$ and $X_{nl}/s_n$, $1\le l\le l_n$, converge to 0 in probability uniformly in $l$, if and only if (12.13) holds (cf. W. Hoeffding and H. Robbins [10], Appendix).
PROOF: For each $\varepsilon>0$
$$\frac1{s_n^2}\sum_l\int_{|x|\ge\varepsilon s_n}|x|^2\,dF_{nl}(x)\le\frac1{\varepsilon^\delta s_n^{2+\delta}}\sum_l\int_{R_p}|x|^{2+\delta}\,dF_{nl}(x).$$
This together with (12.14) implies that for each $\varepsilon>0$
$$\lim_n\frac1{s_n^2}\sum_l\int_{|x|\ge\varepsilon s_n}x_jx_k\,dF_{nl}(x)=0,\qquad j,k=1,2,\dots,p.$$
Hence, (12.1) becomes equivalent to (12.13) and the corollary follows from the theorem. (12.14) is called Liapounov's condition.

13. Generalization of P. Lévy's theorem

Let $\{X_n\}$ and $\{Y_n\}$ be sequences of $p$-dimensional random variables. If for any pair of sequences $\{a_n\}$ and $\{b_n\}$ the convergence of $\{(X_n-b_n)/a_n\}$ in distribution implies the convergence of $\{(Y_n-b_n)/a_n\}$ in distribution and conversely, then $\{X_n\}$ and $\{Y_n\}$ are said to be equivalent with respect to the convergence in distribution.
LEMMA 13.1 If $\lim\Pr\{X_n\ne Y_n\}=0$, $\{X_n\}$ and $\{Y_n\}$ are equivalent with respect to the convergence in distribution.
PROOF: This is obvious from the inequality
$$\sup_{x\in R_p}|F_n(a_nx+b_n)-G_n(a_nx+b_n)|\le\Pr\{X_n\ne Y_n\},$$
where $F_n$ and $G_n$ are the distribution functions of $X_n$ and $Y_n$, respectively.
LEMMA 13.2 Let $X$ and $Y$ be $p$-dimensional random variables and let $D_X$ and $D_Y$ be the dispersion functions of $X$ and $Y$, respectively. If $\Pr(X\ne Y)\le\delta<1/(4p)$, then
(13.1)   $D_Y(\alpha-2p\delta)\le D_X(\alpha)\le D_Y(\alpha+2p\delta)$   for $2p\delta$ …
[~x(1)-~r(l)[= i
12
l*
_
12+(y_y,)~ )[
12
12+(y-yt) 2 ~Pr(X-X'~
Y-Y')
$\le\Pr(X\ne Y)+\Pr(X'\ne Y')\le2\delta.$
Fix an $\alpha$ such that $0<\alpha<\alpha+2\delta<1$, and put $D_X(\alpha)=l$. If $l>0$ then, since $\psi_X$ is continuous at the point $l$, $\psi_X(l)=\alpha$ and
$$\psi_Y(l)\le\psi_X(l)+2\delta=\alpha+2\delta,$$
hence, $l\le D_Y(\alpha+2\delta)$. This holds also when $l=0$. Therefore $D_X(\alpha)\le D_Y(\alpha+2\delta)$ for $0<\alpha<\alpha+2\delta<1$. Interchanging $X$ and $Y$, we have
$$D_Y(\alpha)\le D_X(\alpha+2\delta),\qquad\text{for } 0<\alpha<\alpha+2\delta<1.$$
Thus, (13.1) holds for $p=1$.
(ii) Case: $p>1$. Let $X=(X_1,X_2,\dots,X_p)$ and $Y=(Y_1,Y_2,\dots,Y_p)$. Let $X'=(X_1',X_2',\dots,X_p')$ and $Y'=(Y_1',Y_2',\dots,Y_p')$ be other $p$-dimensional random variables such that $X_1',X_2',\dots,X_p'$ are independent, $Y_1',Y_2',\dots,Y_p'$ are independent and $(X_j',Y_j')$ has the same distribution as $(X_j,Y_j)$ has for each $j=1,2,\dots,p$. Put $X^*=X_1'+X_2'+\cdots+X_p'$ and $Y^*=Y_1'+Y_2'+\cdots+Y_p'$. Then
$$\Pr(X^*\ne Y^*)\le\sum_j\Pr(X_j'\ne Y_j')\le p\Pr(X\ne Y)\le p\delta.$$
Hence, by the one-dimensional case, (13.1) holds with $X$ and $Y$ replaced by $X^*$ and $Y^*$, respectively, which implies that (13.1) holds since by definition $D_X=D_{X^*}$ and $D_Y=D_{Y^*}$.
LEMMA 13.3 Let $v$ and $D$ be the variance and the dispersion function of a random variable $X$ respectively. Then
(13.2)   $\sqrt v\ge\sqrt{(1-\alpha)/2}\;D(\alpha),\qquad 0$ …
1-,(,)=f:_ l, ,' d
_< f_= d
hence
(13.3)   $\psi(l)\ge1-2v/l^2,\qquad 0<l<\infty.$
Fix an $\alpha$, $0<\alpha<1$, and put $l_1=D(\alpha)$. If $l_1>0$ then $\psi(l_1)=\alpha$, hence from (13.3)
$$\alpha\ge1-2v/D^2(\alpha),$$
which implies (13.2). If $l_1=0$, (13.2) is trivial.
THEOREM 13.1 Assume that there exists an $\alpha$ such that $D_{S_n}(\alpha)>0$ for all $n$, and put $D_n=D_{S_n}(\alpha)$. Let $S_n=(S_{n1},S_{n2},\dots,S_{np})$ and let $D(S_{nj})$ and $D(S_{nj}+S_{nk})$ be the $\alpha$-dispersions of $S_{nj}$ and $S_{nj}+S_{nk}$, respectively. Assume that
(13.4)   $\lim_n\max_l\int_{\|x\|>\varepsilon D_n}dF_{nl}(x)=0,\qquad\text{for each }\varepsilon>0.$
Then, for some sequences $\{a_n\}$ and $\{b_n\}$ the distribution of $(S_n-b_n)/a_n$ converges to a normal distribution, if and only if the following (13.5) holds and the limits (13.6) and (13.7) exist:
(13.5)   $\lim_n\sum_l\int_{\|x\|>\varepsilon D_n}dF_{nl}(x)=0,\qquad\text{for each }\varepsilon>0,$
(13.6)   $\lim_n\frac{D(S_{nj})}{D_n},\qquad j=1,2,\dots,p,$
(13.7)   $\lim_n\frac{D(S_{nj}+S_{nk})}{D_n},\qquad j,k=1,2,\dots,p.$
The condition (13.4) represents that $X_{nl}/D_n$, $1\le l\le l_n$, converge to 0 in probability uniformly in $l$ as $n\to\infty$, i.e., that each term of $X_{nl}$ is asymptotically individually negligible with respect to the dispersion of the sum $S_n=X_{n1}+\cdots+X_{nl_n}$. Under (13.4), (13.5) is equivalent to the condition that $\max_l|X_{nl}|/D_n$ converges to 0 in probability as $n\to\infty$, i.e., that the greatest term of $X_{nl}$, $1\le l\le l_n$, is asymptotically negligible with respect to the dispersion of the sum (cf. section 11).
PROOF: Assume that there exist $\{a_n\}$ and $\{b_n\}$ such that the distribution of $(S_n-b_n)/a_n$ converges to a normal distribution. Then so does the distribution of $(S_n-b_n)/D_n$ by Corollary 2 to Theorem 5.6. Moreover, $X_{nl}/D_n$, $1\le l\le l_n$, converge to 0 in probability uniformly in $l$ by (13.4). Therefore (13.5) holds from Corollary 2 to Theorem 11.1. Now write $b_n=(b_{n1},b_{n2},\dots,b_{np})$. Then the distributions of real random variables $(S_{nj}-b_{nj})/D_n$ and $\{(S_{nj}+S_{nk})-(b_{nj}+b_{nk})\}/D_n$ converge as $n\to\infty$, hence, the limits (13.6) and (13.7) exist.
Conversely, assume (13.5) and the existence of the limits (13.6) and (13.7). Since (13.5) implies
$$\lim_n\inf_{\varepsilon>0}\Big\{\varepsilon+\sum_l\int_{\|x\|>\varepsilon D_n}dF_{nl}(x)\Big\}=0,$$
there exists a sequence $\{\varepsilon_n\}$ such that $\varepsilon_n\to0$ and
(13.8)   $\sum_l\int_{\|x\|>\varepsilon_nD_n}dF_{nl}(x)=\delta_n$ (say) $\to0.$
Define $X'_{nl}$ by
$$X'_{nl}(\omega)=\begin{cases}X_{nl}(\omega),&\text{if }\|X_{nl}(\omega)\|\le\varepsilon_nD_n,\\ 0,&\text{otherwise},\end{cases}$$
and put $S'_n=X'_{n1}+X'_{n2}+\cdots+X'_{nl_n}$. Then we have
(13.9)   $\Pr(S_n\ne S'_n)\le\delta_n\to0,$
hence, by Lemma 13.1, $\{S_n\}$ and $\{S'_n\}$ are equivalent with respect to
the convergence in distribution. Let $D'_n$ be the dispersion function of $S'_n$. Then from Lemma 13.2 we have
$$D'_n(\alpha+2p\delta_n)\ge D_n(\alpha)=D_n,\qquad\text{if }\alpha+2p\delta_n<1.$$
Fix an $\alpha'$ such that $\alpha$ …
$$D'_n(\alpha')\ge D'_n(\alpha+2p\delta_n)\ge D_n.$$
Put $\sqrt{v(S'_n)}=s'_n$. By Lemma 13.3
$$s'_n\ge\sqrt{(1-\alpha')/2}\;D'_n(\alpha'),$$
hence,
(13.10)   $s'_n\ge\sqrt{(1-\alpha')/2}\;D_n,\qquad\text{for sufficiently large } n.$
Moreover, put $M_n=\max_l\sup_\omega\|X'_{nl}(\omega)-EX'_{nl}\|$.
Then
(13.11)   $M_n\le2\varepsilon_nD_n.$
(13.10), (13.11) and $\lim\varepsilon_n=0$ imply $\lim(M_n/s'_n)=0$. From Corollary 1 to Theorem 12.1 any subsequence of the distributions of $(S'_n-ES'_n)/s'_n$ has a subsequence converging to a normal distribution with mean vector 0. Therefore, by (13.9) the case is the same with $(S_n-ES'_n)/s'_n$, and hence, also with $(S_n-b_n)/D_n$ by Corollary 2 to Theorem 5.6, where $b_n=ES'_n$.
We shall show that the limit distribution does not depend on subsequences. Assume that the distribution of $(S_{n(i)}-b_{n(i)})/D_{n(i)}$ converges to a normal distribution $N(0,\sigma)$ as $n(i)\to\infty$. Then the distribution of $(S_{n(i)j}-b_{n(i)j})/D_{n(i)}$ converges to $N(0,\sigma_{jj})$, hence by Lemma 4.4 the limit
$$\lim_i\frac{D(S_{n(i)j})}{D_{n(i)}}=d$$
exists, where $d$ is the $\alpha$-dispersion of $N(0,\sigma_{jj})$. From the existence of (13.6) $d$, the $\alpha$-dispersion of $N(0,\sigma_{jj})$, does not depend on the subsequence. Hence, $\sigma_{jj}$ is independent of the subsequence for each $j=1,2,\dots,p$. Next, the distribution of $\{(S_{n(i)j}+S_{n(i)k})-(b_{n(i)j}+b_{n(i)k})\}/D_{n(i)}$ converges to $N(0,\sigma_{jj}+2\sigma_{jk}+\sigma_{kk})$; this together with the existence of (13.7) implies that $\sigma_{jj}+2\sigma_{jk}+\sigma_{kk}$ does not depend on the subsequence, hence $\sigma_{jk}$ does not depend on the subsequence. After all, $\sigma$ is independent of the subsequence, and the distribution of $(S_n-b_n)/D_n$ converges to the normal distribution $N(0,\sigma)$.
COROLLARY Assume that each term of $X_{nl}$, $1\le l\le l_n$, is asymptotically individually negligible with respect to the dispersion of the sum $S_n=X_{n1}+\cdots+X_{nl_n}$, and that the class of the distribution of $S_n$ converges to a class. Then the limiting class is normal if and only if the greatest term of $X_{nl}$, $1\le l\le l_n$, is asymptotically negligible with respect to the dispersion of $S_n$.

14. Feller's criterion and the attraction domains of normal distributions

In this section we want to generalize some results of W. Feller [7] to the multi-dimensional case.
THEOREM 14.1 Suppose that 0 is a median vector for each $X_{nl}$. Then $\{X_{nl}\}$ obeys the central limit theorem if and only if there exists a sequence of positive numbers $\{\lambda_n\}$ such that
(14.1)
$\lim_n\sum_l\int_{|x|>\lambda_n}dF_{nl}(x)=0,$
(14.2)   $\lim_n\frac1{\lambda_n^2}\sum_l\int_{|x|\le\lambda_n}|x|^2\,dF_{nl}(x)=\infty,$
and the limits
(14.3)   $\lim_n\frac1{a_n^2}\sum_l\Big(\int_{|x|\le\lambda_n}x_jx_k\,dF_{nl}(x)-\int_{|x|\le\lambda_n}x_j\,dF_{nl}(x)\int_{|x|\le\lambda_n}x_k\,dF_{nl}(x)\Big)=\sigma_{jk}$ (say), $\qquad j,k=1,2,\dots,p,$
exist, where
(14.4)   $a_n^2=\sum_l\Big\{\int_{|x|\le\lambda_n}|x|^2\,dF_{nl}(x)-\Big|\int_{|x|\le\lambda_n}x\,dF_{nl}(x)\Big|^2\Big\}.$
In this case the distribution of $(S_n-b_n)/a_n$ converges to $N(0,\sigma)$, where $\sigma=(\sigma_{jk})$ is defined by (14.3) and $b_n$ is defined by
(14.5)   $b_n=\sum_l\int_{|x|\le\lambda_n}x\,dF_{nl}(x).$
Note that (14.3) and (14.4) imply that
(14.6)   $\sum_j\sigma_{jj}=1.$
The theorem holds even if '$>$' and '$\le$' are simultaneously replaced by '$\ge$' and '$<$', respectively.
PROOF: To prove the 'if' part, assume (14.1)-(14.3) and define $a_n>0$ and $b_n$ by (14.4) and (14.5), respectively. By the Schwarz inequality and the fact that 0 is a median we have
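$$\Big(\int_{|x|\le\lambda_n,\ x_j>0}x_j\,dF_{nl}(x)\Big)^2\le\int_{|x|\le\lambda_n,\ x_j>0}dF_{nl}(x)\int_{|x|\le\lambda_n,\ x_j>0}x_j^2\,dF_{nl}(x)\le\frac12\int_{|x|\le\lambda_n,\ x_j>0}x_j^2\,dF_{nl}(x).$$
Similarly
$$\Big(\int_{|x|\le\lambda_n,\ x_j<0}x_j\,dF_{nl}(x)\Big)^2\le\frac12\int_{|x|\le\lambda_n,\ x_j<0}x_j^2\,dF_{nl}(x).$$
Therefore, since
$$\int_{|x|\le\lambda_n,\ x_j>0}x_j\,dF_{nl}(x)\ge0\qquad\text{and}\qquad\int_{|x|\le\lambda_n,\ x_j<0}x_j\,dF_{nl}(x)\le0,$$
we get
$$\Big(\int_{|x|\le\lambda_n}x_j\,dF_{nl}(x)\Big)^2\le\frac12\int_{|x|\le\lambda_n}x_j^2\,dF_{nl}(x).$$
Summing over $j$ we obtain
$$\Big|\int_{|x|\le\lambda_n}x\,dF_{nl}(x)\Big|^2\le\frac12\int_{|x|\le\lambda_n}|x|^2\,dF_{nl}(x);$$
this together with (14.4) implies
(14.7)   $a_n^2\ge\frac12\sum_l\int_{|x|\le\lambda_n}|x|^2\,dF_{nl}(x).$
From (14.2) and (14.7) it follows that
(14.8)   $\lim_n\frac{a_n}{\lambda_n}=\infty.$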
Define X~ by
(~) l! X.,(~), 0 ,
if
I x.,(~) I -< a.,
otherwise,
and put s ' = x : 1 + X : , + 999 + x & ,
M,=max Then _Y~,,X', ..., ~ n
sup iX ~ ( ~ ) - E X ~ i.
are independent for each n,
(14.9) Mn ~ 2an, bn=ES', a~=v(S'), and (14.8) means that the covariance matrix of S~/an converges to a. From (14.8) and (14.9) we have (14.10)
lira Mn --0.
Then from Corollary 1 to Theorem 12.1 the distribution of $(S'_n - b_n)/a_n$ converges to $N(0, \sigma)$. Moreover

$$\Pr(S_n \ne S'_n) \le \sum_l \Pr(X_{nl} \ne X'_{nl}) = \sum_l \Pr(|X_{nl}| > \lambda_n) = \sum_l \int_{|x| > \lambda_n} dF_{nl}(x);$$

this together with (14.1) implies that $\lim \Pr(S_n \ne S'_n) = 0$. Therefore, from Lemma 13.1, the distribution of $(S_n - b_n)/a_n$ converges to $N(0, \sigma)$. The asymptotic uniform negligibility of $X_{nl}/a_n$ follows from (14.1) and (14.8). This completes the proof of the 'if' part.

To prove the 'only if' part, assume that $\{X_{nl}\}$ obeys the central limit theorem. Then from Corollary 1 to Theorem 11.1 we have for some sequence of positive numbers $\{a_n\}$ and for each $\varepsilon > 0$
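(14.11) $$\lim_{n} \sum_l \int_{|x| > \varepsilon a_n} dF_{nl}(x) = 0,$$

(14.12) $$\lim_{n} \frac{1}{a_n^2} \sum_l \Big( \int_{|x| \le \varepsilon a_n} x_j x_k\, dF_{nl}(x) - \int_{|x| \le \varepsilon a_n} x_j\, dF_{nl}(x) \int_{|x| \le \varepsilon a_n} x_k\, dF_{nl}(x) \Big) = \sigma_{jk}, \quad j, k = 1, 2, \ldots, p.$$

We may assume that $\sum_j \sigma_{jj} = 1$ without loss of generality by Theorem 5.2. From (14.11) and (14.12), for each positive integer $m$ there exists an $n_m$ such that, for all $n \ge n_m$,

$$\sum_l \int_{|x| > a_n/m} dF_{nl}(x) < \frac{1}{m},$$

$$\Big| \frac{1}{a_n^2} \sum_l \Big( \int_{|x| \le a_n/m} x_j x_k\, dF_{nl}(x) - \int_{|x| \le a_n/m} x_j\, dF_{nl}(x) \int_{|x| \le a_n/m} x_k\, dF_{nl}(x) \Big) - \sigma_{jk} \Big| < \frac{1}{m}, \quad j, k = 1, 2, \ldots, p.$$

It may be assumed that $n_1 < n_2 < n_3 < \cdots$. Define $\lambda_n$, for $n \ge n_1$, as follows:

(14.13) $$\lambda_n = \frac{a_n}{m} \quad \text{for} \quad n_m \le n < n_{m+1}.$$

Then if $n \ge n_m$,

$$\sum_l \int_{|x| > \lambda_n} dF_{nl}(x) < \frac{1}{m},$$

$$\Big| \frac{1}{a_n^2} \sum_l \Big( \int_{|x| \le \lambda_n} x_j x_k\, dF_{nl}(x) - \int_{|x| \le \lambda_n} x_j\, dF_{nl}(x) \int_{|x| \le \lambda_n} x_k\, dF_{nl}(x) \Big) - \sigma_{jk} \Big| < \frac{1}{m}.$$

Hence, we have (14.1) and

(14.14) $$\lim_{n} \frac{1}{a_n^2} \sum_l \Big( \int_{|x| \le \lambda_n} x_j x_k\, dF_{nl}(x) - \int_{|x| \le \lambda_n} x_j\, dF_{nl}(x) \int_{|x| \le \lambda_n} x_k\, dF_{nl}(x) \Big) = \sigma_{jk}, \quad j, k = 1, 2, \ldots, p.$$

Sum (14.14) with $j = k$ from $j = 1$ to $j = p$. Then, $\sum_j \sigma_{jj} = 1$ being assumed,

(14.15) $$\lim_{n} \frac{a_n'^2}{a_n^2} = 1,$$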
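where $a_n'$ is defined by (14.4). From (14.13) $\lim (a_n/\lambda_n) = \infty$; this together with (14.15) implies $\lim (a_n'/\lambda_n) = \infty$, hence we have (14.2). From (14.14) and (14.15) it follows that (14.3) holds. Thus we have deduced (14.1)-(14.3).

It is possible to derive the 'if' part from Corollary 1 to Theorem 11.1. The proof proceeds as follows. Assume (14.1)-(14.3) and define $a_n > 0$ and $b_n$ by (14.4) and (14.5), respectively. It is sufficient to prove (11.13)-(11.15) with $V = \{x ; |x| < 1\}$ from these assumptions. From (14.1) and (14.8) it follows that (11.13) holds for each $\varepsilon > 0$. From (14.8), $a_n > \lambda_n$ for sufficiently large $n$. Write $b_n = (b_{n1}, b_{n2}, \ldots, b_{np})$ where $b_n$ is defined by (14.5). Then, assuming $a_n > \lambda_n$, we have

$$\frac{1}{a_n} \Big| \sum_l \int_{|x| < a_n} x_j\, dF_{nl}(x) - b_{nj} \Big| = \frac{1}{a_n} \Big| \sum_l \int_{\lambda_n < |x| < a_n} x_j\, dF_{nl}(x) \Big| \le \sum_l \int_{|x| > \lambda_n} dF_{nl}(x);$$

this together with (14.1) implies (11.14) with $V = \{x ; |x| < 1\}$. Now for $n$ such that $a_n > \lambda_n$

$$\frac{1}{a_n^2} \Big| \sum_l \int_{|x| < a_n} x_j x_k\, dF_{nl}(x) - \sum_l \int_{|x| \le \lambda_n} x_j x_k\, dF_{nl}(x) \Big| \le \sum_l \int_{|x| > \lambda_n} dF_{nl}(x)$$

and

$$\frac{1}{a_n^2} \Big| \sum_l \Big( \int_{|x| < a_n} x_j\, dF_{nl} \int_{|x| < a_n} x_k\, dF_{nl} - \int_{|x| \le \lambda_n} x_j\, dF_{nl} \int_{|x| \le \lambda_n} x_k\, dF_{nl} \Big) \Big| \le 2 \sum_l \int_{|x| > \lambda_n} dF_{nl}(x);$$

these together with (14.1) imply

(14.16) $$\lim_{n} \frac{1}{a_n^2} \sum_l \Big[ \Big( \int_{|x| < a_n} x_j x_k\, dF_{nl} - \int_{|x| < a_n} x_j\, dF_{nl} \int_{|x| < a_n} x_k\, dF_{nl} \Big) - \Big( \int_{|x| \le \lambda_n} x_j x_k\, dF_{nl} - \int_{|x| \le \lambda_n} x_j\, dF_{nl} \int_{|x| \le \lambda_n} x_k\, dF_{nl} \Big) \Big] = 0.$$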
(11.15) with $V = \{x ; |x| < 1\}$ follows from (14.16) and (14.3). Thus the proof is completed.

We shall need the following

LEMMA 14.1 Let $X_1, X_2, \ldots$ be independent random variables with the same distribution function $F$ and assume that for some sequences $\{a_n\}$ and $\{b_n\}$, the distribution of $(X_1 + \cdots + X_n - b_n)/a_n$ converges to a normal distribution $N(0, \sigma)$. Then we have

(14.17) $$\lim a_n = \infty, \qquad \lim (a_{n+1}/a_n) = 1.$$

PROOF: Take an $\alpha$ such that the $\alpha$-dispersion of $F$ is positive, and denote by $D_n$ the $\alpha$-dispersion of $X_1 + \cdots + X_n$ ($n = 1, 2, \ldots$). Then we have

(14.18) $$0 < D_1 \le D_2 \le \cdots,$$

(14.19) $$\lim (D_n/a_n) = D, \qquad 0 < D < \infty,$$

where $D$ is the $\alpha$-dispersion of $N(0, \sigma)$. From (14.18) it follows that $\lim D_n = \infty$ or $0 < \lim D_n < \infty$ exists. If the latter occurs then from (14.19) a finite $\lim a_n = a$ exists, $0 < a < \infty$, and from Theorem 5.2 the distribution of $X_1 + \cdots + X_n - b_n$ converges to $N(0, a^2\sigma)$. Let $\varphi$ and $\psi$ be the characteristic functions of $F$ and $N(0, a^2\sigma)$. Then we have $\lim \varphi^n(t)\, e^{-i b_n' t} = \psi(t)$, hence,

$$|\psi(t)| = \lim |\varphi(t)|^n = \begin{cases} 1, & \text{if } |\varphi(t)| = 1, \\ 0, & \text{otherwise,} \end{cases}$$

which contradicts the fact that $\psi(t) = \exp\big(-\tfrac{1}{2} a^2\, t'\sigma t\big)$. Therefore it must hold that $\lim D_n = \infty$, and this together with (14.19) implies that $\lim a_n = \infty$. Now, since the distribution of $(X_1 + \cdots + X_{n+1} - b_{n+1})/a_{n+1}$ converges to $N(0, \sigma)$ and $X_{n+1}/a_{n+1}$ converges to 0 in probability, the distribution of $(X_1 + \cdots + X_n - b_{n+1})/a_{n+1}$ converges to $N(0, \sigma)$ (see H. Cramér [2], p. 254). As the distribution of $(X_1 + \cdots + X_n - b_n)/a_n$ converges also to $N(0, \sigma)$ it follows that $\lim (a_{n+1}/a_n) = 1$ from Theorem 5.4.
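When the hypothesis of Lemma 14.1 holds, $F$ is said to belong to the attraction domain of the normal distribution $N(0, \sigma)$.

THEOREM 14.2 Let $F$ be a distribution function with the vanishing median vector and let $N(0, \sigma)$ be a normalized normal distribution, $\sum_j \sigma_{jj} = 1$. Then $F$ belongs to the attraction domain of $N(0, \sigma)$ if and only if the following conditions hold:

(14.20) $$\lim_{u \to \infty} \frac{u^2 \int_{|x| > u} dF(x)}{\int_{|x| \le u} |x|^2\, dF(x)} = 0,$$

(14.21) $$\lim_{u \to \infty} \frac{\sigma_{jk}(u)}{v(u)} = \sigma_{jk}, \quad j, k = 1, 2, \ldots, p,$$

where

(14.22) $$\sigma_{jk}(u) = \int_{|x| \le u} x_j x_k\, dF(x) - \int_{|x| \le u} x_j\, dF(x) \int_{|x| \le u} x_k\, dF(x),$$

(14.23) $$v(u) = \int_{|x| \le u} |x|^2\, dF(x) - \Big| \int_{|x| \le u} x\, dF(x) \Big|^2.$$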
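Condition (14.20) can be checked in closed form for a simple one-dimensional family ($p = 1$, where (14.21) is automatically satisfied). The following is a minimal sketch, assuming the symmetric density $f(x) = (\alpha/2)\,|x|^{-\alpha-1}$ on $|x| \ge 1$; this family and the function name `feller_ratio` are illustrative assumptions and do not appear in the paper. For this density $\int_{|x|>u} dF = u^{-\alpha}$, and $\int_{|x| \le u} x^2\, dF$ equals $\frac{\alpha}{2-\alpha}(u^{2-\alpha} - 1)$ for $\alpha \ne 2$ and $2 \log u$ for $\alpha = 2$.

```python
import math

def feller_ratio(alpha, u):
    """Ratio in (14.20) for the hypothetical symmetric density
    f(x) = (alpha / 2) * |x|**(-alpha - 1) on |x| >= 1, with p = 1 and u > 1."""
    tail = u ** (-alpha)                      # integral of dF over |x| > u
    if alpha == 2.0:
        truncated_second_moment = 2.0 * math.log(u)
    else:
        truncated_second_moment = alpha / (2.0 - alpha) * (u ** (2.0 - alpha) - 1.0)
    return u * u * tail / truncated_second_moment

for alpha in (3.0, 2.0, 1.5):
    print(alpha, [feller_ratio(alpha, u) for u in (1e2, 1e4, 1e6)])
```

Under these assumptions the ratio tends to 0 for $\alpha = 3$ (finite variance) and also for $\alpha = 2$ (infinite variance, decaying like $1/(2 \log u)$), while for $\alpha = 1.5$ it tends to $1/3$, so (14.20) fails in that case.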
Let $X$ be any random variable with distribution function $F$, and define $X_u$ by

$$X_u(\omega) = \begin{cases} X(\omega), & \text{if } |X(\omega)| \le u, \\ 0, & \text{otherwise.} \end{cases}$$

Then $\sigma_{jk}(u)$, $j, k = 1, 2, \ldots, p$, and $v(u)$ are the second order central moments and the variance of $X_u$, and $\sigma_{jk}(u)/v(u)$ are the second order central moments of the normalized variable $(X_u - E X_u)/\sqrt{v(X_u)}$.

PROOF OF THEOREM 14.2 To prove the 'if' part, it is sufficient, by Theorem 14.1, to deduce from (14.20) the existence of a sequence of positive numbers $\{\lambda_n\}$ such that

(14.24) $$\lim n \int_{|x| > \lambda_n} dF(x) = 0, \qquad \lim \frac{n}{\lambda_n^2} \int_{|x| \le \lambda_n} |x|^2\, dF(x) = \infty, \qquad \lim \lambda_n = \infty.$$

If for some $u_0$

$$\int_{|x| > u_0} dF(x) = 0,$$

then a sequence $\{\lambda_n\}$ satisfying (14.24) is obtained by $\lambda_n = n^{1/4}$. Therefore we may assume that for all $u > 0$

$$\int_{|x| > u} dF(x) > 0.$$
For each positive i n t e g e r m, define ~ . by
Then w e have lim ~---- ~ ,
(14.25)
2 F r o m (14.20), (14.25) and (14.26) we h a v e
]im- f "+'~ pb~ I"=1-<~
Hence we can choose a sequence of positive n u m b e r s nl
np f
I 9 I~dF(x) ~ ~2,
for
n _~ n~.
Define ,1,, by a,,=/~,,~
for
np ~ n < np+l.
T h e n we have lira a~= o%
n f
Ixl~dF(x)>_p
nf
dF(x)~ 1 Izl>~n
fo~ n z , ' < n < n , . , ,
P
hence, (14.24) holds.

Conversely, assume that $F$ belongs to the attraction domain of $N(0, \sigma)$ and let $\{a_n\}$ be a sequence which satisfies the condition of Lemma 14.1. Then from Corollary 1 to Theorem 11.1 we have

(14.27) $$\lim n \int_{|x| > a_n} dF(x) = 0,$$

(14.28) $$\lim \frac{n}{a_n^2}\, \sigma_{jk}(a_n) = \sigma_{jk}, \quad j, k = 1, 2, \ldots, p.$$

From (14.28) we have

(14.29) $$\lim \frac{n\, v(a_n)}{a_n^2} = \sum_j \sigma_{jj} = 1.$$

This together with (14.27) implies

$$\lim \frac{a_n^2 \int_{|x| > a_n} dF(x)}{v(a_n)} = 0,$$

hence,

(14.30) $$\lim \frac{a_n^2 \int_{|x| > a_n} dF(x)}{\int_{|x| \le a_n} |x|^2\, dF(x)} = 0.$$
From (14.17) there exists $N$ such that $a_{n+1} < 2a_n$ for all $n \ge N$. We may assume that $a_n = D_n/D$ (see the proof of Lemma 14.1), hence that $a_n \le a_{n+1}$ for all $n$. Let $\{u_n\}$ be an arbitrary sequence such that

(14.31) $$a_n \le u_n \le a_{n+1}.$$

Then for $n \ge N$, as $a_n \le u_n < 2a_n$, we have

$$\frac{u_n^2 \int_{|x| > u_n} dF(x)}{\int_{|x| \le u_n} |x|^2\, dF(x)} \le \frac{4 a_n^2 \int_{|x| > a_n} dF(x)}{\int_{|x| \le a_n} |x|^2\, dF(x)};$$

this together with (14.30) implies

$$\lim \frac{u_n^2 \int_{|x| > u_n} dF(x)}{\int_{|x| \le u_n} |x|^2\, dF(x)} = 0.$$

Since $u_n$ is arbitrary except for (14.31) and since $\lim a_n = \infty$, we have (14.20). From (14.28) and (14.29) we have

(14.32) $$\lim_{n \to \infty} \frac{\sigma_{jk}(a_n)}{v(a_n)} = \sigma_{jk}, \quad j, k = 1, 2, \ldots, p.$$
From (14.29) it follows that the sequence

(14.33) $$\Big\{ \frac{a_n^2}{n\, v(a_n)} ;\ n = 1, 2, \ldots \Big\}$$

is bounded. From (14.27), (14.33) and $a_n \le u_n < 2a_n$ it can be proved that

(14.34) $$\lim \Big( \frac{\sigma_{jk}(u_n)}{v(a_n)} - \frac{\sigma_{jk}(a_n)}{v(a_n)} \Big) = 0.$$

(See the proof of (14.16).) (14.32) and (14.34) imply

$$\lim \frac{\sigma_{jk}(u_n)}{v(a_n)} = \sigma_{jk}, \quad j, k = 1, 2, \ldots, p.$$

Sum these with $j = k$ from $j = 1$ to $j = p$, then

$$\lim \frac{v(u_n)}{v(a_n)} = 1.$$

From the last two equations it follows that

$$\lim \frac{\sigma_{jk}(u_n)}{v(u_n)} = \sigma_{jk}, \quad j, k = 1, 2, \ldots, p,$$

hence, (14.21) holds.

THEOREM 14.3 If $F$ belongs to the attraction domain of a normalized normal distribution $N(0, \sigma)$, then for each $\alpha$ such that $0 < \alpha < 2$, the $\alpha$-th absolute moment of $F$ is finite:

$$\int |x|^\alpha\, dF(x) < \infty,$$

hence, the mean vector of $F$

$$m = \int x\, dF(x)$$

is well defined and the distribution of $(X_1 + \cdots + X_n - nm)/a_n$ converges to $N(0, \sigma)$ for a sequence $\{a_n\}$.

The proof runs in the same way as in the one-dimensional case.

15. Reduction to the one-dimensional case

By now, in this part, the various versions of the multi-dimensional central limit theorem have been studied from Theorem 11.1, which was proved from the general convergence theorem on the infinitely divisible
multi-dimensional distributions. In this section we wish to show that Theorem 11.1 in the multi-dimensional case can be reduced to that in the one-dimensional case. First let us notice the following fact.

LEMMA 15.1 Let $V_1$ and $V_2$ be two neighborhoods of the origin. Then under (11.2), (11.3) and (11.4) with $V = V_1$ are equivalent to (11.3) and (11.4) with $V = V_2$, respectively.

PROOF: This follows from the following facts:
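(15.1) $$\lim \Big( \sum_l \int_{V_1} x\, dF_{nl}(x) - \sum_l \int_{V_2} x\, dF_{nl}(x) \Big) = 0,$$

(15.2) $$\lim \sum_l \Big[ \Big( \int_{V_1} x_j x_k\, dF_{nl} - \int_{V_1} x_j\, dF_{nl} \int_{V_1} x_k\, dF_{nl} \Big) - \Big( \int_{V_2} x_j x_k\, dF_{nl} - \int_{V_2} x_j\, dF_{nl} \int_{V_2} x_k\, dF_{nl} \Big) \Big] = 0, \quad j, k = 1, 2, \ldots, p.$$

To prove these, note that there exist positive numbers $\varepsilon$ and $\delta$ such that $S(0, \varepsilon) \subset V_j \subset S(0, \delta)$ for $j = 1, 2$. Now

$$\Big| \sum_l \int_{V_1} x\, dF_{nl} - \sum_l \int_{V_2} x\, dF_{nl} \Big| = \Big| \sum_l \int_{V_1 \cap V_2^c} x\, dF_{nl} - \sum_l \int_{V_2 \cap V_1^c} x\, dF_{nl} \Big| \le \delta \sum_l \int_{|x| > \varepsilon} dF_{nl}(x)$$

and this together with (11.2) implies (15.1). Similarly

$$\Big| \sum_l \int_{V_1} x_j x_k\, dF_{nl} - \sum_l \int_{V_2} x_j x_k\, dF_{nl} \Big| \le \delta^2 \sum_l \int_{|x| > \varepsilon} dF_{nl}(x),$$

$$\Big| \sum_l \Big( \int_{V_1} x_j\, dF_{nl} \int_{V_1} x_k\, dF_{nl} - \int_{V_2} x_j\, dF_{nl} \int_{V_2} x_k\, dF_{nl} \Big) \Big| = \Big| \sum_l \Big[ \Big( \int_{V_1} x_j\, dF_{nl} - \int_{V_2} x_j\, dF_{nl} \Big) \int_{V_1} x_k\, dF_{nl} + \int_{V_2} x_j\, dF_{nl} \Big( \int_{V_1} x_k\, dF_{nl} - \int_{V_2} x_k\, dF_{nl} \Big) \Big] \Big| \le 2\delta^2 \sum_l \int_{|x| > \varepsilon} dF_{nl}(x)$$

and these together with (11.2) imply (15.2).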
The following lemma is useful for the reduction of the multi-dimensional central limit theorem to the one-dimensional case.

LEMMA 15.2 Let $\{X_n\}$ be a sequence of $p$-dimensional random variables. Then the distribution of $X_n$ converges to a normal distribution $N(0, \sigma)$, if and only if for each $t \in R_p$ the distribution of $t'X_n$ converges to the distribution $N(0, t'\sigma t)$.

This is well-known and is easily proved by making use of the characteristic functions. Notice that if $t'\sigma t = 0$, $N(0, t'\sigma t)$ denotes the unit distribution which has the whole probability 1 placed in the origin.

Now, in the one-dimensional case, Theorem 11.1 becomes

THEOREM 11.1' Let $\xi_{n1}, \xi_{n2}, \ldots, \xi_{nl_n}$ be independent real random variables with distribution functions $H_{n1}, H_{n2}, \ldots, H_{nl_n}$ for each positive integer $n$, let $\{\beta_n\}$ be a sequence of real numbers, and let $v$ be a non-negative real number. Furthermore assume that

$$\lim \max_l \int_{|x| > \varepsilon} dH_{nl}(x) = 0, \quad \text{for each } \varepsilon > 0.$$

Then the distribution of $\xi_{n1} + \xi_{n2} + \cdots + \xi_{nl_n} - \beta_n$ converges to the distribution $N(0, v)$, if and only if the following three conditions hold:

$$\lim \sum_l \int_{|x| > \varepsilon} dH_{nl}(x) = 0, \quad \text{for each } \varepsilon > 0,$$

$$\lim \Big( \sum_l \int_{|x| < 1} x\, dH_{nl}(x) - \beta_n \Big) = 0,$$

$$\lim \sum_l \Big[ \int_{|x| < 1} x^2\, dH_{nl}(x) - \Big( \int_{|x| < 1} x\, dH_{nl}(x) \Big)^2 \Big] = v.$$
If $v > 0$, this theorem is a slightly modified form of a well-known version of the one-dimensional central limit theorem (see W. Feller [6], Satz 1), for the distribution of $\xi_{n1} + \cdots + \xi_{nl_n} - \beta_n$ converges to $N(0, v)$ if and only if the distribution of $(\xi_{n1} + \cdots + \xi_{nl_n} - \beta_n)/\sqrt{v}$ converges to $N(0, 1)$. If $v = 0$ Theorem 11.1' becomes a version of the law of large numbers.

LEMMA 15.3 If (11.1) holds, then for each $t \in R_p$ and for each $\varepsilon > 0$

$$\lim \max_l \int_{|t'x| > \varepsilon} dF_{nl}(x) = 0.$$

This is obvious from

$$\int_{|t'x| > \varepsilon} dF_{nl}(x) \le \int_{|x| > \varepsilon/|t|} dF_{nl}(x), \quad \text{for } t \ne 0,$$

which follows from $|t'x| \le |t| \cdot |x|$.

Now we can reduce Theorem 11.1 to Theorem 11.1'. This follows from Lemma 15.2, 15.3, and the following

LEMMA 15.4

(15.3) $$\lim \sum_l \int_{|t'x| > \varepsilon} dF_{nl}(x) = 0, \quad \text{for each } \varepsilon > 0,$$

(15.4) $$\lim \Big( \sum_l \int_{|t'x| < 1} t'x\, dF_{nl}(x) - t'b_n \Big) = 0,$$

(15.5) $$\lim \sum_l \Big[ \int_{|t'x| < 1} (t'x)^2\, dF_{nl}(x) - \Big( \int_{|t'x| < 1} t'x\, dF_{nl}(x) \Big)^2 \Big] = t'\sigma t,$$
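hold for each $t \in R_p$, if and only if (11.2)-(11.4) hold.

To prove the 'if' part assume (11.2)-(11.4) and fix a $t \ne 0$ (if $t = 0$, (15.3)-(15.5) are trivial). Then (15.3) follows from (11.2) and the following inequality:

$$\sum_l \int_{|t'x| > \varepsilon} dF_{nl}(x) \le \sum_l \int_{|x| > \varepsilon/|t|} dF_{nl}(x).$$

We may assume that $V = \{x ; |x| < 1/|t|\}$. Now

$$\sum_l \int_{|t'x| < 1} t'x\, dF_{nl} = \sum_l \int_{|x| < 1/|t|} t'x\, dF_{nl} + \sum_l \int_{|t'x| < 1,\ |x| \ge 1/|t|} t'x\, dF_{nl} = \sum_l \int_{|x| < 1/|t|} t'x\, dF_{nl} + \theta \sum_l \int_{|x| \ge 1/|t|} dF_{nl},$$

where $|\theta| \le 1$. Furthermore

$$\sum_l \int_{|t'x| < 1} t'x\, dF_{nl} - t'b_n = t' \Big( \sum_l \int_{|x| < 1/|t|} x\, dF_{nl} - b_n \Big) + \theta \sum_l \int_{|x| \ge 1/|t|} dF_{nl}$$

and this together with (11.3) and (11.2) implies (15.4). Next

$$\sum_l \Big[ \int_{|t'x| < 1} (t'x)^2\, dF_{nl} - \Big( \int_{|t'x| < 1} t'x\, dF_{nl} \Big)^2 \Big] = \sum_l \Big\{ \int_{|x| < 1/|t|} (t'x)^2\, dF_{nl} - \Big( \int_{|x| < 1/|t|} t'x\, dF_{nl} \Big)^2 \Big\}$$
$$+ \sum_l \Big\{ \int_{|t'x| < 1,\ |x| \ge 1/|t|} (t'x)^2\, dF_{nl} - \Big( \int_{|t'x| < 1,\ |x| \ge 1/|t|} t'x\, dF_{nl} \Big)^2 - 2 \int_{|x| < 1/|t|} t'x\, dF_{nl} \int_{|t'x| < 1,\ |x| \ge 1/|t|} t'x\, dF_{nl} \Big\} = T_1 + T_2 \ \text{(say)}.$$

$T_1$ tends to $t'\sigma t$ by (11.4), and $T_2$ tends to 0 as

$$|T_2| \le 4 \sum_l \int_{|x| \ge 1/|t|} dF_{nl}(x).$$

Therefore (15.5) holds.

To prove the 'only if' part, assume that (15.3)-(15.5) hold for each $t \in R_p$. Then (11.2) follows from (15.3) and the following inequality:

$$\sum_l \int_{|x| > \varepsilon} dF_{nl}(x) \le \sum_{j=1}^{p} \sum_l \int_{|x_j| > \varepsilon/\sqrt{p}} dF_{nl}(x).$$

Write $b_n = (b_{n1}, b_{n2}, \ldots, b_{np})$. Then for each $j$, we have

$$\sum_l \int_{|x| < 1} x_j\, dF_{nl} - b_{nj} = \Big( \sum_l \int_{|x_j| < 1} x_j\, dF_{nl} - b_{nj} \Big) + \theta \sum_l \int_{|x| \ge 1} dF_{nl} = U_1 + U_2 \ \text{(say)},$$

where $|\theta| \le 1$. $U_1$ tends to 0 by (15.4), $U_2$ tends to 0 by (11.2), and hence (11.3) with $V = \{x ; |x| < 1\}$ holds. Furthermore from (11.2) and (15.5) it follows that

$$\lim \sum_l \Big[ \int_{|x| < 1} (t'x)^2\, dF_{nl}(x) - \Big( \int_{|x| < 1} t'x\, dF_{nl}(x) \Big)^2 \Big] = t'\sigma t$$

(see the proof of Lemma 15.1). Since this holds for each $t$, (11.4) with $V = \{x ; |x| < 1\}$ must hold.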
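The role of the quadratic form $t'\sigma t$ in Lemma 15.2 and in (15.5) can be illustrated with a finite sample. The following is a minimal sketch, assuming an empirical sample in place of a distribution; the helper names `covariance`, `projected_variance` and `quadratic_form` and the test data are hypothetical. It shows only the algebraic identity that the variance of the projection $t'X$ equals $t'\sigma t$ when $\sigma$ is the covariance matrix of $X$; it does not verify any convergence statement.

```python
import random

def covariance(samples):
    """Covariance matrix (divisor n) of a list of p-dimensional points."""
    n, p = len(samples), len(samples[0])
    mean = [sum(x[j] for x in samples) / n for j in range(p)]
    return [[sum((x[j] - mean[j]) * (x[k] - mean[k]) for x in samples) / n
             for k in range(p)] for j in range(p)]

def projected_variance(samples, t):
    """Variance (divisor n) of the one-dimensional projection t'x."""
    proj = [sum(tj * xj for tj, xj in zip(t, x)) for x in samples]
    m = sum(proj) / len(proj)
    return sum((y - m) ** 2 for y in proj) / len(proj)

def quadratic_form(t, sigma):
    """The scalar t' sigma t."""
    p = len(t)
    return sum(t[j] * sigma[j][k] * t[k] for j in range(p) for k in range(p))

random.seed(1)
# hypothetical correlated three-dimensional sample
samples = []
for _ in range(1000):
    z1, z2, z3 = random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)
    samples.append((z1, 0.5 * z1 + z2, z2 - 2.0 * z3))
sigma = covariance(samples)
for t in [(1.0, 0.0, 0.0), (1.0, -1.0, 2.0), (0.3, 0.3, 0.3)]:
    print(projected_variance(samples, t), quadratic_form(t, sigma))
```

Each printed pair agrees up to rounding, which is why the one-dimensional limit in Lemma 15.2 has variance $t'\sigma t$.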
REFERENCES

[1] H. Cramér, Random variables and probability distributions, Cambridge Tracts in Math. no. 36 (1937).
[2] H. Cramér, Mathematical methods of statistics, Princeton Univ. Press (1946).
[3] W. Doeblin, Sur l'ensemble de puissances d'une loi de probabilité, Studia Math. 9, 71-96 (1940).
[4] J. L. Doob, Stochastic Processes, John Wiley and Sons (1953).
[5] C. G. Esseen, Fourier analysis of distribution functions, Acta Math. 77, 1-125 (1945).
[6] W. Feller, Über den zentralen Grenzwertsatz der Wahrscheinlichkeitsrechnung, Math. Zeit. 40, 521-559 (1935).
[7] W. Feller, Über den zentralen Grenzwertsatz der Wahrscheinlichkeitsrechnung II, Math. Zeit. 42, 301-312 (1937).
[8] W. Feller, The fundamental limit theorems in probability, Bull. Amer. Math. Soc. 51, 800-832 (1945).
[9] B. Gnedenko, Sur la distribution limite du terme maximum d'une série aléatoire, Ann. Math. 44, 423-453 (1943).
[10] W. Hoeffding and H. Robbins, The central limit theorem for dependent random variables, Duke Math. Jour. 15, 773-780 (1948).
[11] Kiyosi Ito, The theory of probability (in Japanese), Iwanami, Tokyo (1953).
[12] Yukiyoshi Kawada, The theory of probability (in Japanese), 4th ed., Kyoritsusha, Tokyo (1952).
[13] A. Khintchine, Déduction nouvelle d'une formule de M. Paul Lévy, Bull. Univ. d'Etat Moscou, Sér. Internat., Sect. A, Math. et Méca. 1, 1-5 (1937).
[14] A. Khintchine, Ueber Klassenkonvergenz von Verteilungsgesetzen, Izvestiya Nauchno-Issled. Inst. Math. u. Mech. Univ. Tomsk, 1, 258-261 (1937).
[15] Kiyonori Kunisawa, On an analytical method in the theory of independent random variables, Ann. Inst. Statist. Math. Tokyo, 1, 1-77 (1949).
[16] P. Lévy, Théorie de l'addition des variables aléatoires, Gauthier-Villars (1937).
[17] M. Loève, On sets of probability laws and their limit elements, Univ. Calif. Publ. Statist. 1, 53-87 (1950).
[18] M. Loève, Fundamental limit theorems of probability theory, Ann. Math. Stat. 21, 321-338 (1950).
[19] N. A. Sapogov, On a multidimensional limit theorem of the theory of probability (in Russian), Uspehi Matem. Nauk (N.S.), vol. 5, no. 3 (1950).
[20] Kinsaku Takano, On the many-dimensional distribution functions, Ann. Inst. Statist. Math. Tokyo, 5, 41-58 (1953).