Probab. Theory Relat. Fields 116, 457–484 (2000)
© Springer-Verlag 2000
Shigeo Kusuoka · Nakahiro Yoshida
Malliavin calculus, geometric mixing, and expansion of diffusion functionals

Received: 7 September 1997 / Revised version: 17 March 1999

Abstract. Under a geometric mixing condition, we present an asymptotic expansion of the distribution of an additive functional of a Markov or an ε-Markov process with finite autoregression, including Markov type semimartingales and time series models with discrete time parameter. The emphasis is put on the use of the Malliavin calculus in place of the conditional type Cramér condition, whose verification is in most cases not easy for continuous time processes without such an infinite-dimensional approach. In the second part, by means of the perturbation method and the operational calculus, we prove the geometric mixing property for non-symmetric diffusion processes, and present a sufficient condition which is easily checked in practice. Accordingly, we obtain an asymptotic expansion of diffusion functionals and prove its validity under mild conditions, e.g., without the strong contractivity condition.
S. Kusuoka, N. Yoshida: Graduate School of Mathematical Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8914, Japan.
Mathematics Subject Classification (1991): 60H07, 60F05, 60J25, 62E20
Key words and phrases: Malliavin calculus – mixing – asymptotic expansion – ε-Markov process – diffusion – semimartingale

1. Introduction

In asymptotic statistical theory, after studies of the first-order asymptotics, the asymptotic expansion is a promising tool to investigate the higher-order performance of statistics used for statistical inference, and thorough investigations have been made mainly for independent cases; see e.g. the monograph by Ghosh [6]. As for dependent data, the work of Götze and Hipp [7] was a breakthrough: they gave an asymptotic expansion of the distribution of an additive functional of a discrete-time process under the geometric mixing condition and a conditional type of Cramér condition. To execute their program, checking the conditional type of Cramér condition is not a simple matter, and they subsequently presented, in [8], sufficient conditions for time series models. The reason for the difficulty is that it is nothing but the problem of regularity of the distribution of a random variable, and it is in many cases a difficult problem, unlike in independent observation cases, to prove the regularity of the distribution from a structural definition of the random variable, e.g., a solution of a stochastic difference/differential equation.

Here we are aiming at expansions for stochastic processes with continuous-time parameter such as semimartingales. In this case, the regularity part inevitably requires an infinite-dimensional argument, and indeed it becomes possible if we make use of the
Malliavin calculus over the Wiener space or, if necessary, that over a more abstract space including the Wiener-Poisson space to treat jump processes.

In the first part, as an underlying process with geometric mixing condition, we will deal with a somewhat abstract ε-Markov process driven by another process with independent increments, and present an asymptotic expansion of the distribution of an additive functional defined with those processes; we indeed encounter such functionals in most statistical applications. By considering an ε-Markov process, more general than a Markov process, it is possible to treat time series models with discrete time parameter together in a unified framework because the adopted general Malliavin calculus by Bichteler, Gravereaux and Jacod [2] is available even for such processes. The conditional type Cramér condition is replaced by the nondegeneracy condition of the Malliavin covariance of the functional, and it can be verified for practical models appearing in applications; indeed, we can apply the Hörmander condition to diffusion models, and a set of conditions in Bichteler et al. [2] to a stochastic differential equation with jumps.

In order to apply the first part, it is necessary to verify the naive geometric mixing condition. In the second part, we will confine our attention to diffusion processes, and present a sufficient condition which is easily checked by looking at the coefficient vector fields of the stochastic differential equation. It has been known that the geometric mixing condition holds for certain symmetric diffusions, cf. Stroock [14], Doukhan [4], Roberts and Tweedie [11]. If one considers a symmetric diffusion process, then by using properties of a compact self-adjoint operator, the mixing condition is obtained because of the existence of the spectral gap.
For nonsymmetric diffusions, we cannot follow this route, but by using the perturbation method and the operational calculus, we can still prove the geometric mixing property. The reader will observe that the Malliavin calculus (or hypoellipticity argument) also works implicitly at the fundamental level of our discussion in the second part.

Finally, combining the first and second parts, with the help of a result at hand on the nondegeneracy of the Malliavin covariance of the diffusion process, we will provide a sufficient condition which is easy to verify, since it replaces the original two technically difficult conditions, i.e., the geometric mixing condition and the conditional type Cramér condition, by an easily checked condition written with a dual generator, and a nondegeneracy condition of the Lie algebra of vector fields.

The organization of the present article is as follows. In Section 2, we will give the definition of the ε-Markov model and examples. Section 3 presents fundamentals of the Malliavin calculus for jump processes. In Section 4, under the geometric mixing condition, we will present asymptotic expansions for functionals of ε-Markov processes in two cases with different Malliavin operators. Also as examples, we discuss applications to an ARMA(p,q) process and a semimartingale satisfying a stochastic differential equation with jumps. In Section 5, we will confine our attention to diffusion processes, and provide a result on the geometric mixing property under a set of mild, easily verifiable conditions. It should be noted that we treat there general non-symmetric diffusion processes, and certain functional analytic techniques are used in the proof. After that, we will present the asymptotic
expansion of diffusion functionals by combining the results there and in Section 4. In Section 6, the expansion for a functional having a stochastic expansion will be presented. Most statistics have such a stochastic expansion; thus it provides us with a basis of higher-order statistical inference for stochastic processes. Finally in Section 7, we will give proofs of our results.

2. ε-Markov model

In order to treat generalized Markov chains with discrete time parameter and Markov processes with continuous time parameter in a unified way, we will consider the following ε-Markov process. Given a probability space $(\Omega, \mathcal{F}, P)$, let $Y = (Y_t)_{t \in \mathbb{R}_+} : \Omega \times \mathbb{R}_+ \to \mathbb{R}^{d_2}$ denote a cadlag process (or a separable process), and $X = (X_t)_{t \in \mathbb{R}_+}$ a $d_1$-dimensional cadlag process with independent increments, i.e., $\mathcal{B}^{X,Y}_{[0,r]}$ is independent of $\mathcal{B}^{dX}_{[r,\infty)}$ for $r \in \mathbb{R}_+$, where
\[ \mathcal{B}^{X,Y}_{[0,r]} = \sigma[X_u, Y_u : u \in [0, r]] \vee \mathcal{N} \]
and $\mathcal{B}^{dX}_I = \sigma[X_t - X_s : s, t \in I \cap \mathbb{R}_+] \vee \mathcal{N}$, $I \subset \mathbb{R}$, $\mathcal{N}$ being the σ-field generated by null sets. Define sub σ-fields $\mathcal{B}^Y_I$, $\mathcal{B}_I$ of $\mathcal{F}$ by $\mathcal{B}^Y_I = \sigma[Y_t : t \in I \cap \mathbb{R}_+] \vee \mathcal{N}$ and by $\mathcal{B}_I = \sigma[X_t - X_s, Y_t : s, t \in I \cap \mathbb{R}_+] \vee \mathcal{N}$. Assume that, for some fixed $\epsilon \ge 0$, the process $Y$ is an ε-Markov process driven by $X$; more precisely, we assume that for an $\epsilon \in \mathbb{R}_+$,
\[ Y_t \in F\bigl(\mathcal{B}^Y_{[s-\epsilon,s]} \vee \mathcal{B}^{dX}_{[s,t]}\bigr) \]
for $\epsilon \le s \le t$. Clearly, $\mathcal{B}^Y_{[s-\epsilon,t]} \subset \mathcal{B}^Y_{[s-\epsilon,s]} \vee \mathcal{B}^{dX}_{[s,t]}$.

In this paper, we are interested in the asymptotic expansion of the distribution of the normalized additive functional $T^{-1/2} Z_T$, where $Z = (Z_t)_{t \in \mathbb{R}_+}$ is an $\mathbb{R}^d$-valued process satisfying $Z_0 \in F\mathcal{B}_{[0]}$ and
\[ Z^s_t := Z_t - Z_s \in F\mathcal{B}_{[s,t]} \]
for every $s, t \in \mathbb{R}_+$, $0 \le s \le t$. For a sub σ-field $\mathcal{G}$ of $\mathcal{F}$, $B\mathcal{G}$ denotes the set of all bounded $\mathcal{G}$-measurable functions. In order to derive asymptotic expansions, we will consider the situation where the following two conditions hold true:

[A1] There exists a positive constant $a$ such that
\[ \bigl\| P_{\mathcal{B}^Y_{[s-\epsilon,s]}}[f] - P[f] \bigr\|_{L^1(P)} \le a^{-1} e^{-a(t-s)} \|f\|_\infty \]
for any $s, t \in \mathbb{R}_+$, $s \le t$, and for any $f \in B\mathcal{B}^Y_{[t,\infty)}$.

[A2] For any $\Delta > 0$, $\sup_{t \in \mathbb{R}_+, 0 \le h \le \Delta} \| Z^t_{t+h} \|_{L^p(P)} < \infty$ for any $p > 1$, and $P[Z^t_{t+\Delta}] = 0$. Moreover, $Z_0 \in \cap_{p>1} L^p(P)$ and $P[Z_0] = 0$.
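As a toy illustration of a geometric bound of the type required in [A1] (this example is ours, not part of the paper), consider a two-state Markov chain with transition matrix $P = \begin{pmatrix} 1-p & p \\ q & 1-q \end{pmatrix}$. Its $n$-step transition probabilities approach the invariant law $\pi = (q, p)/(p+q)$ at the exact geometric rate $|1-p-q|^n$. The following Python sketch checks this numerically; the function names and the values of `p` and `q` are hypothetical choices.

```python
def mat_mul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def n_step_deviation(p, q, n):
    """Return max_{i,j} |P^n(i,j) - pi_j| for the two-state chain with
    P = [[1-p, p], [q, 1-q]] and invariant law pi = (q, p) / (p + q)."""
    P = [[1.0 - p, p], [q, 1.0 - q]]
    Pn = [[1.0, 0.0], [0.0, 1.0]]          # P^0 = identity
    for _ in range(n):
        Pn = mat_mul(Pn, P)
    pi = [q / (p + q), p / (p + q)]
    return max(abs(Pn[i][j] - pi[j]) for i in range(2) for j in range(2))


def geometric_bound(p, q, n):
    """Closed form |1-p-q|^n * max(p, q) / (p + q) of the same deviation."""
    return abs(1.0 - p - q) ** n * max(p, q) / (p + q)


if __name__ == "__main__":
    p, q = 0.3, 0.2                         # hypothetical transition probabilities
    for n in (1, 5, 10, 20):
        print(n, n_step_deviation(p, q, n), geometric_bound(p, q, n))
```

Here the decay rate plays the role of $e^{-a}$ in [A1]; for general state spaces no such closed form is available, which is why Section 5 is needed for diffusions.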
Example 1. Let $\{Y_n\}_{n \in \mathbb{Z}_+}$ be an $m$-Markov chain (non-linear time series model) taking values in $\mathbb{R}^{d_2}$ satisfying the stochastic equation
\[ Y_n = S_n(Y_{n-1}, \ldots, Y_{n-m}, \xi_n), \quad n \ge m, \tag{1} \]
where $\{\xi_n\}_{n \ge m}$ is an independent sequence taking values in $\mathbb{R}^{d_1}$ and independent of $\{Y_n\}_{n=0}^{m-1}$. Let $Z_n = \sum_{j=1}^n f_j(Y_j, \xi_j)$ and $X_n = \sum_{j=1}^n \xi_j$. Clearly, it is possible to embed the process $\{X_n, Y_n, Z_n\}_{n \in \mathbb{Z}_+}$ into a process $\{X_t, Y_t, Z_t\}_{t \in \mathbb{R}_+}$ with continuous time parameter as $X_t = X_{[t]}$, $Y_t = Y_{[t]}$ and $Z_t = Z_{[t]}$. Then $Y$ is an $(m-1)$-Markov process driven by the process $X$ with independent increments.
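The construction of Example 1 can be sketched in Python as follows. This is only an illustration under our own assumptions: the recursion `S`, the summand `f`, the order `M = 2`, the Gaussian noise and the convention that the partial sums start at $j = m$ (since $\xi_j$ is only given for $j \ge m$) are all hypothetical choices, not taken from the paper.

```python
import math
import random

M = 2  # order m of the chain (hypothetical choice)


def S(past, xi):
    """Hypothetical non-linear autoregression; past[0] = Y_{n-1}, past[1] = Y_{n-2}."""
    return 0.5 * math.tanh(past[0]) - 0.3 * past[1] + xi


def f(y, xi):
    """Hypothetical summand f_j(Y_j, xi_j), taken independent of j."""
    return y * xi


def simulate(n_max, seed=0):
    """Return lists X, Y, Z indexed by n = 0, ..., n_max."""
    rng = random.Random(seed)
    Y = [0.0] * M            # initial values Y_0, ..., Y_{m-1}
    X = [0.0] * M            # partial sums of the noise, started at j = m
    Z = [0.0] * M            # additive functional, started at j = m
    for n in range(M, n_max + 1):
        xi = rng.gauss(0.0, 1.0)
        past = [Y[n - k] for k in range(1, M + 1)]
        y = S(past, xi)
        Y.append(y)
        X.append(X[-1] + xi)
        Z.append(Z[-1] + f(y, xi))
    return X, Y, Z


def embed(path, t):
    """Continuous-time embedding path_t = path_[t] ([t] = integer part of t >= 0)."""
    return path[int(math.floor(t))]


if __name__ == "__main__":
    X, Y, Z = simulate(10)
    print(embed(Y, 3.7) == Y[3], embed(Z, 10.0) == Z[10])
```

The embedded paths are piecewise constant, so $Y_t$ depends on the past only through $Y$ on a window of length $m-1$ and the increments of $X$, which is the ε-Markov property with $\epsilon = m - 1$.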
Example 2. Let us consider a stochastic process $\{Y_t, Z_t\}_{t \in \mathbb{R}_+}$ defined as a strong solution of the following stochastic integral equation with jumps:
\[
\begin{aligned}
Y_t &= Y_0 + A(Y_-) * t + B(Y_-) * w_t + C(Y_-) * \tilde{\mu}_t, \\
Z_t &= Z_0 + A'(Y_-) * t + B'(Y_-) * w_t + C'(Y_-) * \tilde{\mu}_t,
\end{aligned}
\tag{2}
\]
where $Z_0$ is $\sigma[Y_0]$-measurable, $A \in C^\infty(\mathbb{R}^{d_2}; \mathbb{R}^{d_2})$, $B \in C^\infty(\mathbb{R}^{d_2}; \mathbb{R}^{d_2} \otimes \mathbb{R}^m)$, $C \in C^\infty(\mathbb{R}^{d_2} \times E; \mathbb{R}^{d_2})$, and similarly, $A' \in C^\infty(\mathbb{R}^{d_2}; \mathbb{R}^{d})$, $B' \in C^\infty(\mathbb{R}^{d_2}; \mathbb{R}^{d} \otimes \mathbb{R}^m)$, $C' \in C^\infty(\mathbb{R}^{d_2} \times E; \mathbb{R}^{d})$, where $w$ is an $m$-dimensional Wiener process, $E$ is an open set in $\mathbb{R}^b$, and $\tilde{\mu}$ is a compensated Poisson random measure on $\mathbb{R}_+ \times E$ with intensity $dt \otimes \lambda(dx)$, $\lambda$ being the Lebesgue measure on $E$. Under usual regularity conditions, $(Y_t, Z_t)$ can be regarded as smooth functionals over the canonical space $\Omega = \{(y_0, w, \mu)\}$, where $\mu$ denotes the integer-valued random measure on $\mathbb{R}_+ \times E$. For details, see III.6 and IV.10 of Bichteler et al. [2]. Denote by $\mathcal{F}$ the σ-field generated by the canonical maps on $\Omega$. The process $X_t$ may in this case be taken as $X_t = (w_t, \mu_t(g_i); i \in \mathbb{N})$, where $(g_i)$ is a countable measure determining family over $E$; see Remark 1. In this case, $Y$ is a Markov process, i.e., $\epsilon = 0$, driven by $X$ with independent increments.

3. Malliavin calculus

To ensure the regularity of distributions, we will use the nondegeneracy of the Malliavin covariance in place of the conditional type Cramér condition. We adopt here the formulation of the Malliavin calculus by Bichteler et al. [2] in view of semimartingales with jumps. Let $(\Omega, \mathcal{B}, \Pi)$ be a probability space. A linear operator $L$ on $D(L) \subset \cap_{p>1} L^p(\Pi)$ into $\cap_{p>1} L^p(\Pi)$ is called a Malliavin operator if the following conditions are satisfied:

(1) $\mathcal{B}$ is generated by $D(L)$.
(2) For $f \in C^2_\uparrow(\mathbb{R}^n)$, $n \in \mathbb{N}$, and $F \in D(L)^n$, $f \circ F \in D(L)$.
(3) For any $F, G \in D(L)$, $E^\Pi[F LG] = E^\Pi[G LF]$.
(4) For $F \in D(L)$, $L(F^2) \ge 2F LF$. In other words, the bilinear operator $\Gamma$ on $D(L) \times D(L)$ associated with $L$ by $\Gamma(F, G) = L(FG) - F LG - G LF$ is nonnegative definite.
(5) For $F = (F^1, \ldots, F^n) \in D(L)^n$, $n \in \mathbb{N}$, and $f \in C^2_\uparrow(\mathbb{R}^n)$,
\[ L(f \circ F) = \sum_{i=1}^n \partial_i f \circ F \, LF^i + \frac{1}{2} \sum_{i,j=1}^n \partial_i \partial_j f \circ F \, \Gamma(F^i, F^j). \]
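To make conditions (3) and (4) concrete, the sketch below checks them for a standard example that we choose ourselves: the one-dimensional Ornstein-Uhlenbeck operator $Lp = \frac{1}{2}p'' - \frac{1}{2}x p'$ acting on polynomials under the standard Gaussian measure. For this operator one expects $\Gamma(p, q) = p'q'$. Polynomials are stored as coefficient lists in exact rational arithmetic; all helper names are hypothetical.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (keep at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def deriv(p):
    """Derivative of the polynomial sum_k p[k] x^k."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def mul(p, q):
    """Product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def add(p, q, sign=1):
    """Return p + sign * q."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [a + sign * b for a, b in zip(p, q)]


def L(p):
    """Ornstein-Uhlenbeck operator: L p = p''/2 - x p'/2."""
    half = Fraction(1, 2)
    d1, d2 = deriv(p), deriv(deriv(p))
    x_d1 = [Fraction(0)] + d1               # multiplication by x
    return add([half * c for c in d2], [half * c for c in x_d1], -1)


def gamma(p, q):
    """Gamma(p, q) = L(pq) - p Lq - q Lp."""
    return add(add(L(mul(p, q)), mul(p, L(q)), -1), mul(q, L(p)), -1)


def gauss_mean(p):
    """E[p(G)] for G ~ N(0, 1), using E[G^k] = (k-1)!! for even k."""
    total, moment = Fraction(0), Fraction(1)
    for k, c in enumerate(p):
        if k % 2 == 0:
            if k > 0:
                moment *= (k - 1)
            total += c * moment
    return total


if __name__ == "__main__":
    p = [Fraction(0), Fraction(0), Fraction(1)]      # x^2
    q = [Fraction(0)] * 4 + [Fraction(1)]            # x^4
    print(trim(gamma(p, q)), trim(mul(deriv(p), deriv(q))))
    print(gauss_mean(mul(p, L(q))), gauss_mean(mul(q, L(p))))
```

In this example $\Gamma(p, p) = (p')^2 \ge 0$, which is condition (4), and the two printed expectations coincide, which is the symmetry (3).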
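Fix a Malliavin operator $(L, D(L))$. For $p \ge 2$, define $\|F\|_{D_{2,p}}$ by
\[ \|F\|_{D_{2,p}} = \|F\|_p + \|LF\|_p + \|\Gamma^{1/2}(F, F)\|_p. \]
Let $D_{2,p}$ denote the completion of $D(L)$ with respect to $\|\cdot\|_{D_{2,p}}$. Then $(D_{2,p}, \|\cdot\|_{D_{2,p}})$ is a Banach space, and there are inclusions:
\[
\begin{array}{ccc}
D_{2,p} & \subset & L^p \\
\cup & & \cup \\
D_{2,q} & \subset & L^q
\end{array}
\]
for $2 \le p \le q$. The existence of a Malliavin operator leads us to the existence of an integration-by-parts setting (IBPS). Let $D_{2,\infty-} = \cap_{p \ge 2} D_{2,p}$. Then by Theorem 8-18 of [2] p. 107, we have the following IBP formula (with truncation).

Proposition 1. (1) $L$ is extended uniquely to an operator (say $L$) on $D_{2,\infty-}$, and the operator $(L, D_{2,\infty-})$ is a Malliavin operator. In particular, $D_{2,\infty-}$ is an algebra.

(2) There exists an IBPS: for $f \in C^2_\uparrow(\mathbb{R}^d)$, $F \in D_{2,\infty-}(\mathbb{R}^d) \equiv (D_{2,\infty-})^d$ and $\psi \in D_{2,\infty-}$,
\[ E^\Pi\Bigl[ \sum_{i=1}^d \partial_i f(F) \sigma_F^{i,j} \psi \Bigr] = E^\Pi\bigl[ f(F) T_F^j(\psi) \bigr] \]
for $j = 1, \ldots, d$, where
\[ \sigma_F^{i,j} = \Gamma(F^i, F^j) \]
and
\[ T_F^j(\psi) = -2\psi LF^j - \Gamma(\psi, F^j). \]

(3) Let $\Delta \equiv \Delta_F = \det \sigma_F$, $\sigma_F = (\sigma_F^{i,j})_{i,j=1}^d$. $\sigma_{[i,i']}$ denotes the $(i, i')$-cofactor of $\sigma_F$. Suppose that $F \in D_{2,\infty-}(\mathbb{R}^d)$ and that $\Delta \cdot \Delta^{-1}\psi = \psi$ a.s., i.e., $\Delta = 0 \Rightarrow \Delta^{-1}\psi = 0$ a.s.: this implicitly means that $\psi = 0$ a.s. on $\{\Delta = 0\}$ since $\Delta^{-1} = \infty$ on it. If $\sigma_F^{i,j} \in D_{2,\infty-}$ and $\Delta^{-1}\psi \in D_{2,\infty-}$, then for $f \in C^2_\uparrow(\mathbb{R}^d)$,
\[ E^\Pi[\partial_i f(F)\psi] = E^\Pi\bigl[ f(F) J_i^F \psi \bigr], \]
where the operator $J_i^F : \{\psi : \Omega \to \bar{\mathbb{R}}$ such that $\Delta^{-1}\psi \in D_{2,\infty-}\} \to \cap_{p>1} L^p(\Pi)$ is defined by
\[
\begin{aligned}
J_i^F \psi &= \sum_{i'=1}^d T_F^{i'}\bigl(\Delta^{-1}\psi \sigma_{[i,i']}\bigr) \\
&= -\sum_{i'=1}^d \Bigl\{ 2\Delta^{-1}\psi \sigma_{[i,i']} LF^{i'} + \Gamma\bigl(\Delta^{-1}\psi \sigma_{[i,i']}, F^{i'}\bigr) \Bigr\}.
\end{aligned}
\]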
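The integration-by-parts formula of Proposition 1 can be checked numerically in the simplest setting. The sketch below assumes, as our own illustrative choice, the one-dimensional Ornstein-Uhlenbeck operator $L = \frac{1}{2}\frac{d^2}{dx^2} - \frac{1}{2}x\frac{d}{dx}$ on the standard Gaussian space with $F(x) = x$. Then $\sigma_F = \Gamma(F, F) = 1$, $LF = -x/2$, and $J^F\psi = T_F(\psi) = x\psi - \psi'$, so the formula reads $E[f'(G)\psi(G)] = E[f(G)(G\psi(G) - \psi'(G))]$ for $G \sim N(0,1)$. The test functions `f = sin` and `psi` are hypothetical.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def psi(x):
    """Hypothetical smooth weight."""
    return 1.0 / (1.0 + x * x)


def dpsi(x):
    """Derivative of psi."""
    return -2.0 * x / (1.0 + x * x) ** 2


def gauss_expect(g, lo=-12.0, hi=12.0, n=24000):
    """Trapezoidal approximation of E[g(G)] for G ~ N(0, 1)."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) * phi(lo) + g(hi) * phi(hi))
    for k in range(1, n):
        x = lo + k * h
        total += g(x) * phi(x)
    return total * h


def ibp_lhs():
    """E[f'(G) psi(G)] with f = sin."""
    return gauss_expect(lambda x: math.cos(x) * psi(x))


def ibp_rhs():
    """E[f(G) J psi(G)] with J psi = x psi - psi'."""
    return gauss_expect(lambda x: math.sin(x) * (x * psi(x) - dpsi(x)))


if __name__ == "__main__":
    print(ibp_lhs(), ibp_rhs())
```

The two printed values should agree up to quadrature error. In the general setting of the paper, the role of the factor $\Delta^{-1}$ is to invert the Malliavin covariance, which is trivial here because $\sigma_F = 1$.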
For $k \in \mathbb{N}$, define $S'_k[F]$ and $S''_k[\psi; F]$ as follows:

$S'_1[F] := \{\sigma_F^{i,j} : i, j = 1, \ldots, d\}$ if $F \in D_{2,\infty-}(\mathbb{R}^d)$;

$S'_k[F] := \{\sigma_F^{i,j}, LF^i, S'_{k-1}[F], \Gamma(S'_{k-1}[F], F^i) : i, j = 1, \ldots, d\}$ if $F \in D_{2,\infty-}(\mathbb{R}^d)$ and $S'_{k-1}[F] \subset D_{2,\infty-}$;

$S''_1[\psi; F] := \{\Delta^{-1}\psi\}$ if $\Delta = 0$ implies $\Delta^{-1}\psi = 0$;

$S''_k[\psi; F] := \{\Delta^{-1} S''_{k-1}[\psi; F], \Delta^{-1}\Gamma(S''_{k-1}[\psi; F], F^i) : i = 1, \ldots, d\}$ if $S''_{k-1}[\psi; F] \subset D_{2,\infty-}$ and if $\Delta = 0$ implies $\Delta^{-1} S''_{k-1}[\psi; F] \cup \Delta^{-1}\Gamma(S''_{k-1}[\psi; F], F) = \{0\}$.

Put

$S_1[\psi; F] := S'_1[F] \cup S''_1[\psi; F]$ if $F \in D_{2,\infty-}(\mathbb{R}^d)$ and if $\Delta = 0$ implies $\Delta^{-1}\psi = 0$;

$S_k[\psi; F] := S_{k-1}[\psi; F] \cup S'_k[F] \cup S''_k[\psi; F]$ if $F \in D_{2,\infty-}(\mathbb{R}^d)$, $S'_{k-1}[F] \subset D_{2,\infty-}$ and $S''_{k-1}[\psi; F] \subset D_{2,\infty-}$, and if $\Delta = 0$ implies $\Delta^{-1} S''_{k-1}[\psi; F] \cup \Delta^{-1}\Gamma(S''_{k-1}[\psi; F], F) = \{0\}$.

Here we denoted $\Gamma(A, B) = \{\Gamma(a, b) : a \in A, b \in B\}$ for function sets $A$ and $B$, and denoted $\Delta_F$ simply by $\Delta$.
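Proposition 2. Suppose that $F \in D_{2,\infty-}(\mathbb{R}^d)$. If $S_k[\psi; F] \subset D_{2,\infty-}$, then for $f \in C^{k+1}_\uparrow(\mathbb{R}^d)$,
\[ E^\Pi\bigl[ \partial_{i_1} \partial_{i_2} \cdots \partial_{i_k} f(F) \psi \bigr] = E^\Pi\bigl[ f(F) J^F_{i_k} \cdots J^F_{i_2} J^F_{i_1} \psi \bigr]. \]

4. Asymptotic expansion for the functional $Z_T$

Let $\tau$ denote a fixed positive constant satisfying $\tau > \epsilon$. Suppose that for each $T > 0$, $u(j)$ and $v(j)$ are sequences of real numbers such that $\epsilon \le u(1) \le u(1) + \tau \le v(1) \le u(2) \le u(2) + \tau \le v(2) \le \ldots$, and that $\sup_{j,T}\{v(j) - u(j)\} < \infty$. Let $I_j = [u(j) - \epsilon, u(j)]$ and $J_j = [v(j) - \epsilon, v(j)]$. Suppose that for each $T \in \mathbb{R}_+$, $n(T) \in \mathbb{N}$ and that $v(n(T)) \le T$. Let $Z_j = Z^{u(j)}_{v(j)}$ for $j = 1, 2, \ldots, n(T)$ (this is an abuse of the notation "Z": $Z_j$ is not $Z_T$ at $T = j$). The $r$-th cumulant $\chi_{T,r}(u)$ of $T^{-1/2} Z_T$ is defined by
\[ \chi_{T,r}(u) = \Bigl( \frac{d}{d\epsilon} \Bigr)^r_0 \log P\bigl[ \exp(i\epsilon u \cdot T^{-1/2} Z_T) \bigr]. \]
Next, define functions $\tilde{P}_{T,r}(u)$ by the formal Taylor expansion:
\[ \exp\Bigl( \sum_{r=2}^\infty (r!)^{-1} \epsilon^{r-2} \chi_{T,r}(u) \Bigr) = \exp\Bigl( \frac{1}{2} \chi_{T,2}(u) \Bigr) + \sum_{r=1}^\infty \epsilon^r T^{-r/2} \tilde{P}_{T,r}(u). \tag{3} \]
Let $\hat{\Psi}_{T,k}(u)$ be the $k$-th partial sum of the right-hand side of (3) with $\epsilon = 1$:
\[ \hat{\Psi}_{T,k}(u) = \exp\Bigl( \frac{1}{2} \chi_{T,2}(u) \Bigr) + \sum_{r=1}^k T^{-r/2} \tilde{P}_{T,r}(u). \]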
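For orientation, in the classical one-dimensional i.i.d. case (a special case chosen by us, not treated separately in the paper) with unit variance and third cumulant $\kappa_3$, the Fourier inversion of $\hat{\Psi}_{T,1}$ is the first-order Edgeworth density $\phi(x)\bigl(1 + \kappa_3 (x^3 - 3x)/(6\sqrt{T})\bigr)$. The Python sketch below compares it with the exact density of a normalized sum of $T$ independent Exp(1) variables, for which $\kappa_3 = 2$ and the sum is Gamma$(T, 1)$ distributed. The function names and the value $T = 50$ are hypothetical.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def edgeworth_pdf(x, T, kappa3):
    """First-order Edgeworth density phi(x) * (1 + kappa3 * He3(x) / (6 sqrt(T)))."""
    he3 = x ** 3 - 3.0 * x
    return normal_pdf(x) * (1.0 + kappa3 * he3 / (6.0 * math.sqrt(T)))


def exact_pdf(x, T):
    """Exact density of (S_T - T) / sqrt(T), where S_T ~ Gamma(T, 1) is a sum
    of T i.i.d. Exp(1) variables."""
    s = T + x * math.sqrt(T)
    if s <= 0.0:
        return 0.0
    log_g = (T - 1.0) * math.log(s) - s - math.lgamma(T)
    return math.sqrt(T) * math.exp(log_g)


if __name__ == "__main__":
    T, kappa3 = 50, 2.0
    for x in (-1.0, 1.0, 2.0):
        exact = exact_pdf(x, T)
        print(x, abs(normal_pdf(x) - exact), abs(edgeworth_pdf(x, T, kappa3) - exact))
```

At these points the Edgeworth error is visibly smaller than the error of the plain normal approximation, which is the kind of gain that the expansions $\Psi_{T,k}$ are meant to deliver for dependent data.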
Finally, for $T > 0$ and $k \in \mathbb{N}$, a signed measure $\Psi_{T,k}$ is defined as the Fourier inversion of $\hat{\Psi}_{T,k}(u)$. In the sequel, we will assume that the second cumulant $\chi_{T,2}(u)$ converges to a negative definite quadratic form $-u'\Sigma u$ as $T \to \infty$. Fix a symmetric matrix $\Sigma^o$ satisfying $\Sigma < \Sigma^o$.

Theorem 1 below is rather for processes with finite range dependency than for ε-Markov processes; Theorem 2 is suitable for them. However, the method used in the proof of Theorem 2 is essentially the same as that of Theorem 1, which is rather simpler than Theorem 2. Another connection is explained in Remark 4 after the proof of Theorem 2 in Section 7. Let
\[ F = f(X_{u_k} - X_{u_{k-1}}, Y_{u_k}; X_{v_l} - X_{v_{l-1}}, Y_{v_l}; 1 \le k \le m, 1 \le l \le n), \tag{4} \]
where $u(j) - \epsilon \le u_0 \le \cdots \le u_m \le u(j)$, $v(j) - \epsilon \le v_0 \le \cdots \le v_n \le v(j)$, $m, n \in \mathbb{N}$, and $f \in C^\infty_B(\mathbb{R}^{(m+n)(d_1+d_2)} \to \mathbb{R})$. Let $(L_j)_{j=1,2,\ldots,n(T)}$ be a family of Malliavin operators, each $L_j$ being defined over $(\Omega, \mathcal{B}_{[u(j)-\epsilon, v(j)]}, P)$, and suppose that for every $j = 1, 2, \ldots, n(T)$, $X^{(i)}_t - X^{(i)}_{u(j)-\epsilon}, Y^{(i)}_t \in D(L_j)$ for $t \in [u(j) - \epsilon, v(j)]$, hence $F \in D(L_j)$, and suppose that $L_j F = 0$. The measurable function $\psi_j : (\Omega, \mathcal{B}_{[u(j)-\epsilon, v(j)]}) \to ([0, 1], \mathcal{B}([0, 1]))$ denotes a truncation functional. Put
\[ S_{1,j} = \bigl\{ \Delta^{-1}_{Z_j}\psi_j, \ \sigma^{kl}_{Z_j}, \ L_j Z_{j,k}, \ \Gamma_{L_j}(\sigma^{kl}_{Z_j}, Z_{j,m}), \ \Gamma_{L_j}(\Delta^{-1}_{Z_j}\psi_j, Z_{j,l}) \bigr\} \]
corresponding to $L_j$. Let $\mathcal{E}(M, \gamma) = \{f : \mathbb{R}^d \to \mathbb{R}$, measurable, $|f(x)| \le M(1 + |x|)^\gamma \ (x \in \mathbb{R}^d)\}$. $\phi(x; \mu, \Sigma)$ is the density function of the normal distribution with mean $\mu$ and covariance matrix $\Sigma$. The sequences $\{u(j), v(j)\}$, $\{L_j\}$ and $\{\psi_j\}$ may depend on $T$. We will assume

[A3] (i) $\inf_{j,T} P[\psi_j] > 0$;
(ii) $\liminf_{T \to \infty} n(T)/T > 0$;
(iii) $Z_j \in (D^{L_j}_{2,\infty-})^d$, $S_1[\psi_j; Z_j] \subset D^{L_j}_{2,\infty-}$, and $\cup_{j=1,\ldots,n(T),\, T>0} S_{1,j}$ is bounded in $L^p(P)$ for any $p > 1$.

Theorem 1. Let $k \in \mathbb{N}$, and let $M, \gamma, K > 0$. Suppose that Conditions [A1], [A2] and [A3] are satisfied. Then there exist constants $\delta > 0$ and $c > 0$ such that for $f \in \mathcal{E}(M, \gamma)$,
\[ \Bigl| P\Bigl[ f\Bigl( \frac{Z_T}{\sqrt{T}} \Bigr) \Bigr] - \Psi_{T,k}[f] \Bigr| \le c\,\omega(f, T^{-K}) + \epsilon^{(k)}_T, \]
where
\[ \omega(f, r) = \int_{\mathbb{R}^d} \sup\{ |f(x + y) - f(x)| : |y| \le r \} \phi(x; 0, \Sigma^o) \, dx \]
and $\epsilon^{(k)}_T = o(T^{-(k+\delta)/2})$ uniformly in $\mathcal{E}(M, \gamma)$.
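To see what the term $\omega(f, T^{-K})$ contributes, consider the one-dimensional indicator $f = 1_{(-\infty, a]}$ with $\Sigma^o = \sigma^2$ (an example of ours). The supremum in the definition of $\omega$ equals 1 exactly for $x \in (a - r, a + r]$ and 0 elsewhere, so $\omega(f, r) = \Phi((a+r)/\sigma) - \Phi((a-r)/\sigma) \approx 2r\,\phi(a; 0, \sigma^2)$, which is of order $r$. A minimal Python sketch, with hypothetical function names:

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def omega_indicator(a, r, sigma=1.0):
    """omega(f, r) for f = 1_{(-inf, a]} in dimension one with Sigma^o = sigma^2.

    The supremum of |f(x + y) - f(x)| over |y| <= r is 1 on (a - r, a + r]
    and 0 elsewhere, so omega is the N(0, sigma^2) mass of that interval."""
    return std_normal_cdf((a + r) / sigma) - std_normal_cdf((a - r) / sigma)


if __name__ == "__main__":
    for r in (1e-1, 1e-2, 1e-3):
        print(r, omega_indicator(0.0, r), 2.0 * r / math.sqrt(2.0 * math.pi))
```

For such indicators the bound in Theorem 1 therefore gives an error of order $T^{-K} + o(T^{-(k+\delta)/2})$ for distribution functions.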
4.1. Process with finite autoregression

Suppose that a sequence $\{u(j), v(j)\}$ is given as before. The process considered here is a process with finite autoregression; more precisely, we assume that for each interval $J_j = [v(j) - \epsilon, v(j)]$, there exists a finite number of functionals $\mathcal{Y}_j = \{\mathcal{Y}_{j,k}\}_{k=1,\ldots,M_j}$ such that $\sigma[\mathcal{Y}_j] =: \mathcal{B}'_{J_j} \subset \mathcal{B}_{J_j}$ and
\[ P_{\mathcal{B}_{[0,v(j)]}} = P_{\mathcal{B}'_{J_j}} \]
on $B\mathcal{B}_{[v(j),\infty)}$. For each $j$, let $(L_j, D(L_j))$ denote a Malliavin operator over $(\Omega, \mathcal{B}_{[u(j)-\epsilon, v(j)]}, P)$. Here we do not assume that $L_j F$ vanishes for functionals $F$ of the form of (4); instead, we will assume that for any $f \in C^\infty_B(\mathbb{R}^{(d_1+d_2)m})$ and any $u_0, u_1, \ldots, u_m$ satisfying $u(j) - \epsilon \le u_0 \le u_1 \le \cdots \le u_m \le u(j)$, the functional $F = f(X_{u_k} - X_{u_{k-1}}, Y_{u_k} : 1 \le k \le m) \in D^{L_j}_{2,\infty-}$ and $L_j F = 0$. Let $\sigma_{\mathcal{Z}_j}$ be the Malliavin covariance matrix of $\mathcal{Z}_j = (Z_j, \mathcal{Y}_j)$, and suppose that $\mathcal{Z}_{j,l}, \mathcal{Y}_{j,k}, \sigma^{pq}_{\mathcal{Z}_j} \in D^{L_j}_{2,\infty-}$, where $\mathcal{Z}_j = (\mathcal{Z}_{j,l})$. Suppose $\sup_{j,T} M_j < \infty$. $\psi_j$ denotes a truncation functional defined on $(\Omega, \mathcal{B}_{[u(j)-\epsilon, v(j)]}, P)$. As before, let
\[ S_{1,j} = \bigl\{ \Delta^{-1}_{\mathcal{Z}_j}\psi_j, \ \sigma^{kl}_{\mathcal{Z}_j}, \ L_j \mathcal{Z}_{j,k}, \ \Gamma_{L_j}(\sigma^{kl}_{\mathcal{Z}_j}, \mathcal{Z}_{j,m}), \ \Gamma_{L_j}(\Delta^{-1}_{\mathcal{Z}_j}\psi_j, \mathcal{Z}_{j,l}) \bigr\} \]
for operator $L_j$.

[A3′] (i) $\inf_{j,T} P[\psi_j] > 0$;
(ii) $\liminf_{T \to \infty} n(T)/T > 0$;
(iii) $\mathcal{Z}_j \in (D^{L_j}_{2,\infty-})^{d+M_j}$, $S_1[\psi_j; \mathcal{Z}_j] \subset D^{L_j}_{2,\infty-}$, and $\cup_{j=1,\ldots,n(T),\, T>0} S_{1,j}$ is bounded in $L^p(P)$ for any $p > 1$.

Theorem 2. Let $k \in \mathbb{N}$. Suppose that Conditions [A1], [A2] and [A3′] are satisfied. Then the same inequality as in Theorem 1 holds true.

Remark 1. We may take $d_1 = \infty$ if necessary. The proofs do not change except for minor modifications even in this case; thus we can treat Poisson random measures as the input process $X$. Models in Examples 1 and 2 satisfy the finite autoregression condition.

Example 1′. (Continuation of Example 1) Assume that the driving process $\xi_t = \tilde{X}_t$ is an $\mathbb{R}^{d_1}$-valued i.i.d. sequence with smooth density $w$ and $Y_t$ is defined by (1) with $m = 1$. Taking $\Omega = \{(y_0, (x_i)_{i \in \mathbb{N}}); y_0 \in \mathbb{R}^{d_2}, x_i \in \mathbb{R}^{d_1}\}$ and $\mathcal{B}_{[j-1,j]} = \sigma[Y_{j-1}, \tilde{X}_j] \ (= \sigma[Y_{j-1}, Y_j, \tilde{X}_j])$, the $j$-th Malliavin operator $L_j$ over $(\Omega, \mathcal{B}_{[j-1,j]}, P)$ is defined by $D(L_j) = \{f = f(Y_{j-1}(y_0, x_1, \ldots, x_{j-1}), x_j); f \in C^2_\uparrow(\mathbb{R}^{d_1+d_2})\}$ and
\[ L_j f = \frac{1}{2} \rho(x_j) \Delta_{x_j} f + \frac{1}{2} w(x_j)^{-1} \nabla_{x_j}(\rho w) \cdot \nabla_{x_j} f \]
for $f \in D(L_j)$. $\rho$ is an auxiliary smooth positive function. We may use consecutive noises for $x_j$, if necessary.

As an example, let us consider an ARMA(p,q) process $\{\tilde{Y}_t\}$ which is defined by the equation
\[ \phi(B)\tilde{Y}_t = \theta(B)\tilde{X}_t, \quad t \in \mathbb{Z}_+, \]
where $\phi$ and $\theta$ are polynomials: $\phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p$ and $\theta(z) = 1 + \theta_1 z + \cdots + \theta_q z^q$, and $B$ is the backward shift operator: $B\tilde{Y}_t = \tilde{Y}_{t-1}$. It is known that $\tilde{Y}_t$ has a state-space representation as follows (cf. Brockwell and Davis [3], Chapter 12). Let $r = \max\{p, q + 1\}$ and $Y_t = (y_{t-r+1}, y_{t-r+2}, \ldots, y_t)'$, and define $Y_t$ so that $Y_t$ satisfies
\[ Y_t = \begin{pmatrix} 0 & I_{r-1} \\ \phi_r & \phi_{r-1} \ \cdots \ \phi_1 \end{pmatrix} Y_{t-1} + \begin{pmatrix} 0_{r-1} \\ 1 \end{pmatrix} \tilde{X}_t, \]
and
\[ \tilde{Y}_t = \begin{pmatrix} \theta_{r-1} & \theta_{r-2} & \cdots & \theta_0 \end{pmatrix} Y_t, \]
where $\phi_j = 0$ for $j > p$, $\theta_0 = 1$ and $\theta_j = 0$ for $j > q$. The driving process $X_t$ may in this case be taken as $X_t = \sum_{j=1}^{[t]} \tilde{X}_j$. A typical form of $Z_t$ in statistical applications is $Z^{t-1}_t = f_t(Y_t, \tilde{X}_t)$, which is within the present scope. In this example, $\tilde{Y}$ itself is not ε-Markov but its functional can be dealt with in our context.

Example 2′. (Continuation of Example 2) The $j$-th Malliavin operator is defined as follows. Let $1_j = 1_{[u(j),v(j)] \times E}$. The domain $\mathcal{R}_j = D(L_j)$ is the set of functionals $\Phi$ of the form
\[ \Phi = F(Y_{u(j)}, w_{t_1} - w_{t_0}, \ldots, w_{t_N} - w_{t_{N-1}}, (1_j\mu)(f_1), \ldots, (1_j\mu)(f_n)), \tag{5} \]
where $u(j) = t_0 \le t_1 \le \cdots \le t_N \le v(j)$, $f_i \in C^2_{K,v}(\mathbb{R}_+ \times E)$ (continuous functions with compact support, and of class $C^2$ in the $v \in E$-direction), and $F \in C^2_\uparrow(\mathbb{R}^{d_2+Nm+n})$. Clearly, $\mathcal{R}_j$ generates $\mathcal{B}_{[u(j),v(j)]}$. With an auxiliary function $\alpha : E \to \mathbb{R}_+$, we define $L_j$ by
\[ L_j\Phi = L^{(1)}_j\Phi + L^{(2)}_j\Phi, \]
where
\[ L^{(1)}_j\Phi = \frac{1}{2} \sum_{i=1}^N \mathrm{trace}\, \frac{\partial^2 F}{\partial x_i^2} (t_i - t_{i-1}) - \frac{1}{2} \sum_{i=1}^N \frac{\partial F}{\partial x_i} \cdot (w_{t_i} - w_{t_{i-1}}) \]
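and
\[ L^{(2)}_j\Phi = \frac{1}{2} \sum_{i=1}^n \frac{\partial F}{\partial x_i} (1_j\mu)\bigl( \alpha \Delta_v f_i + (\partial_v \alpha) \cdot \partial_v f_i \bigr) + \frac{1}{2} \sum_{i,j=1}^n \frac{\partial^2 F}{\partial x_i \partial x_j} (1_j\mu)\bigl( \alpha (\partial_v f_i) \cdot (\partial_v f_j) \bigr) \]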
for $\Phi \in \mathcal{R}_j$ having the form of (5). In this case, the reference variables are given by $\mathcal{Y}_j = Y_{v(j)}$ and $\mathcal{Z}_j = (Z^{u(j)}_{v(j)}, Y_{v(j)})$.

Put $\bar{X}_t = (Y_t, Z^{u(j)}_t)$; then (2) is written as
\[ \bar{X}_t = \bar{X}_{u(j)} + \bar{A}(\bar{X}_-) * t + \bar{B}(\bar{X}_-) * w_t + \bar{C}(\bar{X}_-) * \tilde{\mu}_t, \quad t \in [u(j), v(j)], \qquad \bar{X}_{u(j)} = (Y_{u(j)}, 0). \]
As in IV.10 of Bichteler et al. [2], let us consider a process $U^x_t$ defined by a stochastic differential equation corresponding to $\bar{X}$ with $\bar{X}_{u(j)} = x$ like (10-4) in [2]. Put $Q^x_t = \det(U^x_t)$ and $t_0 = v(j)$. Assume that there exists an open set $S$ in $\mathbb{R}^{d_2+d}$, $S \cap \{z = 0\} \ne \emptyset$, on which the mapping $x \mapsto E[|Q^x_{t_0}|^{-p}]$ is locally bounded for any $p > 1$. Then $\bar{X}(t_0, x)$ is nondegenerate uniformly in $S$ in the wide sense. Taking a truncation functional $\psi_j = \Psi(\bar{X}_{u(j)})$ with $\Psi \in C^\infty_K(\mathbb{R}^{d_2+d}; [0, 1])$ satisfying $\mathrm{supp}\, \Psi \subset S$ and $\mathrm{Int}(\mathrm{supp}\, \Psi) \cap \{z = 0\} \ne \emptyset$, we can apply Theorem 2 under Condition [A3′](i)–(ii) and the conditions of moments, and hence obtain an asymptotic expansion of $P[f(Z_T/\sqrt{T})]$. For details of this example, see [17].

5. Geometric mixing property of diffusion processes and asymptotic expansion

As seen in the previous section, the geometric mixing condition is a key to obtaining asymptotic expansions for functionals of stochastic processes. For a class of symmetric diffusions, this property was proved by using the spectral gap of the compact self-adjoint operator when the elements of the semigroup are of the Hilbert-Schmidt type. See Stroock [14], also Roberts and Tweedie [11]. The aim of this section is to prove that the geometric mixing property holds true for diffusion processes that are not necessarily symmetric.

In this section, we consider a $d$-dimensional diffusion process $X$ (we use the letter "X" here to denote a diffusion process, differently from the previous sections, where $X$ stood for a driving process with independent increments) defined as the strong solution of the following stochastic differential equation:
\[ dX(t, x) = \sum_{i=1}^r V_i(X(t, x)) \circ dw^i_t + V_0(X(t, x)) \, dt, \qquad X(0, x) = x, \]
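where $V_i \in C^\infty_B(\mathbb{R}^d; \mathbb{R}^d)$, $V_0 \in C^\infty(\mathbb{R}^d; \mathbb{R}^d)$ with $\nabla V_0 \in C^\infty_B(\mathbb{R}^d; \mathbb{R}^d \otimes \mathbb{R}^d)$, and $w = (w^i)$ is an $r$-dimensional Wiener process. We assume that

[C1] $\mathrm{Lie}[V_1, \ldots, V_r](x) = \mathbb{R}^d$ for all $x \in \mathbb{R}^d$.

Let
\[ L = \frac{1}{2} \sum_{i=1}^r V_i^2 + V_0. \]
The formal adjoint $L^*$ of $L$ can be written as
\[ L^* = \frac{1}{2} \sum_{i=1}^r V_i^2 + \tilde{V}_0 + U_0, \]
where $U_0 \in C^\infty_B(\mathbb{R}^d; \mathbb{R})$ and $\tilde{V}_0 \in C^\infty(\mathbb{R}^d; \mathbb{R}^d)$ is a vector field with $\nabla \tilde{V}_0 \in C^\infty_B(\mathbb{R}^d; \mathbb{R}^d \otimes \mathbb{R}^d)$. Moreover, we assume

[C2] there exists a function $\rho \in C^\infty_B(\mathbb{R}^d; \mathbb{R})$ such that $\rho > 0$, $\int_{\mathbb{R}^d} \rho(x) \, dx = 1$ and
\[ \limsup_{|x| \to \infty} \rho^{-1}(x) L^*\rho(x) < 0. \]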
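Condition [C2] can be checked by hand in simple cases. As an illustration of our own, take the one-dimensional Ornstein-Uhlenbeck diffusion $dX = -X\,dt + dw$, for which $V_1 = 1$, $V_0 = -x\,\partial_x$ and $L^*\rho = \frac{1}{2}\rho'' + (x\rho)'$. With the Cauchy-type density $\rho(x) = 1/(\pi(1 + x^2))$ one finds $\rho^{-1}L^*\rho = (3x^2 - 1)/(1 + x^2)^2 + (1 - x^2)/(1 + x^2) \to -1$ as $|x| \to \infty$. The Python sketch below evaluates the ratio by central finite differences; the function names and the step size `h` are hypothetical.

```python
import math


def rho(x):
    """Candidate weight rho(x) = 1 / (pi (1 + x^2)); positive, smooth, integral 1."""
    return 1.0 / (math.pi * (1.0 + x * x))


def adjoint_ratio(x, h=1e-3):
    """Finite-difference value of rho^{-1} L* rho at x for dX = -X dt + dw,
    where L* rho = rho'' / 2 + (x rho)'."""
    d2 = (rho(x + h) - 2.0 * rho(x) + rho(x - h)) / (h * h)
    d_xrho = ((x + h) * rho(x + h) - (x - h) * rho(x - h)) / (2.0 * h)
    return (0.5 * d2 + d_xrho) / rho(x)


def adjoint_ratio_exact(x):
    """Closed form (3x^2 - 1)/(1 + x^2)^2 + (1 - x^2)/(1 + x^2)."""
    return (3.0 * x * x - 1.0) / (1.0 + x * x) ** 2 + (1.0 - x * x) / (1.0 + x * x)


if __name__ == "__main__":
    for x in (1.0, 10.0, 100.0):
        print(x, adjoint_ratio(x), adjoint_ratio_exact(x))
```

Note that the Gaussian invariant density itself satisfies $L^*\rho = 0$ and hence does not fulfil [C2]; a weight with heavier tails, as above, is needed in this example.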
Let $P_t$ denote the semigroup associated with the operator $L$. We then have the following theorem:

Theorem 3. Suppose that Conditions [C1] and [C2] hold. Then
(1) there exists a unique invariant probability measure $\mu$ on $\mathbb{R}^d$ corresponding to $P_t$.
(2) $\mu$ has a $C^\infty$-density with respect to the Lebesgue measure, and
\[ \sup_{x \in \mathbb{R}^d} \rho(x)^{-1} \frac{d\mu}{dx}(x) < \infty. \]
(3) There exist positive constants $\lambda$ and $C$ such that
\[ \Bigl\| P_t f - \int_{\mathbb{R}^d} f \, d\mu \Bigr\|_{L^1(\rho\,dx)} \le C e^{-\lambda t} \|f\|_{L^1(\rho\,dx)} \]
for all $f \in C_B(\mathbb{R}^d; \mathbb{R})$.

We are now in a position to combine Theorem 3 with Theorem 2. For a diffusion process $X_t$ satisfying
\[ dX_t = \sum_{i=1}^r V_i(X_t) \circ dw^i_t + V_0(X_t) \, dt, \tag{6} \]
let $Z_t$ be defined by
\[ Z_t = Z_0 + \sum_{i=1}^r \int_0^t V'_i(X_s) \circ dw^i_s + \int_0^t V'_0(X_s) \, ds, \]
where $Z_0$ is $\sigma[X_0]$-measurable with $Z_0 \in \cap_{p>1} L^p(P)$ and $E[Z_0] = 0$, $V'_i \in C^\infty_\uparrow(\mathbb{R}^d; \mathbb{R}^{d'})$ and $V'_0 \in C^\infty_\uparrow(\mathbb{R}^d; \mathbb{R}^{d'})$. Moreover, the flow $Z(t, 0)$ is defined by the same equation corresponding to $X(t, x)$. As in Example 2, define the extended diffusion process $\bar{X}(t, x)$ by $\bar{X}(t, x) = (X(t, x), Z(t, 0))$; then it has a representation:
\[ d\bar{X}(t, x) = \sum_{i=1}^r \bar{V}_i(\bar{X}(t, x)) \circ dw^i_t + \bar{V}_0(\bar{X}(t, x)) \, dt. \]
Among several possible sufficient conditions for regularity, the Hörmander condition for the extended process $\bar{X}$ is convenient in practice. For vector fields $V_0, V_1, \ldots, V_r$, let $\Sigma_0 = \{V_1, \ldots, V_r\}$ and $\Sigma_n = \{[V_\alpha, V]; V \in \Sigma_{n-1}, \alpha = 0, 1, \ldots, r\}$ for $n \in \mathbb{N}$. Moreover, $\mathrm{Lie}[V_0; V_1, \ldots, V_r]$ denotes the linear manifold spanned by $\cup_{n=0}^\infty \Sigma_n$. The next theorem uses the following condition:

[C3] There exists an $x \in \mathbb{R}^d$ such that
\[ \mathrm{Lie}[\bar{V}_0; \bar{V}_1, \ldots, \bar{V}_r](x, 0) = \mathbb{R}^{d+d'}. \]

By using the relation between the Hörmander condition and the regularity of distributions (cf. Kusuoka-Stroock [10]), we obtain the following theorem.

Theorem 4. Let $X_t$ be a stationary diffusion process satisfying the stochastic differential equation (6). Assume Conditions [A1] with $\mathcal{B}^X_I$ for '$\mathcal{B}^Y_I$' (or [C1], [C2]), [C3] at an $x$ in the support of the invariant measure and [A2]. Then the asymptotic expansion given in Theorem 1 is valid if $d$ is replaced by $d'$.

6. Expansion for functionals admitting a stochastic expansion

Estimators of unknown parameters appearing in statistical inference are not in general normalized additive functionals themselves but have a stochastic expansion with the principal part being a normalized additive functional and the higher parts written as functions of the first term and other functionals. When we consider the maximum likelihood estimator, the Bayes estimator, etc., the higher-order terms are polynomials of normalized additive functionals, while other estimators such as U-statistics need another development of the asymptotic theory. Thus, by the delta method if necessary, we may without loss of generality consider the expansion corresponding to a sequence of random variables $S_T$ defined by
\[ S_T = \bar{Z}^{(0)}_T + \sum_{i=1}^k T^{-i/2} Q_i(\bar{Z}^{(0)}_T, \bar{Z}^{(1)}_T), \]
where $Q_i$ are $\mathbb{R}^{d^{(0)}}$-valued polynomials, $\bar{Z}^{(j)}_T = T^{-1/2} Z^{(j)}_T$, $j = 0, 1$, and $Z_T := (Z^{(0)}_T, Z^{(1)}_T)$ is a $d = d^{(0)} + d^{(1)}$-dimensional additive functional satisfying the
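measurability condition stated in Section 2 for the processes $X$ and $Y$. Moreover we assume that there exists a finite regressor $\mathcal{Y}_j$ for each interval $J_j = [v(j) - \epsilon, v(j)]$ as in Theorem 2. The coefficients of $Q_i$ may depend on $T$ if they are bounded.

Theorem 5. Let $M, \gamma, K > 0$. Suppose that Conditions [A1], [A2] and [A3′] hold. Then for any $k \in \mathbb{N}$, there exist smooth functions $q_{j,k,T} : \mathbb{R}^{d^{(0)}} \to \mathbb{R}$ such that $q_{0,k,T} = \phi(\cdot; 0, \mathrm{Cov}(\bar{Z}^{(0)}_T))$ and that for some $b > 0$ and $B > 0$,
\[ |q_{j,k,T}(y^{(0)})| \le B e^{-b|y^{(0)}|^2}, \]
and there exist constants $\delta > 0$ and $c > 0$ such that
\[ \Bigl| P[f(S_T)] - \int_{\mathbb{R}^{d^{(0)}}} f(y^{(0)}) \sum_{j=0}^k T^{-j/2} q_{j,k,T}(y^{(0)}) \, dy^{(0)} \Bigr| \le c\,\omega(f, T^{-K}) + \epsilon^{(k)}_T \]
for any $f \in \mathcal{E}(M, \gamma)$, where $\epsilon^{(k)}_T$ is a sequence of constants independent of $f$ with $\epsilon^{(k)}_T = o(T^{-\frac{1}{2}(k+\delta) \wedge K})$.

Remark 2. Sakamoto and Yoshida [12] gave expressions for $q_{j,2,T}$, $j = 0, 1, 2$:
\[
\begin{aligned}
q_{0,2,T}(y^{(0)}) &= \int_{\mathbb{R}^q} p_{T,2}(y) \, dy^{(1)}, \\
q_{1,2,T}(y^{(0)}) &= -\partial_a \int_{\mathbb{R}^q} p_{T,1}(y) Q^a_1(y) \, dy^{(1)}, \\
q_{2,2,T}(y^{(0)}) &= -\partial_a \int_{\mathbb{R}^q} p_{T,0}(y) Q^a_2(y) \, dy^{(1)} + \frac{1}{2} \partial_a \partial_b \int_{\mathbb{R}^q} p_{T,0}(y) Q^a_1(y) Q^b_1(y) \, dy^{(1)}.
\end{aligned}
\tag{7}
\]
Here $y = (y^{(0)}, y^{(1)})$, $p = d^{(0)}$, $q = d^{(1)}$ and functions $p_{T,j}$, $j = 0, 1, 2$, are defined, with the summation convention, by:
\[
\begin{aligned}
p_{T,0}(z) &= \phi(z; 0, \Sigma_T), \\
p_{T,1}(z) &= \phi(z; 0, \Sigma_T) \Bigl( 1 + \frac{1}{6} \lambda^{\alpha\beta\gamma} h_{\alpha\beta\gamma}(z; \Sigma_T) \Bigr), \\
p_{T,2}(z) &= p_{T,1}(z) + \phi(z; 0, \Sigma_T) \Bigl( \frac{\lambda^{\alpha\beta\gamma\delta}}{24} h_{\alpha\beta\gamma\delta}(z; \Sigma_T) + \frac{\lambda^{\alpha\beta\gamma} \lambda^{\delta\epsilon\sigma}}{72} h_{\alpha\beta\gamma\delta\epsilon\sigma}(z; \Sigma_T) \Bigr),
\end{aligned}
\]
where $\Sigma_T = \mathrm{Cov}(\bar{Z}_T)$ and the Hermite polynomials $h_{\alpha_1 \cdots \alpha_k}(z; \Sigma_T)$ are defined by
\[ h_{\alpha_1 \cdots \alpha_k}(z; \Sigma_T) = (-1)^k \phi(z; 0, \Sigma_T)^{-1} \partial_{\alpha_1} \cdots \partial_{\alpha_k} \phi(z; 0, \Sigma_T), \]
and $\lambda^{\alpha_1 \cdots \alpha_k}$ denotes the $(\alpha_1 \cdots \alpha_k)$-cumulant of $\bar{Z}_T$. Moreover, it is possible to show that Formulas (7) are valid even when there is a linear relation between the ancillary elements $Z^{(1)}$ if one interprets Formulas (7) with Schwartz distribution theory; thus it extends Theorem 5. Such an extension is necessary when we treat the maximum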
likelihood estimator in the context of the M-estimator, cf. Sakamoto and Yoshida [13]. In [12], they also directly obtained the third order expansion formula for the maximum likelihood estimator for a diffusion process.

Remark 3. The approach adopted in this paper is the "local approach", which uses the Malliavin calculus over short time intervals. By contrast, it is also possible to take the "global approach", which applies the Malliavin calculus directly to functionals defined over a global time interval. The advantage of the global approach is that it applies in various situations with or without a mixing condition or Markovian property; examples are in [15, 16]. However, if those conditions are assumed, the present "local approach" provides a more effective way to the solution and relaxes conditions such as the strong contractivity condition of [16].

7. Proofs

Lemma 1. Suppose that Condition [A1] holds true. Then there exists a positive constant $a$ such that

[A1′]
\[ \bigl\| P_{\mathcal{B}_{[s-\epsilon,s]}}[f] - P[f] \bigr\|_{L^1(P)} \le a^{-1} e^{-a(t-s)} \|f\|_\infty \]
for any $s, t \in \mathbb{R}_+$, $s \le t$, and any $f \in B\mathcal{B}_{[t,\infty)}$.

Proof. Since $X$ has independent increments, when $\epsilon \le u \le v$, $P_{\mathcal{B}_{[0,u]}}[C] \in B\mathcal{B}^Y_{[u-\epsilon,u]}$ for every $C \in B\mathcal{B}_{[v,\infty)}$. In particular, for $t \in [\epsilon, \infty)$ and $C \in B\mathcal{B}_{[t,\infty)}$, there exists a measurable $C' \in B\mathcal{B}^Y_{[t-\epsilon,t]}$ such that $\|C'\|_\infty \le \|C\|_\infty$ and that $C' = P_{\mathcal{B}_{[0,t]}}[C]$ a.s. Let $\epsilon \le s \le t - \epsilon$. Then, in the same fashion, we see that $P_{\mathcal{B}_{[0,s]}}[C'] \in B\mathcal{B}^Y_{[s-\epsilon,s]}$; hence $P_{\mathcal{B}_{[0,s]}}[C'] = P_{\mathcal{B}^Y_{[s-\epsilon,s]}}[C']$ a.s., and it equals $P_{\mathcal{B}_{[s-\epsilon,s]}}[C']$. Therefore, by using Condition [A1], we obtain
\[ \bigl\| P_{\mathcal{B}_{[s-\epsilon,s]}}[C] - P[C] \bigr\|_{L^1(P)} \le a^{-1} e^{a\epsilon - a(t-s)} \|C\|_\infty. \qquad \Box \]
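As stated in Remark 3, the approach taken here is the "local approach". To reduce the estimate of the characteristic function of $T^{-1/2} Z_T$ to those over short time intervals, we will later use the following lemma.

Lemma 2. Let $(\Omega, \mathcal{F}, P)$ be a probability space, and $\{\mathcal{B}_I; I \subset \mathbb{R}_+\}$ an increasing family of sub σ-fields of $\mathcal{F}$, i.e., $\mathcal{B}_I \subset \mathcal{B}_J$ if $I \subset J$.

(1) Let $u \ge \epsilon$. Suppose that
\[ P_{\mathcal{B}_{[0,u]}}[g] = P_{\mathcal{B}'_{[u-\epsilon,u]}}[g] \]
for any $g \in B\mathcal{B}_{[u,\infty)}$, where $\mathcal{B}'_{[u-\epsilon,u]}$ is a sub σ-field of $\mathcal{B}_{[u-\epsilon,u]}$. Then for $f \in B\mathcal{B}_{[0,u]}$ and $g \in B\mathcal{B}_{[u,\infty)}$,
\[ P_{\mathcal{B}_{[u-\epsilon,u]}}[fg] = P_{\mathcal{B}_{[u-\epsilon,u]}}[f] \cdot P_{\mathcal{B}'_{[u-\epsilon,u]}}[g]. \]
In particular,
\[ P_{\mathcal{B}'_{[u-\epsilon,u]}}[fg] = P_{\mathcal{B}'_{[u-\epsilon,u]}}[f] \cdot P_{\mathcal{B}'_{[u-\epsilon,u]}}[g]. \]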
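(2) Let $\epsilon \le u \le v - \epsilon$, and let $I = [u-\epsilon, u]$ and $J = [v-\epsilon, v]$. Suppose that $\mathcal{B}^0_{[u-\epsilon,u]} \subset \mathcal{B}_{[u-\epsilon,u]}$ and $\mathcal{B}^0_{[v-\epsilon,v]} \subset \mathcal{B}_{[v-\epsilon,v]}$ are sub $\sigma$-fields, and that
\[ P_{\mathcal{B}_{[0,s]}}[g'] = P_{\mathcal{B}^0_{[s-\epsilon,s]}}[g'] \]
for all $g' \in B\mathcal{B}_{[s,\infty)}$, $s = u, v$. Then, for $f \in B\mathcal{B}_{[0,u]}$, $g \in B\mathcal{B}_{[u,v]}$ and $h \in B\mathcal{B}_{[v,\infty)}$,
a) $P_{\mathcal{B}^0_I \vee \mathcal{B}^0_J}[f] = P_{\mathcal{B}^0_I}[f]$ and $P_{\mathcal{B}^0_I \vee \mathcal{B}^0_J}[h] = P_{\mathcal{B}^0_J}[h]$;
b) $P_{\mathcal{B}^0_I \vee \mathcal{B}^0_J}[fh] = P_{\mathcal{B}^0_I \vee \mathcal{B}^0_J}[f]\, P_{\mathcal{B}^0_I \vee \mathcal{B}^0_J}[h] = P_{\mathcal{B}^0_I}[f]\, P_{\mathcal{B}^0_J}[h]$;
c) $P_{\mathcal{B}_{[0,u]} \vee \mathcal{B}_{[v,\infty)}}[g] = P_{\mathcal{B}_{[0,u]} \vee \mathcal{B}_{[v,\infty)}}\big[ P_{\mathcal{B}^0_I \vee \mathcal{B}^0_J}[g] \big]$, or equivalently $P[fgh] = P[f\, P_{\mathcal{B}^0_I \vee \mathcal{B}^0_J}[g]\, h]$.

Proof. (1) By assumption, one has $P_{\mathcal{B}_{[0,u]}}[fg] = f\, P_{\mathcal{B}^0_{[u-\epsilon,u]}}[g]$. Applying the operator $P_{\mathcal{B}_{[u-\epsilon,u]}}$ yields the result.

(2) For simplicity, we will use $P_I$ for $P_{\mathcal{B}^0_I}$, $P_J$ for $P_{\mathcal{B}^0_J}$ and $P_{I \vee J}$ for $P_{\mathcal{B}^0_I \vee \mathcal{B}^0_J}$, respectively.

(a) As for the second part, $P_{I \vee J}[h] = P_{I \vee J} P_{\mathcal{B}_{[0,v]}}[h] = P_J[h]$. Next, for all $i \in B\mathcal{B}^0_I$ and $j \in B\mathcal{B}^0_J$, (1) implies that $P_I[fij] = i P_I[f] P_I[j] = P_I[ij P_I[f]]$, and hence that $P[ijf] = P[ij P_I[f]]$, and we obtain the first part.

(b) For $i, j$ given above,
\[ \begin{aligned}
P[ij\, P_{I \vee J}[f]\, P_{I \vee J}[h]] &= P[P_{I \vee J}[P_{I \vee J}[f]\, hij]] = P[ij\, P_I[f]\, h] \\
&= P[P_I[fi]\, hj] = P[P_I[fi]\, P_I[hj]] \\
&= P[P_I[fihj]] \quad \text{(by (1))} \\
&= P[ijfh] .
\end{aligned} \]
Consequently, we have the desired result.

(c) By assumption, we see that
\[ P_I[gh] = P_I[g P_{\mathcal{B}_{[0,v]}}[h]] = P_I[g P_J[h]] = P_I[P_{I \vee J}[g] P_J[h]] . \]
This together with (1) and (2b) implies that
\[ \begin{aligned}
P[fgh] &= P[P_I[f] P_I[gh]] = P[P_I[f] P_{I \vee J}[g] P_J[h]] \\
&= P[P_{I \vee J}[fh] P_{I \vee J}[g]] = P[fh P_{I \vee J}[g]] \\
&= P\big[fh\, P_{\mathcal{B}_{[0,u]} \vee \mathcal{B}_{[v,\infty)}}[P_{I \vee J}[g]]\big] ,
\end{aligned} \]
which completes the proof. □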
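The factorization in Lemma 2 (1) can be checked by hand in a toy setting. The following is a minimal sketch, not part of the paper: it assumes a hypothetical two-state Markov chain $(X_0, X_1, X_2)$ with made-up transition probabilities, takes $\sigma(X_0, X_1)$ as the past, $\sigma(X_1, X_2)$ as the future and $\sigma(X_1)$ in the role of $\mathcal{B}^0_{[u-\epsilon,u]}$, and verifies by exact enumeration that the conditional expectation of $fg$ given $X_1$ equals the product of the conditional expectations.

```python
from fractions import Fraction
from itertools import product

# Hypothetical two-state Markov chain (X0, X1, X2); all numbers are illustrative.
init = [Fraction(1, 3), Fraction(2, 3)]
trans = [[Fraction(3, 4), Fraction(1, 4)],
         [Fraction(2, 5), Fraction(3, 5)]]


def path_prob(x0, x1, x2):
    """Probability of the path (x0, x1, x2) under the Markov chain."""
    return init[x0] * trans[x0][x1] * trans[x1][x2]


def cond_exp_given_x1(h, x1):
    """E[h(X0, X1, X2) | X1 = x1], computed by exact enumeration."""
    pairs = list(product(range(2), repeat=2))
    num = sum(path_prob(a, x1, c) * h(a, x1, c) for a, c in pairs)
    den = sum(path_prob(a, x1, c) for a, c in pairs)
    return num / den


def f(a, b, c):
    """Depends on the past and present only (plays the role of f)."""
    return Fraction(2 * a + b + 1)


def g(a, b, c):
    """Depends on the present and future only (plays the role of g)."""
    return Fraction(3 * c - b)


def factorization_gap(x1):
    """E[fg | X1] - E[f | X1] E[g | X1]; zero when the factorization holds."""
    lhs = cond_exp_given_x1(lambda a, b, c: f(a, b, c) * g(a, b, c), x1)
    rhs = cond_exp_given_x1(f, x1) * cond_exp_given_x1(g, x1)
    return lhs - rhs


gaps = [factorization_gap(x1) for x1 in range(2)]
```

Exact rational arithmetic is used so that the identity is tested without rounding error; the chain, the functions `f` and `g`, and all helper names are assumptions made for this illustration only.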
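Proof of Theorem 1. Let $F$ denote any functional taking the form of (4). Let $\mathcal{B}_j = \mathcal{B}_{I_j} \vee \mathcal{B}_{J_j}$, and $g_j = P_{\mathcal{B}_j}[e^{iu\cdot Z_j}\psi_j]$. We see that $\|\Gamma_{L_j}(F,F)\|_1 \le 2\|F\|_2\|L_jF\|_2 = 0$ and hence
\[ |\Gamma_{L_j}(F, Z_j^k)| \le \Gamma_{L_j}(F,F)^{1/2}\, \Gamma_{L_j}(Z_j^k, Z_j^k)^{1/2} = 0 ; \]
therefore, $\Gamma_{L_j}(F, Z_j^k) = 0$. When $S_1[\psi_j; Z_j] \subset D^{L_j}_{2,\infty-}$,
\[ i u_k P[e^{iu\cdot Z_j}\psi_j F] = P[e^{iu\cdot Z_j} J_k^{Z_j}(\psi_j F)] , \]
where
\[ J_k^{Z_j}(\psi_j F) = -\sum_{k'=1}^{d} \Big\{ 2\Delta_{Z_j}^{-1}\psi_j F\, \sigma^{Z_j}_{[k,k']} L_j Z_j^{k'} + F\, \Gamma_{L_j}\big(\Delta_{Z_j}^{-1}\psi_j \sigma^{Z_j}_{[k,k']}, Z_j^{k'}\big) \Big\} = G_{j,k} F \quad \text{(say)} . \]
Therefore,
\[ |u|\,|P[g_j F]| \le \sum_{k=1}^{d} \|G_{j,k} F\|_{L^1(\Omega, \mathcal{B}_{[u(j)-\epsilon, v(j)]}, P)} ; \]
since the family of $F$'s is dense in $L^p(\Omega, \mathcal{B}_j, P)$, $p > 1$,
\[ \|g_j\|_q \le |u|^{-1} \sum_{k=1}^{d} \|G_{j,k}\|_q \]
for any $q > 1$.

Choose a smooth function $\phi : \mathbb{R}^d \to [0,1]$ so that $\phi(x) = 1$ if $|x| \le 1/2$, and $\phi(x) = 0$ if $|x| \ge 1$. Let $\beta$ be a positive constant with $\beta < 1/2$. Define a functional $\Psi_j$ depending on $T$ by $\Psi_j = \psi_j \phi(Z_j/T^\beta)$, and let $Z_j^* = Z_j \phi(Z_j/(2T^\beta)) - P[Z_j \phi(Z_j/(2T^\beta))]$. Since $\cup_{j,T} S_{1,j}$ is bounded in $L^p(P)$, $p > 1$, we obtain for $g_j = P_{\mathcal{B}_j}[e^{iu\cdot Z_j}\Psi_j]$,
\[ \begin{aligned}
\sup_j P\big[\big|P_{\mathcal{B}_j}[e^{iu\cdot Z_j^*}]\big|\big] &\le \sup_j P[|g_j|] + \sup_j P\big[\big|P_{\mathcal{B}_j}[e^{iu\cdot Z_j^*}(1 - \Psi_j)]\big|\big] \\
&\le C|u|^{-1} + \sup_j \|1 - \Psi_j\|_1 \\
&\le C|u|^{-1} + \sup_j \|1 - \psi_j\|_1 + \sup_j P[|Z_j| > T^\beta/2] \\
&\le C|u|^{-1} + 1 - \inf_{j,T} P[\psi_j] + C T^{-\beta} ,
\end{aligned} \]
where $C$ is a constant independent of $u$ and $T$. Consequently,
\[ \sup_{\substack{j = 1, \dots, n(T) \\ T \ge T_0}} P\big[\big|P_{\mathcal{B}_j}[e^{iu\cdot Z_j^*}]\big|\big] \le c \tag{8} \]
for $|u| \ge b$, where $c < 1$, $b > 0$ and $T_0 > 0$ are some constants.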
Fix $e' > 0$ arbitrarily, and let $0 < \nu_1 < e' \wedge 1$. By assumption, we can find $j_1, j_2, \dots, j_{n'} \in \{1, 2, \dots, n(T)\}$ such that for large $T$, $v(j_l) + T^{\nu_1} \le u(j_{l+1})$ for $l = 1, 2, \dots, n'$, and that $n' \ge B T^{1-\nu_1}$, where $B$ is a positive constant depending only on $\tau$, $\liminf_{T\to\infty} n(T)/T$ and $\tau_1 := \sup_{j,T}\{v(j) - u(j)\}$. Indeed, put $j_1 = 1$ and $j_i = \min\{j ;\ u(j) \ge v(j_{i-1}) + T^{\nu_1}\}$ as far as it can be defined. Then one has $n'(T^{\nu_1} + 2\tau_1) \ge n(T)\tau$, which yields that for some $B > 0$ and large $T$, $n' \ge B T^{1-\nu_1}$.

Divide each one of the intervals $[0, u(j_1)], [v(j_1), u(j_2)], \dots, [v(j_{n'}), T]$ into subintervals with length $\tau$ except for the last interval with length at most $\tau$, and call them $I_{0,1}, \dots, I_{0,k_0};\ I_{1,1}, \dots, I_{1,k_1};\ \dots;\ I_{n',1}, \dots, I_{n',k_{n'}}$. Let
\[ Z^*_{I_{l,k}} = Z_{I_{l,k}}\, \phi(Z_{I_{l,k}}/(2T^\beta)) - P[Z_{I_{l,k}}\, \phi(Z_{I_{l,k}}/(2T^\beta))] , \]
where $Z_I$ denotes $Z^s_t$ for interval $I = [s,t]$. Put $I_{0,0} = [0]$ and define $Z^*_{I_{0,0}}$ similarly for $Z_{I_{0,0}} = Z_0$. Line up the intervals $I_{l,k}$ and $[u(j_l), v(j_l)]$, and call them $T_1, T_2, \dots, T_S$ from the left. For $k \in \mathbb{Z}_+$, choose any $k$ numbers $s_1, \dots, s_k$ from $\mathcal{C} := \{1, 2, \dots, S\}$ with replacement. Let
\[ \mathcal{C}_1 = \{ n \in \mathcal{C} : T_n = [u(j_l), v(j_l)] \text{ for some } j_l \text{ and } T_n \notin \{T_{s_1}, \dots, T_{s_k}\} \} , \]
and let $\mathcal{B}^0_l = \mathcal{B}_{[\min T_l - \epsilon, \min T_l]} \vee \mathcal{B}_{[\max T_l - \epsilon, \max T_l]}$. We will estimate
\[ E\big[ Z^{*(i_1)}_{T_{s_1}} \cdots Z^{*(i_k)}_{T_{s_k}}\, e^{iu\cdot \tilde{Z}^*_T} \big] , \]
where $\tilde{Z}^*_T = Z^*_T/\sqrt{T}$, $Z^*_T = \sum_{s=1}^{S} Z^*_{T_s}$, and $i_1, \dots, i_k \in \{1, \dots, d\}$.

For $\mathcal{B}_{[\min T_l - \epsilon, \max T_l]}$-measurable random variables $A_l$, $l \in \mathcal{C}_1 = \{l_1, \dots, l_{\#\mathcal{C}_1}\}$, with $\|A_l\|_\infty \le 1$, we see from Lemma 1 that
\[ \begin{aligned}
&\Big| P\Big[\prod_{l \le l_i} A_l\Big] \prod_{l \ge l_{i+1}} P[A_l] - P\Big[\prod_{l \le l_{i-1}} A_l\Big] \prod_{l \ge l_i} P[A_l] \Big| \\
&\quad = \Big| P\Big[\prod_{l \le l_{i-1}} A_l \big\{ P_{\mathcal{B}_{[0, \max T_{l_{i-1}}]}}[A_{l_i}] - P[A_{l_i}] \big\}\Big] \prod_{l \ge l_{i+1}} P[A_l] \Big| \\
&\quad \le \big\| P_{\mathcal{B}_{[\max T_{l_{i-1}} - \epsilon,\ \max T_{l_{i-1}}]}}[A_{l_i}] - P[A_{l_i}] \big\|_1 \\
&\quad \le a^{-1} e^{-a(T^{\nu_1} - \epsilon)}
\end{aligned} \]
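for large $T$. It follows from Lemma 2 that for some $a > 0$,
\[ \begin{aligned}
\Big| P\big[ Z^{*(i_1)}_{T_{s_1}} \cdots Z^{*(i_k)}_{T_{s_k}}\, e^{iu\cdot \tilde{Z}^*_T} \big] \Big|
&= \Big| P\Big[ Z^{*(i_1)}_{T_{s_1}} \cdots Z^{*(i_k)}_{T_{s_k}} \prod_{l \in \mathcal{C} - \mathcal{C}_1} e^{iu\cdot \tilde{Z}^*_{T_l}} \times \prod_{l \in \mathcal{C}_1} P_{\mathcal{B}^0_l}\big[ e^{iu\cdot \tilde{Z}^*_{T_l}} \big] \Big] \Big| \qquad (\tilde{Z}^*_{T_l} = Z^*_{T_l}/\sqrt{T}) \\
&\le 4^k T^{k\beta}\, P\Big[ \prod_{l \in \mathcal{C}_1} \big| P_{\mathcal{B}^0_l}[e^{iu\cdot \tilde{Z}^*_{T_l}}] \big| \Big] \\
&\le 4^k T^{k\beta} \Big\{ \prod_{l \in \mathcal{C}_1} P\big[ \big| P_{\mathcal{B}^0_l}[e^{iu\cdot \tilde{Z}^*_{T_l}}] \big| \big] + n' a^{-1} e^{-a(T^{\nu_1} - \epsilon)} \Big\} \\
&\le 4^k T^{k\beta} \big( \max\{ e^{-b_0 |u|^2/T}, c \} \big)^{n' - k} + 4^k T^{k\beta} n' a^{-1} e^{-a(T^{\nu_1} - \epsilon)} \\
&\le \delta^{-1} c^{T^{(1-\nu_1)/2}} + \delta^{-1} e^{-T^{e' - \nu_1}} + \delta^{-1} e^{-T^{\delta}}
\end{aligned} \]
if $|u| > T^{e'}$ and $T > T_0$, where $b_0$, $T_0$ and $\delta$ are some positive values. Here, in the third inequality, we used Petrov's lemma (Lemma (3.2) of Götze and Hipp [7]). Put $H_T(u) = P[\exp(iu \cdot T^{-1/2} Z^*_T)]$. From the above inequality, it follows that for every positive $c_1$, $c_2$, $e$ and $E$, there exists a positive constant $\delta$ such that
\[ |D^\alpha H_T(u)| \le \delta^{-1} e^{-T^\delta} \tag{9} \]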
for $u \in \mathbb{R}^d$, $c_1 T^e \le |u| \le c_2 T^E$, and $\alpha \in \mathbb{Z}^d_+$, $|\alpha| \le k$, where $D^\alpha = D_1^{\alpha_1} \cdots D_d^{\alpha_d}$, $D_i = \partial/\partial u_i$, with $\alpha = (\alpha_1, \dots, \alpha_d)$. Thus the validity of the asymptotic expansion follows from a continuous version of Theorem 2.8 of Götze and Hipp [7]. In fact, their Condition (2.3) can be immediately checked, and Lemma 1 implies Condition (2.4): for any $e \in \mathcal{B}_{[0,s]}$ and $f \in \mathcal{B}_{[t,\infty)}$ with $\|e\|_\infty \le 1$ and $\|f\|_\infty \le 1$,
\[ \begin{aligned}
|P[ef] - P[e]P[f]| &= \big| P\big[ e\, P_{\mathcal{B}_{[0,s]}}[f - P[f]] \big] \big| \\
&\le \big\| P_{\mathcal{B}_{[0,s]}}[f - P[f]] \big\|_{L^1(P)} \\
&= \big\| P_{\mathcal{B}_{[s-\epsilon,s]}}[f - P[f]] \big\|_{L^1(P)} \quad \text{(the proof of Lemma 1)} \\
&\le a^{-1} \exp(-a(t - s)) .
\end{aligned} \]
Therefore, it is possible to obtain the same estimate as Lemma (3.33) of Götze and Hipp [7]. Instead of Conditions (2.5) and (2.6) in Götze and Hipp [7], under the present assumptions, we already have the estimate (9) corresponding to Lemma (3.43) of Götze and Hipp [7]. We then obtain the desired result as they did from Lemmas (3.33) and (3.43). Jensen [9] gave a good exposition of Götze and Hipp's work. □
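The choice of well-separated blocks in the proof above ($j_1 = 1$ and $j_i = \min\{j ;\ u(j) \ge v(j_{i-1}) + T^{\nu_1}\}$) is a greedy scan. The following is a minimal sketch of that scan, assuming the blocks $[u(j), v(j)]$ are given as a list sorted by left endpoint; the function name, the zero-based indexing and the sample intervals are hypothetical and chosen only for illustration.

```python
def select_separated_blocks(blocks, gap):
    """Greedy selection of blocks separated by at least `gap`.

    `blocks` is a list of (u, v) pairs sorted by u. The first block is always
    taken; afterwards the first block whose left endpoint u is at least
    `gap` beyond the right endpoint v of the previously chosen block is taken.
    Returns the (zero-based) indices of the chosen blocks.
    """
    chosen = []
    last_end = None
    for j, (u, v) in enumerate(blocks):
        if last_end is None or u >= last_end + gap:
            chosen.append(j)
            last_end = v
    return chosen


# Illustrative numbers only: T = 10^4, nu_1 = 1/2, blocks [10 j, 10 j + 1].
T = 10_000
nu1 = 0.5
blocks = [(10 * j, 10 * j + 1) for j in range(1000)]
chosen = select_separated_blocks(blocks, T ** nu1)
```

In this example the scan keeps every eleventh block, so the number of selected blocks is of order $T^{1-\nu_1}$, in line with the lower bound $n' \ge B T^{1-\nu_1}$ stated in the proof; the constant obtained here is specific to the made-up intervals.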
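Proof of Theorem 2. We assume that $S_1[\psi_j; Z_j] \subset D^{L_j}_{2,\infty-}$, in particular, $\Delta_{Z_j}^{-1}\psi_j \in D^{L_j}_{2,\infty-}$, $\Delta_{Z_j} = \det \sigma_{Z_j}$. The matrix $(\gamma^{mn}_{Y_j})$ denotes the inverse matrix of $\sigma_{Y_j}$. Let $\phi \in D^{L_j}_{2,\infty-}$ and assume that $\phi \det \sigma_{Y_j}^{-1} \in D^{L_j}_{2,\infty-}$. Then
\[ \phi\, \Gamma_{L_j}(Z_{j,l}, F) = \sum_{m,n=1}^{M_j} \phi\, \Gamma_{L_j}(Z_{j,l}, Y_{j,m})\, \gamma^{mn}_{Y_j}\, \Gamma_{L_j}(Y_{j,n}, F) \tag{10} \]
for functionals $F$ taking the form of $F = f(X_{u_k} - X_{u_{k-1}}, Y_{u_k} : 1 \le k \le m_1)\, g(Y_j)$ for $f \in C_B^\infty(\mathbb{R}^{(d_1+d_2)m_1})$ and $g \in C_B^\infty(\mathbb{R}^{M_j})$. The integration-by-parts formula yields
\[ \sum_p i u_p P[e^{iu\cdot Z_j} \sigma^{pq}_{Z_j} \phi F] = P[e^{iu\cdot Z_j} \{ -2\phi F L_j Z_{j,q} - \Gamma_{L_j}(\phi F, Z_{j,q}) \}] . \tag{11} \]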
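Since for $A, B, C \in D^{L_j}_{2,\infty-}$,
\[ P[A\, \Gamma_{L_j}(B, C)] = P[\{ -\Gamma_{L_j}(A, B) - 2A L_j B \} C] , \]
we obtain
\[ \begin{aligned}
&\sum_p i u_p P[e^{iu\cdot Z_j}\, \Gamma_{L_j}(Y_{j,k}, Z_{j,p})\, \gamma^{kl}_{Y_j}\, \Gamma_{L_j}(Y_{j,l}, Z_{j,q})\, \phi F] \\
&\quad = P[\Gamma_{L_j}(Y_{j,k}, e^{iu\cdot Z_j})\, \gamma^{kl}_{Y_j}\, \Gamma_{L_j}(Y_{j,l}, Z_{j,q})\, \phi F] \\
&\quad = P[e^{iu\cdot Z_j} \{ -\Gamma_{L_j}(Y_{j,k}, \gamma^{kl}_{Y_j} \Gamma_{L_j}(Y_{j,l}, Z_{j,q}) \phi F) - 2\gamma^{kl}_{Y_j} \Gamma_{L_j}(Y_{j,l}, Z_{j,q})\, \phi F L_j Y_{j,k} \}] .
\end{aligned} \]
On $\{\phi > 0\}$, define $\bar{\sigma}_{Z_j}$ by
\[ \bar{\sigma}^{pq}_{Z_j} = \sigma^{pq}_{Z_j} - \sum_{k,l} \Gamma_{L_j}(Y_{j,k}, Z_{j,p})\, \gamma^{kl}_{Y_j}\, \Gamma_{L_j}(Y_{j,l}, Z_{j,q}) . \tag{12} \]
It follows from (11), (12) and (10) that
\[ \sum_p i u_p P[e^{iu\cdot Z_j} \bar{\sigma}^{pq}_{Z_j} \phi F] = P[e^{iu\cdot Z_j} \Psi^q_j(\phi) F] , \tag{13} \]
where
\[ \Psi^q_j(\phi) = \sum_{k,l} \Gamma_{L_j}(Y_{j,k}, \gamma^{kl}_{Y_j} \Gamma_{L_j}(Y_{j,l}, Z_{j,q}) \phi) - \Gamma_{L_j}(\phi, Z_{j,q}) - 2\phi L_j Z_{j,q} + 2 \sum_{k,l} \gamma^{kl}_{Y_j} \Gamma_{L_j}(Y_{j,l}, Z_{j,q})\, \phi L_j Y_{j,k} . \]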
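Since $\det \bar{\sigma}_{Z_j}^{-1} = \det \sigma_{Y_j} \cdot \det \sigma_{Z_j}^{-1}$, $\phi' := (\det \bar{\sigma}_{Z_j})^{-1} \bar{\sigma}_{j,[q,s]} \psi_j \in D^{L_j}_{2,\infty-}$, where $\bar{\sigma}_{j,[q,s]}$ is the $(q,s)$-cofactor of $\bar{\sigma}_{Z_j}$, and $\phi' \det \sigma_{Y_j}^{-1} \in D^{L_j}_{2,\infty-}$. Substituting $\phi'$ into $\phi$ of (13), and summing up, we obtain
\[ i u_p P[e^{iu\cdot Z_j} \psi_j F] = P[e^{iu\cdot Z_j} G_{j,p} F] , \]
where
\[ G_{j,p} = \sum_q \Psi^q_j\big( (\det \bar{\sigma}_{Z_j})^{-1} \bar{\sigma}_{j,[q,p]} \psi_j \big) . \]
Taking $g_j = P_{\mathcal{B}^0_{I_j} \vee \mathcal{B}^0_{J_j}}[e^{iu\cdot Z_j} \Psi_j]$ and with the help of Lemma 2, it is possible to obtain the result in the same fashion as Theorem 1. □

Remark 4. For $L_j$, define another operator $(\mathcal{L}_j, D(\mathcal{L}_j))$ by $D(\mathcal{L}_j) = \{F : F \in D(L_j),\ \Gamma_{L_j}(F, F) \in D(L_j)\}$ and
\[ \mathcal{L}_j F = L_j F - \frac{1}{2} \sum_{k,l=1}^{M_j} \Gamma_{L_j}\big( \Gamma_{L_j}(Y_{j,k}, F)\, \gamma^{kl}_{Y_j},\ Y_{j,l} \big) - \sum_{k,l=1}^{M_j} \Gamma_{L_j}(Y_{j,k}, F)\, \gamma^{kl}_{Y_j}\, L_j Y_{j,l} . \]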
Suppose that $Y_j$ is nondegenerate and $Y_{j,l}, \sigma^{kl}_{Y_j} \in D(L_j)$. Then the operator defined above, together with its domain, is a Malliavin operator if that domain generates $\mathcal{B}_{[u(j)-\epsilon, v(j)]}$. It is then also possible to obtain the same result as in Theorem 1 as a corollary of it.

Proof of Theorem 3. Step 1. Define a stochastic flow $X^*(t,x)$ by the stochastic differential equation
\[ dX^*(t,x) = \sum_{i=1}^{r} V_i(X^*(t,x)) \circ dw^i_t + \tilde{V}_0(X^*(t,x))\,dt , \qquad X^*(0,x) = x . \]
Under Condition [C1], $X^*(s,x)$ is nondegenerate uniformly in every compact set in $(0,\infty) \times \mathbb{R}^d$. It follows from Condition [C1] that there exists $p \in C^\infty((0,\infty) \times \mathbb{R}^d \times \mathbb{R}^d)$ such that
\[ P_t f(x) = \int_{\mathbb{R}^d} p(t,x,y) f(y)\,dy \]
for $f \in C_B(\mathbb{R}^d; \mathbb{R})$. Put
\[ P^*_t f(x) = E\Big[ \exp\Big( \int_0^t U_0(X^*(s,x))\,ds \Big) f(X^*(t,x)) \Big] \]
for $f \in C_B(\mathbb{R}^d; \mathbb{R})$. Then the Feynman-Kac formula says that for $u_t(x) = P^*_t f(x)$,
\[ \frac{\partial u_t}{\partial t} = L^* u_t , \qquad u_0 = f . \]
Since for $f, g \in C_K(\mathbb{R}^d; \mathbb{R})$, with $u_t = P^*_t f$ and $v_t = P_t g$,
\[ \frac{d}{dt} \int (u_t v_{T-t})\,dx = \int (v_{T-t} L^* u_t - u_t L v_{T-t})\,dx = 0 , \]
we have $(P^*_t f, g)_{L^2(dx)} = (f, P_t g)_{L^2(dx)}$, and hence we see that
\[ P^*_t f(x) = \int_{\mathbb{R}^d} p(t,y,x) f(y)\,dy \]
for $f \in C_B(\mathbb{R}^d; \mathbb{R})$.

Step 2. Let
\[ \epsilon = \Big( -\frac{1}{4} \limsup_{|x| \to \infty} \rho^{-1} L^* \rho(x) \Big) \wedge 1 \]
and let $U_1(x) = -\rho^{-1}(x) L^* \rho(x)$. Then there exists $R > 0$ such that $U_1(x) \ge 2\epsilon$ if $|x| > R$, and $(L^* + U_1)\rho = 0$. The Feynman-Kac formula again yields
\[ \rho(x) = E\Big[ \exp\Big( \int_0^t (U_0 + U_1)(X^*(s,x))\,ds \Big) \rho(X^*(t,x)) \Big] . \tag{14} \]
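Take a function $\varphi \in C_K^\infty(\mathbb{R}^d)$ satisfying that $0 \le \varphi \le 1$ and that $\varphi(x) = 1$ if $|x| \le R$. Let $U_2(x) = \varphi(x)(U_1(x) - 2\epsilon) \in C_K^\infty(\mathbb{R}^d)$, then
\[ U_1(x) = 2\epsilon + (1 - \varphi(x))(U_1(x) - 2\epsilon) + U_2(x) \ge 2\epsilon + U_2(x) \tag{15} \]
for $x \in \mathbb{R}^d$: the middle term vanishes when $|x| \le R$, where $\varphi = 1$, and it is nonnegative when $|x| > R$, where $U_1 \ge 2\epsilon$. Therefore, it follows from (14) and (15) that
\[ \rho(x) \ge E\Big[ \exp\Big( 2\epsilon t + \int_0^t (U_0 + U_2)(X^*(s,x))\,ds \Big) \rho(X^*(t,x)) \Big] . \tag{16} \]
Define $Q_t : C_B(\mathbb{R}^d; \mathbb{R}) \to C_B(\mathbb{R}^d; \mathbb{R})$ by
\[ Q_t f(x) = \rho(x)^{-1} E\Big[ \exp\Big( \int_0^t (U_0 + U_2)(X^*(s,x))\,ds \Big) (\rho f)(X^*(t,x)) \Big] . \tag{17} \]
Then $Q_t \ge 0$ and
\[ Q_t 1 \le e^{-2\epsilon t} \tag{18} \]
for $t \in \mathbb{R}_+$. Put $T_s = \rho^{-1} P^*_s \rho$, $s \in \mathbb{R}_+$. We know that
\[ \sup_{\substack{s \in [s_0, s_1] \\ x \in C,\ y \in \mathbb{R}^d}} |\nabla^j_x p(s,y,x)| < \infty \tag{19} \]
for any $0 < s_0 < s_1$ and compact $C \subset \mathbb{R}^d$. [Let $G^x_s = \exp\big( \int_0^s U_0(X^*(u,x))\,du \big)$. Under [C1], for any $k \in \mathbb{N}$, $p(s,y,x)$ can be expressed as
\[ p(s,y,x) = E[G^x_s\, \delta_y(X^*(s,x))] = E[h_{k,y}(X^*(s,x))\, \Psi_k(G^x_s; X^*(s,x))] , \]
where $h_{k,y} : \mathbb{R}^d \to \mathbb{R}$ belongs to $C_B^k(\mathbb{R}^d)$ with uniformly (in $y \in \mathbb{R}^d$) bounded derivatives up to $k$-th order, and $\Psi_k(G^x_s; X^*(s,x))$ are certain functionals that are $L^p(P)$-bounded uniformly in $(s,x)$ over every compact set in $(0,\infty) \times \mathbb{R}^d$. Let $S = [s_0, s_1] \times \mathbb{R}^d \times C$. It is easy to show that $\sup_{(s,y,x) \in S} |y|^i\, |\nabla^j_y \nabla^l_x \partial^m_s p(s,y,x)| < \infty$ by using $\partial_s p(s,y,x) = L_y p(s,y,x)$.]

Consequently, for every bounded set $B \subset C_B(\mathbb{R}^d)$, $f \in T_s(B)$ are equi-continuous on each compact set, and hence $C_s := U_2 \rho^{-1} P^*_s \rho$ is compact since $U_2 \rho^{-1} \in C_K^\infty(\mathbb{R}^d)$. Moreover, $C_s(B)$ is bounded in $C_B^\infty(\mathbb{R}^d)$ equipped with seminorms $\|\nabla^j f\|_\infty$, and $\operatorname{supp} f \subset \operatorname{supp} U_2$ for all $f \in C_s(B)$. Clearly, $\|T_s\|_{op} \le \exp(s(\|U_2\|_\infty - 2\epsilon))$, and $\|C_s\|_{op} \le \|U_2\|_\infty \exp(s(\|U_2\|_\infty - 2\epsilon))$, where $\|\cdot\|_{op}$ is the operator norm on $L(C_B(\mathbb{R}^d) \to C_B(\mathbb{R}^d))$. Again with (19), we see that $C : (0,\infty) \to L(C_B(\mathbb{R}^d) \to C_B(\mathbb{R}^d))$ is continuous with respect to $\|\cdot\|_{op}$.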
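Let $B'$ be any bounded set in $C_B^2(\mathbb{R}^d)$ such that $\operatorname{supp} f \subset \operatorname{supp} U_2$ for all $f \in B'$. Then, for $f \in B'$ and $0 < s < t$,
\[ \begin{aligned}
\|T_s f - T_t f\|_\infty &= \Big\| \rho^{-1} \int (p(s,y,\cdot) - p(t,y,\cdot))\, \rho(y) f(y)\,dy \Big\|_\infty \\
&= \Big\| \rho^{-1} \int_s^t du \int \frac{\partial p}{\partial u}(u,y,\cdot)\, \rho(y) f(y)\,dy \Big\|_\infty \\
&= \Big\| \rho^{-1} \int_s^t du \int p(u,y,\cdot)\, L^*(\rho f)(y)\,dy \Big\|_\infty \\
&\le \int_s^t du\, \|T_u(L^*(\rho f)/\rho)\|_\infty \\
&\le (t - s) \exp(t\|U_2\|_\infty)\, \|L^*(\rho f)/\rho\|_\infty .
\end{aligned} \]
Therefore,
\[ \sup_{f \in B'} \|T_s f - T_t f\|_\infty \le C_{B'} \exp(t\|U_2\|_\infty)(t - s) . \tag{20} \]
For $n \in \mathbb{N}$ and $s_0, s_1, \dots, s_n \in (0,\infty)$, define $K_n(s_0, s_1, \dots, s_n)$ by
\[ K_n(s_0, s_1, \dots, s_n) = T_{s_n} C_{s_{n-1}} \cdots C_{s_0} . \]
Then $\|K_n(s_0, s_1, \dots, s_n)\|_{op} \le c_1^n e^{(n+1) c_2 \max\{s_0, s_1, \dots, s_n\}}$ for some constants $c_1, c_2$, and the continuity of $C_s$ and (20) imply that $K_n : (0,\infty)^{n+1} \to L(C_B(\mathbb{R}^d) \to C_B(\mathbb{R}^d))$ is continuous with respect to the operator norm $\|\cdot\|_{op}$. The Riemann sum approximation shows that
\[ \tilde{K}_n(s_0) := \int_0^{s_0} ds_1 \int_0^{s_1} ds_2 \cdots \int_0^{s_{n-1}} ds_n\, K_n(s_0 - s_1, s_1 - s_2, \dots, s_{n-1} - s_n, s_n) \]
is a compact operator from $C_B(\mathbb{R}^d)$ into $C_B(\mathbb{R}^d)$, and that
\[ \|\tilde{K}_n(t)\|_{op} \le \frac{c_1^n e^{(n+1) c_2 t}}{n!}\, t^n . \]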
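After all, $K_t := \sum_{n=1}^{\infty} \tilde{K}_n(t)$ is a compact operator from $C_B(\mathbb{R}^d)$ into $C_B(\mathbb{R}^d)$. From the definition of $Q_t$, we obtain $Q_t = T_t + K_t$. In fact, for $f \in C_B(\mathbb{R}^d)$,
\[ \begin{aligned}
Q_t f(x) &= T_t f(x) + \rho^{-1}(x) \sum_{n=1}^{\infty} \frac{1}{n!} E\Big[ \exp\Big( \int_0^t U_0(X^*(s,x))\,ds \Big) \Big( \int_0^t U_2(X^*(s,x))\,ds \Big)^n (\rho f)(X^*(t,x)) \Big] \\
&= T_t f(x) + \sum_{n=1}^{\infty} \int_0^t ds_1 \int_0^{s_1} ds_2 \cdots \int_0^{s_{n-1}} ds_n\, T_{s_n} C_{s_{n-1} - s_n} \cdots C_{s_1 - s_2} C_{t - s_1} f(x) \\
&= T_t f(x) + K_t f(x) .
\end{aligned} \]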
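The series defining $K_t$ converges in operator norm because the bound $c_1^n e^{(n+1)c_2 t} t^n / n!$ is summable: summing over $n \ge 1$ gives $e^{c_2 t}\big(\exp(c_1 t e^{c_2 t}) - 1\big)$. The following is a small numerical sketch of this elementary identity; the constants passed to the functions are arbitrary illustrative values, not quantities from the paper.

```python
import math


def series_bound_partial_sum(c1, c2, t, terms=60):
    """Partial sum of sum_{n>=1} c1^n * exp((n+1) * c2 * t) * t^n / n!."""
    return sum(
        c1 ** n * math.exp((n + 1) * c2 * t) * t ** n / math.factorial(n)
        for n in range(1, terms + 1)
    )


def series_bound_closed_form(c1, c2, t):
    """Closed form exp(c2 t) * (exp(c1 t exp(c2 t)) - 1) of the full series."""
    return math.exp(c2 * t) * math.expm1(c1 * t * math.exp(c2 * t))


# Arbitrary illustrative constants.
partial = series_bound_partial_sum(2.0, 0.5, 1.0)
closed = series_bound_closed_form(2.0, 0.5, 1.0)
```

The factorial in the denominator comes from integrating over the ordered simplex $0 < s_n < \cdots < s_1 < t$, whose volume is $t^n/n!$.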
Step 3. Hereafter, the operators $Q_t, T_t, K_t$ are regarded as operators on $X = C_B(\mathbb{R}^d; \mathbb{C})$. Let $T = T_1 = Q_1 - K_1 = \rho^{-1} P^*_1 \rho$, then $T$ is a bounded linear operator on $X$ and $\|T + K_1\|_{op} \le e^{-2\epsilon}$. In the same way as the proof of VIII 8.2, p. 709, of Dunford-Schwartz [5], it is possible to prove

Claim 1. $\sigma(T) \cap \{z \in \mathbb{C} ;\ |z| \ge e^{-\epsilon}\}$ is a finite set, and the dimension of the range $R(E(z;T))$ of $E(z;T)$ is finite if $z \in \sigma(T)$, $|z| \ge e^{-\epsilon}$.

Claim 2. $\sigma(T) \cap \{z \in \mathbb{C} ;\ |z| \ge 1\} = \{1\}$.

(proof) Since $T^n f = \rho^{-1} P^*_n \rho f$ for $f \in C_B(\mathbb{R}^d)$,
\[ \int_{\mathbb{R}^d} (T^n f)(x) \rho(x)\,dx = \int_{\mathbb{R}^d} P^*_n(\rho f)(x)\,dx = \int_{\mathbb{R}^d} \rho(x) f(x)\,dx . \]
Let $z \in \sigma(T) \cap \{z \in \mathbb{C} ;\ |z| \ge 1\}$. The subspace $X_z := E(z;T)X$ of $X$ is finite-dimensional (Claim 1) and $X_z$ is invariant by $T$; therefore, there exists a nonzero vector $f \in X_z$ for which $Tf = \lambda f$ for some $\lambda \in \mathbb{C}$. By using the Dunford-integral representation of $E(z;T)$, we see that for $f = E(z;T)g$,
\[ 0 = (\lambda - T)f = (\lambda - T)E(z;T)g = (\lambda - z)E(z;T)g = (\lambda - z)f , \]
and hence $\lambda = z$; after all, $Tf = zf$. Since $T(|f|) \ge |Tf| = |z||f|$,
\[ \int_{\mathbb{R}^d} T(|f|)(x) \rho(x)\,dx \ge |z| \int_{\mathbb{R}^d} |f(x)| \rho(x)\,dx = |z| \int_{\mathbb{R}^d} T(|f|)(x) \rho(x)\,dx , \tag{21} \]
and hence $|z| \le 1$. If $|z| = 1$, then $T(|f|) = |Tf|$, which implies that for some constant $c = c_f \in \mathbb{C}$, $f(x) = c|f(x)|$ for all $x \in \mathbb{R}^d$ since $\operatorname{supp} P^*_1(x, \cdot) = \mathbb{R}^d$. Then
\[ zc|f| = zf = Tf = cT(|f|) = c|Tf| = c|f| ; \]
therefore $z = 1$. Thus we obtain
\[ \sigma(T) \cap \{z \in \mathbb{C} : |z| \ge 1\} \subset \{1\} . \]
If $\sigma(T) \cap \{z \in \mathbb{C} ;\ |z| \ge 1\}$ were void, because of Claim 1, $\sigma(T) \subset \{z \in \mathbb{C} ;\ |z| \le r\}$ for some $r < 1$, and
\[ \limsup_{n \to \infty} \|T^n\|_{op}^{1/n} \le r . \]
In particular, $\|T^n 1\|_\infty \to 0$ as $n \to \infty$. On the other hand,
\[ \int_{\mathbb{R}^d} (T^n 1)(x) \rho(x)\,dx = \int_{\mathbb{R}^d} \rho(x)\,dx = 1 , \]
which is a contradiction.
Claim 3. The dimension of $X_1$ is one.

(proof) Since $X_1$ is finite dimensional, it follows from the equivalence of norms that there exists a constant $C$ such that $\|f\|_\infty \le C\|f\|_{L^1(\rho dx)}$, $f \in X_1$. Hence $\|T^n f\|_\infty \le C\|T^n |f|\|_{L^1(\rho dx)} = C\||f|\|_{L^1(\rho dx)}$ for any $f \in X_1$ and $n \in \mathbb{N}$. Thus we see that
\[ \lim_{n \to \infty} \frac{1}{n} \|T^n f\|_\infty = 0 . \]
From Theorem 3, VIII. 8, Dunford-Schwartz [5] p. 711 (or the proof of it), and Claim 2, the spectral point 1 is a simple pole. Moreover, Theorem 18, VII. 3, Dunford-Schwartz [5] p. 573, yields that $Tf = f$ for all $f \in X_1$. Clearly, for any $f \in X_1$,
\[ \|f\|_{L^1(\rho dx)} = \|T^n |f|\|_{L^1(\rho dx)} \ge \|T^n f\|_{L^1(\rho dx)} = \|f\|_{L^1(\rho dx)} , \]
and so $T^n |f| = |T^n f|$; therefore, $f = c_f |f|$ for some constant $c_f \in \mathbb{C}$. This implies that $\dim(X_1) = 1$. Indeed, for any $f, g \in X_1$ and any $x, y \in \mathbb{R}^d$, $(f(x) + ug(x))/(f(y) + ug(y))$ must be positive for any $u \in \mathbb{R}$, but this means that $f(x)g(y) - f(y)g(x) = 0$.

Step 4. Now we return to the proof of Theorem 3. In view of the last part of the proof of Claim 3, there exists $u \in C_B(\mathbb{R}^d)$ such that $u > 0$, $\int_{\mathbb{R}^d} u(x) \rho(x)\,dx = 1$ and $Tu = u$. Let $E_1 = E(1;T)$, $\tilde{X} = \{x \in X ;\ E_1 x = 0\}$, and $\tilde{T} = T|_{\tilde{X}}$; note that $\tilde{T}\tilde{X} \subset \tilde{X}$. Put $G(\zeta) = \zeta(1 - g(\zeta))$, where $g(\zeta)$ is an analytic function near $\sigma(T)$, and it equals one near 1 and zero otherwise. By the spectral mapping theorem, $\sigma(T(1 - E_1)) = G(\sigma(T))$. Since $G(1) = 0$, $G(\sigma(T)) \subset \{\zeta ;\ |\zeta| < r\}$ for some $r < 1$. From the fact that for $f \in \tilde{X}$, $\tilde{T}^n f = T^n f = (T(1 - E_1))^n f$, it holds that
\[ \limsup_{n \to \infty} \|\tilde{T}^n\|_{op}^{1/n} < r . \]
For $f, g \in C_B(\mathbb{R}^d)$,
\[ \begin{aligned}
&\Big| \int_{\mathbb{R}^d} (P_n g)(x) f(x) \rho(x)\,dx - E_1(f) \int_{\mathbb{R}^d} g(x) \rho(x) u(x)\,dx \Big| \\
&\quad = \Big| \int_{\mathbb{R}^d} (T^n f)(x) g(x) \rho(x)\,dx - E_1(f) \int_{\mathbb{R}^d} g(x) \rho(x) u(x)\,dx \Big| \\
&\quad \le \|T^n - E_1\|_{op}\, \|f\|_\infty \int_{\mathbb{R}^d} |g(x) \rho(x)|\,dx .
\end{aligned} \]
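Define a probability measure $\mu$ by $\mu(dx) = \rho(x) u(x)\,dx$. With $T^n - E_1 = T^n(1 - E_1)$, we obtain an estimate
\[ \Big| \int_{\mathbb{R}^d} (P_n g)(x) f(x) \rho(x)\,dx - E_1(f) \int_{\mathbb{R}^d} g(x)\,\mu(dx) \Big| \le C r^n \|f\|_\infty \|g\|_{L^1(\rho dx)} . \]
In particular, substituting $g = 1$ and taking the limit, we have
\[ E_1(f) = \int_{\mathbb{R}^d} f(x) \rho(x)\,dx . \]
Hence, it follows from the duality that
\[ \Big\| P_n g - \int_{\mathbb{R}^d} g(x)\,\mu(dx) \Big\|_{L^1(\rho dx)} \le C r^n \|g\|_{L^1(\rho dx)} \tag{22} \]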
for $g \in C_B(\mathbb{R}^d)$. For any $t \in \mathbb{R}_+$, $T_t u = u$. [In the same argument, for every positive irrational number $\alpha$, there exists a $u_\alpha \in C_B(\mathbb{R}^d)$ such that $T_\alpha u_\alpha = u_\alpha$. It is easy to show that $u_\alpha = u$ by using (22) and a similar inequality and the continuity of the semigroup $\{T_t\}$. Then $u_t = u$ also follows from the continuity.] In particular, $u$ is smooth. Finally, the semigroup property $P_t = P_{[t]} P_{t-[t]}$ completes the proof of Theorem 3. □

Proof of Theorem 4. Condition [A3'] follows from Condition [C3] by the same argument as Example 2'; therefore, the assertion is just a corollary of Theorem 2 under Condition [A1]. We shall verify [A1] under Conditions [C1] and [C2] together with the stationarity. First, note that Theorem 3 (3) holds for any bounded measurable function $f$. Let $f \in B\mathcal{B}^X_{[t,\infty)}$. (The symbol '$X$' here takes the place of '$Y$' in the previous sections.) Theorem 3 (2) says that $c := \sup_{x \in \mathbb{R}^d} \rho(x)^{-1}\, d\mu(x)/dx < \infty$. For the $\mathcal{B}^X_{[t]}$-measurable function $P_{\mathcal{B}^X_{[t]}}[f]$, there exists a Borel measurable function $H_t : \mathbb{R}^d \to \mathbb{R}$ such that $P_{\mathcal{B}^X_{[t]}}[f] = H_t(X_t)$ $P$-a.s. and $\|H_t\|_\infty \le \|f\|_\infty$. By using the stationarity of $X_t$ and the Markov property $P_{\mathcal{B}^X_{[t]}}[f] = P_{\mathcal{B}^X_{[0,t]}}[f]$ $P$-a.s., we see that
\[ \begin{aligned}
\big\| P_{\mathcal{B}^X_{[s]}}[f] - P[f] \big\|_{L^1(P)} &= \big\| P_{\mathcal{B}^X_{[s]}}[H_t(X_t)] - P[H_t(X_t)] \big\|_{L^1(P)} \\
&= \| P_{t-s} H_t(X_s) - P[H_t(X_t)] \|_{L^1(P)} \\
&= \int_{\mathbb{R}^d} |P_{t-s} H_t(x) - \mu[H_t]|\,d\mu(x) \\
&\le c\, \| P_{t-s} H_t - \mu[H_t] \|_{L^1(\rho dx)} \\
&\le c C e^{-\lambda(t-s)} \|H_t\|_{L^1(\rho dx)} \\
&\le c C e^{-\lambda(t-s)} \|f\|_\infty .
\end{aligned} \]
This completes the proof. □
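The geometric decay in (22) and in the last display has a transparent finite-state analogue. The following is a minimal sketch, assuming a hypothetical two-state Markov chain in place of the diffusion: for such a chain the distance of $P^n g$ from its stationary mean, measured in $L^1$ of the stationary law, contracts by exactly the modulus of the second eigenvalue at each step. The transition probabilities, the test function and all names are illustrative assumptions.

```python
def mat_vec(P, g):
    """Apply the transition matrix P to the function g (a list of values)."""
    return [sum(P[i][k] * g[k] for k in range(len(g))) for i in range(len(P))]


# Hypothetical two-state chain: jump 0 -> 1 with prob a, 1 -> 0 with prob b.
a, b = 0.3, 0.2
P = [[1 - a, a], [b, 1 - b]]
mu = [b / (a + b), a / (a + b)]   # stationary distribution
lam = 1 - a - b                    # second eigenvalue of P


def l1_distance_to_equilibrium(g, n):
    """|| P^n g - mu(g) ||_{L^1(mu)} for the two-state chain above."""
    mean = sum(m * x for m, x in zip(mu, g))
    h = list(g)
    for _ in range(n):
        h = mat_vec(P, h)
    return sum(m * abs(x - mean) for m, x in zip(mu, h))


g = [1.0, -2.0]
d0 = l1_distance_to_equilibrium(g, 0)
d5 = l1_distance_to_equilibrium(g, 5)
```

Here `d5` equals `abs(lam) ** 5 * d0` up to rounding, which is the discrete counterpart of the bound $C r^n \|g\|_{L^1(\rho dx)}$; in the diffusion setting the rate $r$ comes from the spectral gap established in Steps 3 and 4 rather than from an explicit eigenvalue.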
For convenience of reference, we will give:

Proof of Theorem 5. For $z = (z^{(0)}, z^{(1)})$, define $S_T(z)$ by
\[ S_T(z) = z^{(0)} + \sum_{j=1}^{k} T^{-j/2} Q_j(z^{(0)}, z^{(1)}) . \]
Let $M_T = \{z \in \mathbb{R}^d \mid |z| < T^\alpha\}$. Take $\alpha > 0$ sufficiently small so that for some constant $C$,
\[ T^{-j/2} |Q_j^{(i)}|(T^\alpha, T^\alpha) \le C T^{-\alpha} \]
for $j = 1, \dots, k$, $T > 1$, where $|Q_j^{(i)}|$ denotes the polynomial with coefficients of $Q_j^{(i)}$ replaced by their absolute values. The Bhattacharya-Ghosh map ([1]) is defined by
\[ y = \begin{pmatrix} y^{(0)} \\ y^{(1)} \end{pmatrix} = \begin{pmatrix} S_T(z) \\ z^{(1)} \end{pmatrix} . \]
Let $f \in \mathcal{E}(M, \gamma)$. Applying Theorem 2 to $f \circ S_T(\cdot) 1_{\{\cdot \in M_T\}}$ and using Condition [A2], we see that, with $d\Psi_{T,k}(z) = p_{k,T}\,dz$,
\[ P[f(S_T)] = \int dz\, f(S_T(z)) 1_{\{z \in M_T\}}\, p_{k,T}(z) + o(T^{-(k+\delta)/2}) + O(\omega(1_{M_T} f \circ S_T, T^{-K})) , \tag{23} \]
where $\delta$ is a positive constant and the small $o$-term depends on $\mathcal{E}(M, \gamma)$. By definition, there exists a constant $C_1$ such that $z \in M_T$ implies $|y| \le C_1 T^\alpha$. From the non-degeneracy of the Jacobian, it is easy to see that the mapping $z \to y$ is one-to-one on $M_T$. Consequently, the first term on the right-hand side of (23) is equal to
\[ \int dy\, f(y^{(0)}) 1_{\{z(y) \in M_T,\ |y| \le C_1 T^\alpha\}} \Big| \frac{\partial z}{\partial y}(y) \Big|\, p_{k,T}(z(y)) . \tag{24} \]
Put $A_T(z) = y(z) - z$, and let $z_1^* = y - A_T(y)$, $z_2^* = y - A_T(y - A_T(y))$, $z_3^* = y - A_T(y - A_T(y - A_T(y)))$, ... It is then easy to obtain
\[ |z_j^* - z(y)| \le T^{-(j+1)/2} \times (\text{a polynomial of } |z(y)|) \]
and similar estimates for the gradients. Expanding $|\partial z_k^*/\partial y(y)|\, p_{k,T}(z_k^*(y))$, we have
\[ \Big| \frac{\partial z_k^*}{\partial y}(y) \Big|\, p_{k,T}(z_k^*(y)) = \phi(y; 0, \Sigma_T^Z) \Big( 1 + \sum_{i=1}^{k} T^{-i/2} q^*_{i,k,T}(y) \Big) + T^{-(k+1)/2} R_{k,T}(y) , \]
where $\Sigma_T^Z = \operatorname{Cov}(\bar{Z}_T)$, $q^*_{i,k,T}$ are smooth functions of at most polynomial growth order and
\[ |R_{k,T}(y)| \le e^{-c_0 |y|^2} \times (\text{a polynomial of } |y|) \]
for some positive constant $c_0$. Since for small $c_1 > 0$, $z(y) \in M_T$ if $|y| < c_1 T^\alpha$, it follows from (24) that, taking $\delta$ sufficiently small if necessary,
\[ \begin{aligned}
P[f(S_T)] &= \int dy\, f(y^{(0)})\, \phi(y; 0, \Sigma_T^Z) \Big( 1 + \sum_{i=1}^{k} T^{-i/2} q^*_{i,k,T}(y) \Big) + o(T^{-(k+\delta)/2}) + O(\omega(1_{M_T} f \circ S_T, T^{-K})) \\
&= \int dy^{(0)}\, f(y^{(0)}) \sum_{i=0}^{k} T^{-i/2} q_{i,k,T}(y^{(0)}) + o(T^{-(k+\delta)/2}) + O(\omega(1_{M_T} f \circ S_T, T^{-K})) ,
\end{aligned} \]
where $q_{i,k,T}$ are given by
\[ \sum_{i=0}^{k} T^{-i/2} q_{i,k,T}(y^{(0)}) = \int dy^{(1)}\, \phi(y; 0, \Sigma_T^Z) \Big( 1 + \sum_{i=1}^{k} T^{-i/2} q^*_{i,k,T}(y) \Big) . \]
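The approximate inverses $z_1^*, z_2^*, \dots$ used in this proof are the iterates of the fixed-point scheme $z^*_{j+1} = y - A_T(z^*_j)$ started at $z^*_0 = y$. The following is a minimal sketch of that scheme in two coordinates, assuming the hypothetical choice $k = 1$ and $Q_1(z^{(0)}, z^{(1)}) = (z^{(0)})^2 + z^{(1)}$; this polynomial and the numerical values are made up for illustration and do not come from the paper.

```python
def forward_map(z0, z1, T):
    """y = (S_T(z), z1) for the hypothetical choice Q_1(z0, z1) = z0**2 + z1."""
    return z0 + T ** -0.5 * (z0 ** 2 + z1), z1


def approximate_inverse(y0, y1, T, steps):
    """Iterate z*_{j+1} = y - A_T(z*_j) starting from z*_0 = y.

    Only the first coordinate changes, because A_T vanishes on the second
    coordinate. Returns the list of first-coordinate iterates z*_0, ..., z*_steps.
    """
    w = y0
    iterates = [w]
    for _ in range(steps):
        w = y0 - T ** -0.5 * (w ** 2 + y1)
        iterates.append(w)
    return iterates


T = 1.0e4
z0, z1 = 0.5, -0.3
y0, y1 = forward_map(z0, z1, T)
errors = [abs(w - z0) for w in approximate_inverse(y0, y1, T, steps=4)]
```

In this example each iteration reduces the error by roughly a factor of order $T^{-1/2}$, which matches the shape of the bound $|z_j^* - z(y)| \le T^{-(j+1)/2} \times (\text{a polynomial of } |z(y)|)$.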
This completes the proof since one can estimate the last term on the right-hand side with $\omega(1_{M_T}, T^{-K})$ and $\omega(f, T^{-K})$, first taking a different $\Sigma^{o}$ if necessary. □

References
1. Bhattacharya, R.N., Ghosh, J.K.: On the validity of the formal Edgeworth expansion. Ann. Statist. 6, 434–451 (1978)
2. Bichteler, K., Gravereaux, J.-B., Jacod, J.: Malliavin calculus for processes with jumps. New York London Paris Montreux Tokyo: Gordon and Breach Science Publishers (1987)
3. Brockwell, P.J., Davis, R.A.: Time Series: theory and methods. Second Ed. New York Berlin Heidelberg: Springer (1991)
4. Doukhan, P.: Mixing: properties and examples. Lect. Notes in Statist. 85, Springer (1995)
5. Dunford, N., Schwartz, J.T.: Linear operators. Part I: General theory. New York London: Wiley (1964)
6. Ghosh, J.K.: Higher order asymptotics. California: IMS (1994)
7. Götze, F., Hipp, C.: Asymptotic expansions for sums of weakly dependent random vectors. Z. Wahr. 64, 211–239 (1983)
8. Götze, F., Hipp, C.: Asymptotic distribution of statistics in time series. Ann. Statist. 22, 2062–2088 (1994)
9. Jensen, J.L.: Asymptotic expansions for sums of dependent variables. Memoirs, 10. Aarhus University, Institute of Mathematics, Department of Theoretical Statistics, Aarhus, 1986
10. Kusuoka, S., Stroock, D.W.: Application of the Malliavin calculus I. In: K. Itô (ed.), Stochastic Analysis, Proc. Taniguchi Inter. Symp. on Stochastic Analysis, Katata and Kyoto 1982. Kinokuniya/North-Holland, Tokyo, 271–306 (1984)
11. Roberts, G.O., Tweedie, R.L.: Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli 2(4), 341–363 (1996)
12. Sakamoto, Y., Yoshida, N.: Third order asymptotic expansions for diffusion processes. Cooperative Research Report 107, 53–60, The Institute of Statistical Mathematics, Tokyo (1998)
13. Sakamoto, Y., Yoshida, N.: Higher order asymptotic expansions for a functional of a mixing process and applications to diffusion functionals. In preparation (1998)
14. Stroock, D.W.: Probability theory, an analytic view. Cambridge (1994)
15. Yoshida, N.: Asymptotic expansion for small diffusions via the theory of Malliavin-Watanabe. Prob. Theory Related Fields 92, 275–311 (1992)
16. Yoshida, N.: Malliavin calculus and asymptotic expansion for martingales. Probab. Theory Relat. Fields 109, 301–342 (1997)
17. Yoshida, N.: Edgeworth expansion for diffusions with jumps. In preparation (1999)