Comput Econ (2010) 35:63–83 DOI 10.1007/s10614-009-9179-1
A Student-t Full Factor Multivariate GARCH Model K. Diamantopoulos · I. D. Vrontos
Accepted: 23 April 2009 / Published online: 10 May 2009 © Springer Science+Business Media, LLC. 2009
K. Diamantopoulos: Quantos S.A, 154 Sygrou Ave, 176 71 Athens, Greece
I. D. Vrontos (corresponding author): Department of Statistics, Athens University of Economics and Business, Pattision 76, 134 76 Athens, Greece, e-mail: [email protected]

Abstract We extend the full-factor multivariate GARCH model of Vrontos et al. (Econom J 6:312–334, 2003a) to account for fat tails in the conditional distribution of financial returns, using a multivariate Student-t error distribution. For the new class of Student-t full factor multivariate GARCH models, we derive analytical expressions for the score, the Hessian matrix and the Information matrix. These expressions can be used within classical inferential procedures in order to obtain maximum likelihood estimates for the model parameters. This fact, combined with the parsimonious parameterization of the covariance matrix under the full factor multivariate GARCH models, enables us to apply the models in high dimensional problems. We provide implementation details and illustrations using financial time series on eight stocks of the US market.

Keywords Autoregressive conditional heteroscedasticity · Fat tails · Maximum likelihood estimation · Student-t distribution

1 Introduction

Time varying volatility models have received a lot of attention over the last decades. A large number of univariate models have been proposed in the literature, based on and extending the Autoregressive Conditional Heteroscedastic (ARCH) model of Engle (1982); see, for example, the review article of Bollerslev et al. (1992). The reason
for the wide development and application of univariate GARCH-type models is that they are able to capture several stylized facts often met in financial series, such as time varying volatility, large kurtosis, heavy tailed distributions, non-linearities and extreme events. On the other hand, the development of multivariate GARCH-type models is very important in order to study the relations between the volatilities and co-volatilities of financial time series. Modelling and predicting the time varying dynamics of conditional covariances of asset returns is crucial in financial applications such as asset pricing, portfolio allocation and risk management. However, the application of multivariate models in practice is not an easy task. Firstly, the number of parameters to be estimated becomes very large as the dimension of the time series increases. Furthermore, there are positive definiteness restrictions on the covariance matrix which become more complicated (difficult to satisfy) as the dimensionality of the problem increases. Different specifications of multivariate GARCH models have been proposed (see for details, the review article of Bauwens et al. 2006) to deal with these problems, which impose different restrictions on the conditional covariance structure. For example, recently, Engle (2002) and Tse and Tsui (2002) introduced Dynamic Conditional Correlation models that generalize the Constant Conditional Correlation model of Bollerslev (1990), where the conditional correlations are allowed to change over time. Alexander (2001) proposed the Orthogonal GARCH model, where the time varying covariance matrix is generated by a small number of uncorrelated factors, van der Weide (2002), and Lanne and Saikkonen (2007) introduced generalizations of the Orthogonal GARCH model, while Vrontos et al. (2003a) proposed a full-factor multivariate GARCH model. 
The choice of the multivariate model is usually determined by the stylized facts (met in the time series under consideration) that need to be captured, and by practical considerations, such as ease of estimation. Estimation and inference in multivariate volatility models remains a challenging computational problem. The multivariate Normal distribution is often considered for the error process, and numerical techniques are usually used to approximate the derivatives of the log-likelihood function (score) with respect to the parameters, which are often unstable and become even more unstable when they are used to compute the Hessian; see for example, McCullough and Vinod (1999), McCullough and Renfro (2000), and Brooks et al. (2001). Fiorentini et al. (1996) provide analytical expressions for the score and the Hessian of univariate GARCH models under conditional Normality, and demonstrate that this improves the numerical accuracy of the estimates while speeding-up estimation. Similar is the conclusion of Laurent (2004) who derived analytical expressions for the score of the APARCH model. Lucchetti (2002) provides analytical results for the score of a multivariate BEKK model and Vrontos et al. (2003a) propose closed form expressions for the score and the Hessian of the full-factor multivariate GARCH model under conditional normality. Although the multivariate Normal distribution is the most commonly used in applications, there is empirical evidence that the distribution of financial time series has usually fat tails, even after taking into account the volatility clustering phenomenon. In other words, the normality assumption of standardised residuals of estimated volatility models is usually rejected in most financial applications. Consistent estimates may be obtained by using quasi maximum likelihood (Bollerslev and Wooldridge 1992)
even if the data generating process is not conditionally normal; this is also true for multivariate GARCH models (Jeantheau 1998; Ling and McAleer 2003; Compte and Lieberman 2003). Hafner and Herwartz (2003) provide analytical expressions for the score and the Hessian of a general multivariate GARCH model in a quasi maximum likelihood framework and show that analytical calculation of derivatives outperform numerical methods, especially in the case of a multivariate Student-t data generating process. Another alternative approach is to specify a distribution that accounts for fat tails; for example, Bollerslev (1987) used a Student-t distribution together with a GARCH specification, and Nelson (1991) employed a Generalized Error distribution and an exponential GARCH model for the analysis of the conditional variances of univariate series. Fiorentini et al. (2003) provided analytical expressions for the score of conditionally heteroskedastic dynamic regression models under the assumption of a multivariate Student-t distribution for the error process. In this study, we consider the problem of estimation and inference in full-factor multivariate GARCH models. We extend the model of Vrontos et al. (2003a) and advocate the use of a multivariate Student-t distribution to account for fat tails, and a GARCH specification to account for time-varying conditional variances. The model can be applied easily to high dimensional financial time series, since the covariance matrix is always positive definite by construction and the number of parameters is relatively small. The estimation of the parameters of the multivariate model is achieved by using classical techniques. We provide analytical expressions for the score, the Hessian and the Information matrix of the log-likelihood function for this class of full-factor multivariate GARCH models under the assumption of a Student-t distribution for the error process. 
The use of analytical derivatives in the estimation procedure should considerably improve the numerical accuracy of the resulting estimates.

The remainder of the paper is organised as follows. The multivariate Student-t full-factor GARCH model is introduced in Sect. 2. Classical inference is presented in Sect. 3. In Sect. 4, we illustrate the above methods on a dataset of US stocks, and we conclude in Sect. 5 with a brief discussion.

2 The Student-t Full-Factor Multivariate GARCH Model

In the general full-factor multivariate GARCH model with time varying variances and covariances, the $N \times 1$ vector of observations $y_t$ is typically assumed to be generated by the following set of equations:

\[
y_t = \mu + \varepsilon_t, \qquad \varepsilon_t = W X_t, \qquad X_t \mid \Phi_{t-1} \sim D_N(0, \Sigma_t), \tag{1}
\]

where $\mu$ is an $N \times 1$ vector of constants, $\varepsilon_t$ is an $N \times 1$ innovation vector, $W$ is an $N \times N$ parameter matrix, $\Phi_{t-1}$ is the information set up to time $t-1$, $X_t$ is an $N \times 1$ vector of factors with elements $x_{i,t}$, $i = 1, \ldots, N$, $D_N$ is a multivariate distribution and $\Sigma_t$ is an $N \times N$ diagonal variance-covariance matrix, which is given by $\Sigma_t = \mathrm{diag}(\sigma_{1,t}^2, \ldots, \sigma_{N,t}^2)$, while the variances follow a GARCH process

\[
\sigma_{i,t}^2 = \alpha_i + b_i x_{i,t-1}^2 + g_i \sigma_{i,t-1}^2, \qquad i = 1, \ldots, N, \quad t = 1, \ldots, T,
\]

where $\sigma_{i,t}^2$, $i = 1, \ldots, N$, is the variance of the ith factor at time t, $\alpha_i > 0$, $b_i \geq 0$, $g_i \geq 0$, $i = 1, \ldots, N$. According to the above model, the vector $\varepsilon_t$ is a linear combination of the factors $x_{i,t}$, $i = 1, \ldots, N$. Under the assumption of a conditional multivariate Normal distribution for the vector $X_t$, i.e., $X_t \mid \Phi_{t-1} \sim N_N(0, \Sigma_t)$, the likelihood for model (1) for a sample of T observations $y = (y_1, y_2, \ldots, y_T)$ can be written as

\[
l(y \mid \theta) = (2\pi)^{-\frac{TN}{2}} \prod_{t=1}^{T} |H_t|^{-1/2} \exp\left\{ -\frac{1}{2} (y_t - \mu)' H_t^{-1} (y_t - \mu) \right\},
\]

while under the assumption of a conditional multivariate Student-t distribution with v degrees of freedom, the likelihood can be written as

\[
l(y \mid \theta) = \left[ \frac{\Gamma\left(\frac{N+v}{2}\right)}{\Gamma\left(\frac{v}{2}\right) [\pi (v-2)]^{N/2}} \right]^{T} \prod_{t=1}^{T} |H_t|^{-1/2} \left[ 1 + \frac{1}{v-2} (y_t - \mu)' H_t^{-1} (y_t - \mu) \right]^{-\frac{N+v}{2}},
\]

where $\Gamma(\cdot)$ is the gamma function. Note that the above specification of the multivariate Student-t distribution is such that the error process $\varepsilon_t$ has covariance matrix $H_t$, which can be written in the form

\[
H_t = W \Sigma_t W' = W \Sigma_t^{1/2} \left( W \Sigma_t^{1/2} \right)' = L L'.
\]

We take W triangular with elements $w_{ij} = 0$ for $j > i$ and $w_{ii} = 1$ for $i = 1, \ldots, N$. The conditional covariance matrix $H_t$ is always positive definite if the factor variances $\sigma_{i,t}^2$, $i = 1, \ldots, N$, are well defined. Note also that the factors $x_{i,t}$, $i = 1, \ldots, N$, have zero idiosyncratic variances, so they are not parameters to be estimated but are given by $X_t = W^{-1} \varepsilon_t$. For $N = 1$ the model therefore reduces to the GARCH(1,1) model.

In the following, we will also assume for convenience that $x_{i,0}$ and $\sigma_{i,0}^2$ are known, and that $b_i = b$ and $g_i = g$ for $i = 1, \ldots, N$. These assumptions can be relaxed without adding much difficulty in the estimation process. For example, Vrontos et al. (2000) discuss Bayesian estimation of GARCH and EGARCH models when the variance at time zero is unknown. The number of parameters to be estimated for an N-dimensional problem is $2N + 2 + \frac{N(N-1)}{2}$ and $2N + 3 + \frac{N(N-1)}{2}$ for a multivariate Normal and Student-t distribution, respectively, using a GARCH(1,1) specification. For example, in the case of the multivariate Student-t model, the parameter vector to be estimated is

\[
\theta = (\mu_1, \mu_2, \ldots, \mu_N, \alpha_1, \alpha_2, \ldots, \alpha_N, b, g, w_{21}, w_{31}, w_{32}, \ldots, w_{N1}, \ldots, w_{N,N-1}, v)'.
\]
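As an illustration of the construction above, the following minimal sketch (standard-library Python, not the authors' code) builds $H_t = W \Sigma_t W'$ from a unit lower-triangular $W$ and positive factor variances, confirms positive definiteness through a Cholesky factorization, and counts the parameters. The function names `n_params`, `conditional_covariance` and `cholesky`, and the numerical values of `W` and `sigma2`, are hypothetical choices made only for this example.

```python
import math

def n_params(n, student_t=True):
    """Parameter count 2N + 2 + N(N-1)/2, plus one for v under Student-t errors."""
    return 2 * n + 2 + n * (n - 1) // 2 + (1 if student_t else 0)

def conditional_covariance(W, sigma2):
    """H_t = W diag(sigma2) W' for a unit lower-triangular W."""
    n = len(W)
    return [[sum(W[i][k] * sigma2[k] * W[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]

def cholesky(H):
    """Lower-triangular L with H = L L'; raises if H is not positive definite."""
    n = len(H)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = H[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

# Illustrative (made-up) values: N = 3 factors.
W = [[1.0, 0.0, 0.0],
     [0.5, 1.0, 0.0],
     [-0.3, 0.8, 1.0]]
sigma2 = [0.4, 1.2, 0.7]

H = conditional_covariance(W, sigma2)
L = cholesky(H)  # succeeds because every factor variance is positive
```

Because $W \Sigma_t^{1/2}$ is lower triangular with a positive diagonal, the Cholesky factor returned here coincides with it, which is the decomposition $H_t = LL'$ used in the text.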
3 Estimation and Inference

Maximum likelihood estimates for heteroscedastic models are usually obtained by using numerical optimization algorithms such as the scoring algorithm, the method proposed by Mak (1993) and developed further by Mak et al. (1997), and the algorithm of Berndt et al. (1974). Here, we compute the maximum likelihood estimates by using a quasi-Newton method, the Broyden, Fletcher, Goldfarb and Shanno (BFGS hereafter) algorithm; see, for example, Goldfarb (1970) and Shanno (1970). The kth iteration of the algorithm takes the form

\[
\hat{\theta}^{k} = \hat{\theta}^{k-1} - \lambda Q^{-1} \frac{\partial L_T}{\partial \theta}, \tag{2}
\]

where $\hat{\theta}^{k-1}$ is the estimate of the parameter vector obtained after $k-1$ iterations, $L_T$ is the log-likelihood function, $Q$ is some approximation of the Hessian computed at $\hat{\theta}^{k-1}$, which determines the direction of the kth iteration, $\lambda$ is a scalar which determines the step-size in the direction given by $Q$, and can be found by performing a line search, and $\frac{\partial L_T}{\partial \theta}$ is the gradient computed at $\hat{\theta}^{k-1}$. One of the great advantages of our model is the fact that the gradient and the Hessian matrix in (2) are available in closed form.

The motivation for using this variant of the Newton algorithm in our full-factor multivariate GARCH model comes from the experimental results of researchers working with GARCH models. For example, Fiorentini et al. (1996) computed the analytic conditional expected information matrix of the parameters for the GARCH model and constructed a mixed-gradient algorithm in order to accelerate the convergence of the model parameters. According to their results, the superiority of gradient algorithms, which use the estimated information matrix, is clear in early iterations. Vrontos et al. (2003a) proposed a Fisher scoring algorithm using analytic derivatives for the log-likelihood function of a multivariate Normal distribution using a GARCH specification. The BFGS algorithm has been used in the GARCH-context literature as an effective alternative to the above estimation schemes; see, for example, Lucchetti (2002), Vrontos et al. (2003b), and Kawakatsu (2006) among others. For the multivariate Student-t full-factor GARCH model, we adopt a mixed-gradient approach. First, we use the BFGS algorithm until convergence is achieved. Then, we propose using the Newton–Raphson algorithm to refine the parameter estimates and compute their standard errors.
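To make the quasi-Newton iteration concrete, here is a minimal, self-contained sketch of a BFGS maximizer with a backtracking line search. It is not the authors' implementation: the toy objective (an i.i.d. Normal log-likelihood in the mean and the log-variance, mirroring the logarithmic transformation used for positive parameters), the function names and the tolerances are all assumptions made for illustration. The matrix `Hinv` plays the role of $-Q^{-1}$ in the iteration above.

```python
import math

def loglik_and_grad(theta, y):
    """Toy objective: i.i.d. Normal log-likelihood with theta = (mean, log variance)."""
    m, s = theta
    n = len(y)
    ss = sum((yi - m) ** 2 for yi in y)
    ll = -0.5 * n * math.log(2 * math.pi) - 0.5 * n * s - 0.5 * math.exp(-s) * ss
    grad = [math.exp(-s) * sum(yi - m for yi in y),
            -0.5 * n + 0.5 * math.exp(-s) * ss]
    return ll, grad

def bfgs_maximize(fun, theta, data, tol=1e-7, max_iter=200):
    """Maximize fun by BFGS; Hinv approximates the inverse of minus the Hessian."""
    k = len(theta)
    Hinv = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    ll, g = fun(theta, data)
    for _ in range(max_iter):
        if max(abs(gi) for gi in g) < tol:
            break
        d = [sum(Hinv[i][j] * g[j] for j in range(k)) for i in range(k)]  # ascent direction
        slope = sum(di * gi for di, gi in zip(d, g))
        lam = 1.0
        while True:  # backtracking line search on the step size lambda
            new = [t + lam * di for t, di in zip(theta, d)]
            ll_new, g_new = fun(new, data)
            if ll_new >= ll + 1e-4 * lam * slope or lam < 1e-12:
                break
            lam *= 0.5
        s_vec = [a - b for a, b in zip(new, theta)]
        y_vec = [go - gn for go, gn in zip(g, g_new)]  # gradient change of -L
        sy = sum(a * b for a, b in zip(s_vec, y_vec))
        if sy > 1e-12:  # update only when the curvature condition holds
            rho = 1.0 / sy
            Hy = [sum(Hinv[i][j] * y_vec[j] for j in range(k)) for i in range(k)]
            yHy = sum(a * b for a, b in zip(y_vec, Hy))
            for i in range(k):
                for j in range(k):
                    Hinv[i][j] += (-rho * (s_vec[i] * Hy[j] + Hy[i] * s_vec[j])
                                   + rho * (1.0 + rho * yHy) * s_vec[i] * s_vec[j])
        theta, ll, g = new, ll_new, g_new
    return theta

sample = [1.2, 0.7, 1.9, 1.4, 0.3, 1.1]  # made-up data
m_hat, s_hat = bfgs_maximize(loglik_and_grad, [0.0, 0.0], sample)
```

In the setting of the paper the gradient would be the analytical score derived below rather than the toy gradient used here, and a Newton–Raphson refinement with the analytical Hessian would follow.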
The use of the BFGS algorithm helps ensure the robustness of the optimization procedure, as it can avert possible problems, such as singularity of the Q matrix or failure to converge, when the initial parameter values are far from the maximum likelihood estimates. For the full-factor multivariate GARCH model (1) the log-likelihood function, under the multivariate normal distribution, is given by

\[
L_T(y \mid \theta) = -\frac{TN}{2} \ln(2\pi) - \frac{1}{2} \sum_{t=1}^{T} \sum_{i=1}^{N} \ln \sigma_{i,t}^2 - \frac{1}{2} \sum_{t=1}^{T} \sum_{i=1}^{N} \frac{x_{i,t}^2}{\sigma_{i,t}^2},
\]
and under the multivariate Student-t distribution, by

\[
L_T(y \mid \theta) = T \ln \Gamma\left(\frac{N+v}{2}\right) - T \ln \Gamma\left(\frac{v}{2}\right) - \frac{TN}{2} \ln[\pi (v-2)] - \frac{1}{2} \sum_{t=1}^{T} \sum_{i=1}^{N} \ln \sigma_{i,t}^2 - \frac{N+v}{2} \sum_{t=1}^{T} \ln\left( 1 + \frac{1}{v-2} \sum_{i=1}^{N} \frac{x_{i,t}^2}{\sigma_{i,t}^2} \right),
\]

where $x_{i,t}$, $i = 1, \ldots, N$, are the elements of $X_t = W^{-1} \varepsilon_t$. In order to avoid the positivity restrictions for the parameters, $\alpha_i > 0$, $i = 1, \ldots, N$, $b \geq 0$, $g \geq 0$, we use the logarithmic transformation, so that $\alpha_i^* = \ln(\alpha_i)$, $b^* = \ln(b)$ and $g^* = \ln(g)$. We also divide the parameter vector into three blocks for the multivariate Normal distribution and into four blocks for the multivariate Student-t distribution. The first block contains the parameters of the mean equation, that is, $\theta_1 = (\mu_1, \mu_2, \ldots, \mu_N)'$, the second block contains the transformed parameters of the variance equation, that is, $\theta_2 = (\alpha_1^*, \alpha_2^*, \ldots, \alpha_N^*, b^*, g^*)'$, the third block contains the parameters in matrix W, that is, $\theta_3 = (w_{21}, w_{31}, w_{32}, \ldots, w_{N1}, \ldots, w_{N,N-1})'$, and the fourth block contains the degrees of freedom, $\theta_4 = (v)$. For each of these blocks, we provide analytical expressions for the score and the Hessian matrix, which help to improve the numerical accuracy of the parameter estimates. After the transformation of the positive parameters, the variances $\sigma_{i,t}^2$ of the factors $x_{i,t}$, $i = 1, \ldots, N$, are given by

\[
\sigma_{i,t}^2 = e^{\alpha_i^*} + e^{b^*} x_{i,t-1}^2 + e^{g^*} \sigma_{i,t-1}^2.
\]
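The two log-likelihoods and the transformed variance recursion can be evaluated with a few lines of code. The sketch below is only illustrative: the function names `factors` and `loglik`, the default starting values `x0` and `s0` (zeros, which is an assumption and not the paper's exact initialization of the squared factors) and the data are hypothetical. Passing `v=None` selects the Normal log-likelihood; any `v > 2` selects the Student-t one.

```python
import math

def factors(W, eps):
    """Solve W x = eps by forward substitution (W is unit lower triangular)."""
    n = len(eps)
    x = [0.0] * n
    for i in range(n):
        x[i] = eps[i] - sum(W[i][j] * x[j] for j in range(i))
    return x

def loglik(y, mu, a_star, b_star, g_star, W, v=None, x0=None, s0=None):
    """Log-likelihood of the full-factor GARCH model in terms of the factors.

    a_star, b_star, g_star are the log-transformed variance parameters;
    x0 and s0 are assumed starting values for the factors and their variances.
    """
    T, N = len(y), len(mu)
    x_prev = x0 or [0.0] * N
    s_prev = s0 or [0.0] * N
    if v is None:
        ll = -0.5 * T * N * math.log(2 * math.pi)
    else:
        ll = (T * (math.lgamma(0.5 * (N + v)) - math.lgamma(0.5 * v))
              - 0.5 * T * N * math.log(math.pi * (v - 2)))
    for t in range(T):
        x = factors(W, [y[t][i] - mu[i] for i in range(N)])
        s = [math.exp(a_star[i]) + math.exp(b_star) * x_prev[i] ** 2
             + math.exp(g_star) * s_prev[i] for i in range(N)]
        q = sum(x[i] ** 2 / s[i] for i in range(N))  # sum_i x_it^2 / sigma_it^2
        ll -= 0.5 * sum(math.log(si) for si in s)
        ll -= 0.5 * q if v is None else 0.5 * (N + v) * math.log1p(q / (v - 2))
        x_prev, s_prev = x, s
    return ll

# Made-up bivariate example.
y = [[0.1, -0.2], [0.3, 0.1], [-0.4, 0.2], [0.05, -0.1]]
mu = [0.0, 0.0]
W = [[1.0, 0.0], [0.5, 1.0]]
a_star, b_star, g_star = [math.log(0.05), math.log(0.08)], math.log(0.1), math.log(0.8)

ll_normal = loglik(y, mu, a_star, b_star, g_star, W)
ll_t = loglik(y, mu, a_star, b_star, g_star, W, v=6.4)
```

A useful sanity check on such an implementation is that the Student-t value approaches the Normal one as the degrees of freedom grow.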
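Some assumptions are also required for the initial values of the variances $\sigma_{i,t}^2$ and the squared factors $x_{i,t}^2$, as the variance equation is dynamic. For $t = 0$, $\sigma_{i,0}^2 = 0$, while the $x_{i,0}^2$ are calculated by using a sufficient number of observations from the sample.

In the sequel, we present the calculation of the analytic derivatives for the log-likelihood function of the multivariate Student-t distribution using a GARCH specification for the time-varying conditional variances. Differentiating with respect to the mean parameters $\theta_1 = (\mu_1, \mu_2, \ldots, \mu_N)'$ yields

\[
\frac{\partial L_T}{\partial \theta_1} = -\frac{1}{2} \sum_{t=1}^{T} \sum_{i=1}^{N} \frac{1}{\sigma_{i,t}^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_1} - \frac{N+v}{2} \sum_{t=1}^{T} \frac{B_1}{A},
\]

where

\[
A = 1 + \frac{1}{v-2} \sum_{i=1}^{N} \frac{x_{i,t}^2}{\sigma_{i,t}^2}
\]

and

\[
B_1 = \frac{1}{v-2} \sum_{i=1}^{N} \left[ \frac{2 x_{i,t}}{\sigma_{i,t}^2} \frac{\partial x_{i,t}}{\partial \theta_1} - \frac{x_{i,t}^2}{(\sigma_{i,t}^2)^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_1} \right],
\]

while the first block of the Hessian matrix is given by

\[
\frac{\partial^2 L_T}{\partial \theta_1 \partial \theta_1'} = -\frac{1}{2} \sum_{t=1}^{T} \left\{ \sum_{i=1}^{N} \left[ -\frac{1}{(\sigma_{i,t}^2)^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_1} \frac{\partial \sigma_{i,t}^2}{\partial \theta_1'} + \frac{1}{\sigma_{i,t}^2} \frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_1 \partial \theta_1'} \right] \right\} - \frac{N+v}{2} \sum_{t=1}^{T} \frac{C_1 A - B_1 B_1'}{A^2},
\]

where

\[
C_1 = \frac{1}{v-2} \sum_{i=1}^{N} \frac{1}{\sigma_{i,t}^2} \left[ 2 \frac{\partial x_{i,t}}{\partial \theta_1} \frac{\partial x_{i,t}}{\partial \theta_1'} + 2 x_{i,t} \frac{\partial^2 x_{i,t}}{\partial \theta_1 \partial \theta_1'} - \frac{4 x_{i,t}}{\sigma_{i,t}^2} \frac{\partial x_{i,t}}{\partial \theta_1} \frac{\partial \sigma_{i,t}^2}{\partial \theta_1'} \right]
+ \frac{1}{v-2} \sum_{i=1}^{N} \left[ \frac{1}{(\sigma_{i,t}^2)^2} \left( -x_{i,t}^2 \frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_1 \partial \theta_1'} + \frac{2 x_{i,t}^2}{\sigma_{i,t}^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_1} \frac{\partial \sigma_{i,t}^2}{\partial \theta_1'} \right) \right]
\]

and the derivatives of $x_{i,t}$, $i = 1, \ldots, N$, with respect to the mean parameters $\theta_1$ are given by the rows of the matrix $-W^{-1}$. That is, $\frac{\partial x_{1,t}}{\partial \theta_1}$ is given by the first row of $-W^{-1}$, $\frac{\partial x_{2,t}}{\partial \theta_1}$ is given by the second row of $-W^{-1}$, and so on.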
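The off-diagonal blocks, $\frac{\partial^2 L_T}{\partial \theta_1 \partial \theta_2'}$, $\frac{\partial^2 L_T}{\partial \theta_1 \partial \theta_3'}$, and $\frac{\partial^2 L_T}{\partial \theta_1 \partial \theta_4}$, of the Hessian matrix are given by

\[
\frac{\partial^2 L_T}{\partial \theta_1 \partial \theta_2'} = -\frac{1}{2} \sum_{t=1}^{T} \left\{ \sum_{i=1}^{N} \left[ -\frac{1}{(\sigma_{i,t}^2)^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_1} \frac{\partial \sigma_{i,t}^2}{\partial \theta_2'} + \frac{1}{\sigma_{i,t}^2} \frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_1 \partial \theta_2'} \right] \right\} - \frac{N+v}{2} \sum_{t=1}^{T} \frac{C_{12} A - B_1 B_{12}'}{A^2},
\]

where

\[
C_{12} = \frac{1}{v-2} \sum_{i=1}^{N} \left[ \frac{1}{(\sigma_{i,t}^2)^2} \left( \frac{2 x_{i,t}^2}{\sigma_{i,t}^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_1} \frac{\partial \sigma_{i,t}^2}{\partial \theta_2'} - x_{i,t}^2 \frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_1 \partial \theta_2'} - 2 x_{i,t} \frac{\partial x_{i,t}}{\partial \theta_1} \frac{\partial \sigma_{i,t}^2}{\partial \theta_2'} \right) \right]
\]

and

\[
B_{12} = -\frac{1}{v-2} \sum_{i=1}^{N} \left[ \frac{x_{i,t}^2}{(\sigma_{i,t}^2)^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_2} \right],
\]

\[
\frac{\partial^2 L_T}{\partial \theta_1 \partial \theta_3'} = -\frac{1}{2} \sum_{t=1}^{T} \left\{ \sum_{i=1}^{N} \left[ -\frac{1}{(\sigma_{i,t}^2)^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_1} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3'} + \frac{1}{\sigma_{i,t}^2} \frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_1 \partial \theta_3'} \right] \right\} - \frac{N+v}{2} \sum_{t=1}^{T} \frac{C_{13} A - B_1 B_{13}'}{A^2},
\]

where

\[
C_{13} = \frac{1}{v-2} \sum_{i=1}^{N} \frac{1}{\sigma_{i,t}^2} \left[ 2 \frac{\partial x_{i,t}}{\partial \theta_1} \frac{\partial x_{i,t}}{\partial \theta_3'} + 2 x_{i,t} \frac{\partial^2 x_{i,t}}{\partial \theta_1 \partial \theta_3'} - \frac{2 x_{i,t}}{\sigma_{i,t}^2} \frac{\partial x_{i,t}}{\partial \theta_1} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3'} \right]
+ \frac{1}{v-2} \sum_{i=1}^{N} \left[ \frac{1}{(\sigma_{i,t}^2)^2} \left( -2 x_{i,t} \frac{\partial \sigma_{i,t}^2}{\partial \theta_1} \frac{\partial x_{i,t}}{\partial \theta_3'} - x_{i,t}^2 \frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_1 \partial \theta_3'} + \frac{2 x_{i,t}^2}{\sigma_{i,t}^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_1} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3'} \right) \right]
\]

and

\[
B_{13} = \frac{1}{v-2} \sum_{i=1}^{N} \left[ \frac{2 x_{i,t}}{\sigma_{i,t}^2} \frac{\partial x_{i,t}}{\partial \theta_3} - \frac{x_{i,t}^2}{(\sigma_{i,t}^2)^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3} \right],
\]

\[
\frac{\partial^2 L_T}{\partial \theta_1 \partial \theta_4} = -\frac{1}{2} \sum_{t=1}^{T} \frac{B_1}{A} + \frac{N+v}{2(v-2)} \sum_{t=1}^{T} \frac{B_1}{A} - \frac{N+v}{2(v-2)^2} \sum_{t=1}^{T} \frac{B_1}{A^2} \sum_{i=1}^{N} \frac{x_{i,t}^2}{\sigma_{i,t}^2}.
\]

Differentiating with respect to the variance parameters $\theta_2 = (\alpha_1^*, \alpha_2^*, \ldots, \alpha_N^*, b^*, g^*)'$ yields

\[
\frac{\partial L_T}{\partial \theta_2} = -\frac{1}{2} \sum_{t=1}^{T} \sum_{i=1}^{N} \frac{1}{\sigma_{i,t}^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_2} - \frac{N+v}{2} \sum_{t=1}^{T} \frac{B_2}{A},
\]

where

\[
B_2 = -\frac{1}{v-2} \sum_{i=1}^{N} \left[ \frac{x_{i,t}^2}{(\sigma_{i,t}^2)^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_2} \right],
\]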
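and the second diagonal block of the Hessian matrix is given by

\[
\frac{\partial^2 L_T}{\partial \theta_2 \partial \theta_2'} = -\frac{1}{2} \sum_{t=1}^{T} \left\{ \sum_{i=1}^{N} \left[ -\frac{1}{(\sigma_{i,t}^2)^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_2'} + \frac{1}{\sigma_{i,t}^2} \frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_2 \partial \theta_2'} \right] \right\} - \frac{N+v}{2} \sum_{t=1}^{T} \frac{C_2 A - B_2 B_2'}{A^2},
\]

where

\[
C_2 = \frac{1}{v-2} \sum_{i=1}^{N} \left[ \frac{1}{(\sigma_{i,t}^2)^2} \left( \frac{2 x_{i,t}^2}{\sigma_{i,t}^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_2'} - x_{i,t}^2 \frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_2 \partial \theta_2'} \right) \right].
\]

The off-diagonal blocks, $\frac{\partial^2 L_T}{\partial \theta_2 \partial \theta_3'}$ and $\frac{\partial^2 L_T}{\partial \theta_2 \partial \theta_4}$, of the Hessian matrix are given by

\[
\frac{\partial^2 L_T}{\partial \theta_2 \partial \theta_3'} = -\frac{1}{2} \sum_{t=1}^{T} \left\{ \sum_{i=1}^{N} \left[ -\frac{1}{(\sigma_{i,t}^2)^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3'} + \frac{1}{\sigma_{i,t}^2} \frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_2 \partial \theta_3'} \right] \right\} - \frac{N+v}{2} \sum_{t=1}^{T} \frac{C_{23} A - B_2 B_{23}'}{A^2},
\]

where

\[
C_{23} = \frac{1}{v-2} \sum_{i=1}^{N} \left[ \frac{1}{(\sigma_{i,t}^2)^2} \left( \frac{2 x_{i,t}^2}{\sigma_{i,t}^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3'} - x_{i,t}^2 \frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_2 \partial \theta_3'} - 2 x_{i,t} \frac{\partial \sigma_{i,t}^2}{\partial \theta_2} \frac{\partial x_{i,t}}{\partial \theta_3'} \right) \right]
\]

and

\[
B_{23} = \frac{1}{v-2} \sum_{i=1}^{N} \left[ \frac{2 x_{i,t}}{\sigma_{i,t}^2} \frac{\partial x_{i,t}}{\partial \theta_3} - \frac{x_{i,t}^2}{(\sigma_{i,t}^2)^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3} \right],
\]

\[
\frac{\partial^2 L_T}{\partial \theta_2 \partial \theta_4} = -\frac{1}{2} \sum_{t=1}^{T} \frac{B_2}{A} + \frac{N+v}{2(v-2)} \sum_{t=1}^{T} \frac{B_2}{A} - \frac{N+v}{2(v-2)^2} \sum_{t=1}^{T} \frac{B_2}{A^2} \sum_{i=1}^{N} \frac{x_{i,t}^2}{\sigma_{i,t}^2}.
\]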
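Differentiating with respect to the parameters in matrix W, that is, with respect to $\theta_3 = (w_{21}, w_{31}, w_{32}, \ldots, w_{N1}, \ldots, w_{N,N-1})'$, yields

\[
\frac{\partial L_T}{\partial \theta_3} = -\frac{1}{2} \sum_{t=1}^{T} \sum_{i=1}^{N} \frac{1}{\sigma_{i,t}^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3} - \frac{N+v}{2} \sum_{t=1}^{T} \frac{B_3}{A},
\]

where

\[
B_3 = \frac{1}{v-2} \sum_{i=1}^{N} \left[ \frac{2 x_{i,t}}{\sigma_{i,t}^2} \frac{\partial x_{i,t}}{\partial \theta_3} - \frac{x_{i,t}^2}{(\sigma_{i,t}^2)^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3} \right],
\]

and the third diagonal block of the Hessian matrix is given by

\[
\frac{\partial^2 L_T}{\partial \theta_3 \partial \theta_3'} = -\frac{1}{2} \sum_{t=1}^{T} \left\{ \sum_{i=1}^{N} \left[ -\frac{1}{(\sigma_{i,t}^2)^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3'} + \frac{1}{\sigma_{i,t}^2} \frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_3 \partial \theta_3'} \right] \right\} - \frac{N+v}{2} \sum_{t=1}^{T} \frac{C_3 A - B_3 B_3'}{A^2},
\]

where

\[
C_3 = \frac{1}{v-2} \sum_{i=1}^{N} \frac{1}{\sigma_{i,t}^2} \left[ 2 \frac{\partial x_{i,t}}{\partial \theta_3} \frac{\partial x_{i,t}}{\partial \theta_3'} + 2 x_{i,t} \frac{\partial^2 x_{i,t}}{\partial \theta_3 \partial \theta_3'} - \frac{4 x_{i,t}}{\sigma_{i,t}^2} \frac{\partial x_{i,t}}{\partial \theta_3} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3'} \right]
+ \frac{1}{v-2} \sum_{i=1}^{N} \left[ \frac{1}{(\sigma_{i,t}^2)^2} \left( -x_{i,t}^2 \frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_3 \partial \theta_3'} + \frac{2 x_{i,t}^2}{\sigma_{i,t}^2} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3} \frac{\partial \sigma_{i,t}^2}{\partial \theta_3'} \right) \right]
\]

with

\[
\frac{\partial X_t}{\partial w_{ij}} = -W^{-1} \frac{\partial W}{\partial w_{ij}} W^{-1} \varepsilon_t
\]

and

\[
\frac{\partial^2 X_t}{\partial w_{ij} \partial w_{kl}} = W^{-1} \frac{\partial W}{\partial w_{kl}} W^{-1} \frac{\partial W}{\partial w_{ij}} W^{-1} \varepsilon_t + W^{-1} \frac{\partial W}{\partial w_{ij}} W^{-1} \frac{\partial W}{\partial w_{kl}} W^{-1} \varepsilon_t.
\]

The off-diagonal block, $\frac{\partial^2 L_T}{\partial \theta_3 \partial \theta_4}$, of the Hessian matrix is given by

\[
\frac{\partial^2 L_T}{\partial \theta_3 \partial \theta_4} = -\frac{1}{2} \sum_{t=1}^{T} \frac{B_3}{A} + \frac{N+v}{2(v-2)} \sum_{t=1}^{T} \frac{B_3}{A} - \frac{N+v}{2(v-2)^2} \sum_{t=1}^{T} \frac{B_3}{A^2} \sum_{i=1}^{N} \frac{x_{i,t}^2}{\sigma_{i,t}^2}.
\]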
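The first-derivative formula for the factors with respect to an element $w_{ij}$ of W can be checked numerically. The sketch below is an illustration rather than part of the original derivation; the helper names are hypothetical and indices are zero-based. It uses the fact that $\partial W / \partial w_{ij}$ has a single one in position $(i, j)$, so that $-W^{-1} \frac{\partial W}{\partial w_{ij}} W^{-1} \varepsilon_t = -x_{j,t} W^{-1} e_i$, where $e_i$ is the ith unit vector, and compares the result with a central finite difference.

```python
def solve_unit_lower(W, rhs):
    """Solve W x = rhs by forward substitution (W is unit lower triangular)."""
    n = len(rhs)
    x = [0.0] * n
    for i in range(n):
        x[i] = rhs[i] - sum(W[i][j] * x[j] for j in range(i))
    return x

def dX_dw(W, eps, i, j):
    """Analytic derivative -W^{-1} (dW/dw_ij) W^{-1} eps = -x_j * W^{-1} e_i."""
    x = solve_unit_lower(W, eps)
    e_i = [1.0 if r == i else 0.0 for r in range(len(eps))]
    z = solve_unit_lower(W, e_i)
    return [-x[j] * zr for zr in z]

def dX_dw_numeric(W, eps, i, j, h=1e-6):
    """Central finite-difference approximation of the same derivative."""
    Wp = [row[:] for row in W]
    Wm = [row[:] for row in W]
    Wp[i][j] += h
    Wm[i][j] -= h
    xp = solve_unit_lower(Wp, eps)
    xm = solve_unit_lower(Wm, eps)
    return [(a - b) / (2 * h) for a, b in zip(xp, xm)]

# Made-up example with N = 3; (i, j) = (2, 0) corresponds to w_31 in the text.
W = [[1.0, 0.0, 0.0],
     [0.5, 1.0, 0.0],
     [-0.3, 0.8, 1.0]]
eps = [0.02, -0.01, 0.03]
analytic = dX_dw(W, eps, 2, 0)
numeric = dX_dw_numeric(W, eps, 2, 0)
```

The same comparison can be repeated for every free element of W before the derivative is used inside the score.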
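Differentiating with respect to $\theta_4 = v$ yields

\[
\frac{\partial L_T}{\partial \theta_4} = \frac{T}{2} \psi\left(\frac{N+v}{2}\right) - \frac{T}{2} \psi\left(\frac{v}{2}\right) - \frac{TN}{2(v-2)} - \frac{1}{2} \sum_{t=1}^{T} \ln(A) + \frac{N+v}{2(v-2)^2} \sum_{t=1}^{T} G,
\]

where $\psi(\cdot)$ is the digamma function and

\[
G = \frac{\sum_{i=1}^{N} \frac{x_{i,t}^2}{\sigma_{i,t}^2}}{1 + \frac{1}{v-2} \sum_{i=1}^{N} \frac{x_{i,t}^2}{\sigma_{i,t}^2}},
\]

while the second derivative is

\[
\frac{\partial^2 L_T}{\partial \theta_4 \partial \theta_4} = \frac{T}{4} \psi'\left(\frac{N+v}{2}\right) - \frac{T}{4} \psi'\left(\frac{v}{2}\right) + \frac{TN}{2(v-2)^2} + \left[ \frac{1}{2(v-2)^2} - \frac{2N+v+2}{2(v-2)^3} \right] \sum_{t=1}^{T} G + \frac{N+v}{2(v-2)^4} \sum_{t=1}^{T} G^2,
\]

where $\psi'(\cdot)$ is the trigamma function.

The first and second derivatives of $\sigma_{i,t}^2$, for the first three parameter blocks, under the GARCH assumption, for both multivariate distributions, are given by

\[
\frac{\partial \sigma_{i,t}^2}{\partial \theta_1} = 2 e^{b^*} x_{i,t-1} \frac{\partial x_{i,t-1}}{\partial \theta_1} + e^{g^*} \frac{\partial \sigma_{i,t-1}^2}{\partial \theta_1}, \qquad i = 1, \ldots, N,
\]

\[
\frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_1 \partial \theta_1'} = 2 e^{b^*} \frac{\partial x_{i,t-1}}{\partial \theta_1} \frac{\partial x_{i,t-1}}{\partial \theta_1'} + e^{g^*} \frac{\partial^2 \sigma_{i,t-1}^2}{\partial \theta_1 \partial \theta_1'}, \qquad i = 1, \ldots, N,
\]

\[
\frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_1 \partial \theta_2'} = 2 x_{i,t-1} \frac{\partial x_{i,t-1}}{\partial \theta_1} \left( 0, \ldots, 0, e^{b^*}, 0 \right) + e^{g^*} \frac{\partial^2 \sigma_{i,t-1}^2}{\partial \theta_1 \partial \theta_2'} + \frac{\partial \sigma_{i,t-1}^2}{\partial \theta_1} \left( 0, \ldots, 0, 0, e^{g^*} \right), \qquad i = 1, \ldots, N,
\]

\[
\frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_1 \partial \theta_3'} = 2 e^{b^*} \frac{\partial x_{i,t-1}}{\partial \theta_1} \frac{\partial x_{i,t-1}}{\partial \theta_3'} + e^{g^*} \frac{\partial^2 \sigma_{i,t-1}^2}{\partial \theta_1 \partial \theta_3'} + 2 e^{b^*} x_{i,t-1} \frac{\partial^2 x_{i,t-1}}{\partial \theta_1 \partial \theta_3'}, \qquad i = 1, \ldots, N,
\]

\[
\frac{\partial \sigma_{i,t}^2}{\partial \theta_2} = c_{i,t} + e^{g^*} \frac{\partial \sigma_{i,t-1}^2}{\partial \theta_2}, \qquad i = 1, \ldots, N,
\]

where

\[
c_{i,t} = \left( 0, \ldots, \underbrace{e^{\alpha_i^*}}_{i\text{-th element}}, 0, \ldots, 0, e^{b^*} x_{i,t-1}^2, e^{g^*} \sigma_{i,t-1}^2 \right)'.
\]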
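The recursion for $\partial \sigma_{i,t}^2 / \partial \theta_2$ is easy to get wrong in code, so a finite-difference check is worthwhile. The sketch below treats a single factor and keeps only the three entries of $\theta_2$ that affect it, $(\alpha_i^*, b^*, g^*)$; the function names, starting values and data are hypothetical, and the starting variance is treated as a known constant so that its derivative is zero.

```python
import math

def variance_and_gradient(x, a_star, b_star, g_star, x0=0.0, s0=0.0):
    """Run sigma2_t = e^a + e^b x_{t-1}^2 + e^g sigma2_{t-1} over the sample.

    Returns the last variance and its gradient with respect to (a*, b*, g*),
    computed with the recursion d_t = c_t + e^{g*} d_{t-1}.
    """
    s_prev, x_prev = s0, x0
    d_prev = [0.0, 0.0, 0.0]  # derivative of the known starting variance
    for xt in x:
        c = [math.exp(a_star),
             math.exp(b_star) * x_prev ** 2,
             math.exp(g_star) * s_prev]
        d = [c[k] + math.exp(g_star) * d_prev[k] for k in range(3)]
        s = c[0] + c[1] + c[2]
        s_prev, x_prev, d_prev = s, xt, d
    return s_prev, d_prev

def numeric_gradient(x, params, h=1e-6):
    """Central finite differences of the last variance with respect to params."""
    grad = []
    for k in range(3):
        up = list(params)
        dn = list(params)
        up[k] += h
        dn[k] -= h
        grad.append((variance_and_gradient(x, *up)[0]
                     - variance_and_gradient(x, *dn)[0]) / (2 * h))
    return grad

x_series = [0.3, -0.5, 0.1, 0.8, -0.2]  # made-up factor values
params = (math.log(0.05), math.log(0.1), math.log(0.8))
sigma2_T, analytic = variance_and_gradient(x_series, *params)
numeric = numeric_gradient(x_series, params)
```

The analytic and numeric gradients should agree to many digits; a mismatch in the last entry usually means the $e^{g^*} \sigma_{i,t-1}^2$ term of $c_{i,t}$ was dropped.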
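In addition,

\[
\frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_2 \partial \theta_2'} = q_{i,t} + e^{g^*} \frac{\partial^2 \sigma_{i,t-1}^2}{\partial \theta_2 \partial \theta_2'} + p_t \frac{\partial \sigma_{i,t-1}^2}{\partial \theta_2'}, \qquad i = 1, \ldots, N,
\]

where

\[
q_{1,t} = \begin{bmatrix}
e^{\alpha_1^*} & 0 & \cdots & 0 & 0 & 0 \\
0 & 0 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & 0 & 0 & 0 \\
0 & 0 & \cdots & 0 & e^{b^*} x_{1,t-1}^2 & 0 \\
e^{g^*} \frac{\partial \sigma_{1,t-1}^2}{\partial \alpha_1^*} & 0 & \cdots & 0 & e^{g^*} \frac{\partial \sigma_{1,t-1}^2}{\partial b^*} & e^{g^*} \left( \frac{\partial \sigma_{1,t-1}^2}{\partial g^*} + \sigma_{1,t-1}^2 \right)
\end{bmatrix},
\]

\[
\ldots
\]

\[
q_{N,t} = \begin{bmatrix}
0 & 0 & \cdots & 0 & 0 & 0 \\
0 & 0 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & e^{\alpha_N^*} & 0 & 0 \\
0 & 0 & \cdots & 0 & e^{b^*} x_{N,t-1}^2 & 0 \\
0 & 0 & \cdots & e^{g^*} \frac{\partial \sigma_{N,t-1}^2}{\partial \alpha_N^*} & e^{g^*} \frac{\partial \sigma_{N,t-1}^2}{\partial b^*} & e^{g^*} \left( \frac{\partial \sigma_{N,t-1}^2}{\partial g^*} + \sigma_{N,t-1}^2 \right)
\end{bmatrix}
\]

and

\[
p_t = \left( 0, 0, \ldots, 0, e^{g^*} \right)',
\]

\[
\frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_2 \partial \theta_3'} = \left( 0, \ldots, 0, e^{b^*}, 0 \right)' 2 x_{i,t-1} \frac{\partial x_{i,t-1}}{\partial \theta_3'} + e^{g^*} \frac{\partial^2 \sigma_{i,t-1}^2}{\partial \theta_2 \partial \theta_3'} + \left( 0, \ldots, 0, 0, e^{g^*} \right)' \frac{\partial \sigma_{i,t-1}^2}{\partial \theta_3'}, \qquad i = 1, \ldots, N,
\]

\[
\frac{\partial \sigma_{i,t}^2}{\partial \theta_3} = 2 e^{b^*} x_{i,t-1} \frac{\partial x_{i,t-1}}{\partial \theta_3} + e^{g^*} \frac{\partial \sigma_{i,t-1}^2}{\partial \theta_3}, \qquad i = 1, \ldots, N,
\]

\[
\frac{\partial^2 \sigma_{i,t}^2}{\partial \theta_3 \partial \theta_3'} = 2 e^{b^*} \left[ \frac{\partial x_{i,t-1}}{\partial \theta_3} \frac{\partial x_{i,t-1}}{\partial \theta_3'} + x_{i,t-1} \frac{\partial^2 x_{i,t-1}}{\partial \theta_3 \partial \theta_3'} \right] + e^{g^*} \frac{\partial^2 \sigma_{i,t-1}^2}{\partial \theta_3 \partial \theta_3'}, \qquad i = 1, \ldots, N.
\]

The second derivatives of $x_{i,t}$ with respect to $\theta_1$ and $\theta_3$, $\frac{\partial^2 x_{i,t}}{\partial \theta_1 \partial \theta_3'}$ or $\frac{\partial^2 x_{i,t}}{\partial \theta_3 \partial \theta_1'}$, can be calculated iteratively for the elements $w_{21}, \ldots, w_{N,N-1}$ of $\theta_3$ and $\mu_1, \ldots, \mu_N$ of $\theta_1$ by simply calculating the derivative $\frac{\partial x_{i,t}}{\partial w_{ij}}$ and then multiplying by $-1$.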
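We also derive the conditional Information matrix for the Student-t full factor multivariate GARCH model. The parameter vector $\theta = (\theta_1', \theta_2', \theta_3', \theta_4)'$ can be denoted by $\theta = (\phi', v)'$, where $\phi = (\theta_1', \theta_2', \theta_3')'$ and $\theta_4 = v$. The log-likelihood function takes the form $L_T(y \mid \theta) = \sum_{t=1}^{T} l_t(y_t \mid \theta)$ and the score function $\frac{\partial l_t}{\partial \theta}$ can be written as

\[
\frac{\partial l_t}{\partial \phi} = -\frac{\partial \varepsilon_t'}{\partial \phi} H_t^{-1} \varepsilon_t \frac{N+v}{v - 2 + \varepsilon_t' H_t^{-1} \varepsilon_t} + \frac{1}{2} \frac{\partial \mathrm{vec}[H_t]'}{\partial \phi} \left( H_t^{-1} \otimes H_t^{-1} \right) \mathrm{vec}\left[ \frac{N+v}{v - 2 + \varepsilon_t' H_t^{-1} \varepsilon_t} \varepsilon_t \varepsilon_t' - H_t \right]
\]

and

\[
\frac{\partial l_t}{\partial v} = \frac{1}{2} \psi\left(\frac{N+v}{2}\right) - \frac{1}{2} \psi\left(\frac{v}{2}\right) - \frac{N}{2} \frac{1}{v-2} - \frac{1}{2} \ln\left( 1 + \frac{1}{v-2} \varepsilon_t' H_t^{-1} \varepsilon_t \right) + \frac{1}{2} \frac{N+v}{v-2} \frac{\varepsilon_t' H_t^{-1} \varepsilon_t}{v - 2 + \varepsilon_t' H_t^{-1} \varepsilon_t}.
\]

Then, following the same steps as in Fiorentini et al. (2003), the conditional Information matrix

\[
I_{\theta\theta,t} = \begin{bmatrix} I_{\phi\phi,t} & I_{\phi v,t} \\ I_{\phi v,t}' & I_{vv,t} \end{bmatrix}
\]

can be calculated using

\[
I_{\phi\phi,t} = \frac{v(N+v)}{(v-2)(N+v+2)} \frac{\partial \varepsilon_t'}{\partial \phi} H_t^{-1} \frac{\partial \varepsilon_t}{\partial \phi'} + \frac{N+v}{2(N+v+2)} \frac{\partial \mathrm{vec}[H_t]'}{\partial \phi} \left( H_t^{-1} \otimes H_t^{-1} \right) \frac{\partial \mathrm{vec}[H_t]}{\partial \phi'} - \frac{1}{2(N+v+2)} \frac{\partial \mathrm{vec}[H_t]'}{\partial \phi} \mathrm{vec}\left[H_t^{-1}\right] \mathrm{vec}\left[H_t^{-1}\right]' \frac{\partial \mathrm{vec}[H_t]}{\partial \phi'},
\]

\[
I_{\phi v,t} = \frac{N+2}{(v-2)(N+v)(N+v+2)} \frac{\partial \mathrm{vec}[H_t]'}{\partial \phi} \mathrm{vec}\left[H_t^{-1}\right],
\]

\[
I_{vv,t} = \frac{1}{4} \psi'\left(\frac{v}{2}\right) - \frac{1}{4} \psi'\left(\frac{N+v}{2}\right) - \frac{N \left[ v^2 + N(v-4) - 8 \right]}{2(v-2)^2 (N+v)(N+v+2)}.
\]

The derivatives of $H_t$ with respect to $\phi$ can be calculated as follows:

\[
\frac{\partial \mathrm{vec}[H_t]}{\partial \phi'} = [W \otimes I_N] [\Sigma_t \otimes I_N] \frac{\partial \mathrm{vec}[W]}{\partial \phi'} + [W \otimes I_N] [I_N \otimes W] \frac{\partial \mathrm{vec}[\Sigma_t]}{\partial \phi'} + [I_N \otimes (W \Sigma_t)] \frac{\partial \mathrm{vec}[W']}{\partial \phi'},
\]

where the elements of $\frac{\partial \mathrm{vec}[\Sigma_t]}{\partial \phi'}$ can be easily computed using $\frac{\partial \sigma_{i,t}^2}{\partial \phi}$, $i = 1, \ldots, N$, and the elements of $\frac{\partial \mathrm{vec}[W]}{\partial \phi'}$ consist of ones and zeros.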
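The per-observation score with respect to the degrees of freedom given above can be verified against a numerical derivative of the log-density. The following sketch is illustrative only: `digamma` is a simple standard-library approximation (recurrence plus asymptotic series) written for this example, `q` stands for the quadratic form $\varepsilon_t' H_t^{-1} \varepsilon_t$, and the numerical values are made up. Only the terms of $l_t$ that depend on $v$ are included.

```python
import math

def digamma(z):
    """Digamma function for z > 0: upward recurrence, then asymptotic series."""
    acc = 0.0
    while z < 6.0:
        acc -= 1.0 / z
        z += 1.0
    inv2 = 1.0 / (z * z)
    return (acc + math.log(z) - 0.5 / z
            - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252)))

def loglik_terms_v(q, N, v):
    """Terms of the Student-t log-density l_t that depend on v."""
    return (math.lgamma(0.5 * (N + v)) - math.lgamma(0.5 * v)
            - 0.5 * N * math.log(math.pi * (v - 2))
            - 0.5 * (N + v) * math.log1p(q / (v - 2)))

def score_v(q, N, v):
    """Analytic derivative of l_t with respect to v."""
    return (0.5 * digamma(0.5 * (N + v)) - 0.5 * digamma(0.5 * v)
            - 0.5 * N / (v - 2)
            - 0.5 * math.log1p(q / (v - 2))
            + 0.5 * (N + v) / (v - 2) * q / (v - 2 + q))

# Made-up inputs: N = 8 series, quadratic form q = 9.5, v = 6.4.
q, N, v, h = 9.5, 8, 6.4, 1e-5
analytic = score_v(q, N, v)
numeric = (loglik_terms_v(q, N, v + h) - loglik_terms_v(q, N, v - h)) / (2 * h)
```

Summing `score_v` over the sample gives the $\theta_4$ element of the score; in production code a library digamma implementation would normally replace the hand-written one.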
Table 1 Summary statistics for the rates of return of the analyzed stocks

          $y_t$                                      $|y_t|$    $y_t^2$    $\hat{H}_t^{-1/2} \varepsilon_t$
Stocks    Mean        SD        Kurtosis   LB(50)    LB(50)     LB(50)     JB
FA        0.000017    0.0339     6.960     58.9      348.9†     246.0‡      1221.9∗
BDK       0.000450    0.0228    10.106     62.4      906.4†     247.4‡     10962.4∗
IBM       0.000581    0.0178     8.526     50.9      237.3†      92.7‡      2546.2∗
BAX       0.000478    0.0174     7.810     60.8      169.6†     137.6‡      3963.8∗
DIS       0.000497    0.0166     6.344     61.1      403.3†     478.8‡      1196.7∗
ATT       0.000374    0.0152     8.448     60.7      636.3†     301.9‡      2563.7∗
K         0.000299    0.0145     6.310     51.9      369.4†     321.2‡       571.9∗
EXC       0.000251    0.0132    11.059     55.7      140.5†      84.7‡      8746.7∗

The summary statistics include mean, standard deviation (SD), kurtosis, and the Ljung-Box test statistic (LB) based on 50 lags for the autocorrelation of the rates of return, the absolute and the squared returns. † and ‡ indicate that the null of no autocorrelation is rejected at 5% level of significance (the corresponding critical value is 67.5) for the absolute and the squared returns, respectively. The Jarque-Bera test statistic (JB) for normality of the GARCH standardized residual series is also presented in the last column. ∗ indicates that the null hypothesis of normality is rejected at 5% level of significance (the corresponding critical value is 5.99)
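For reference, the Ljung-Box statistic reported in Table 1 can be computed as $Q = T(T+2) \sum_{k=1}^{m} \hat{\rho}_k^2 / (T-k)$, where $\hat{\rho}_k$ is the lag-k sample autocorrelation and $m$ is the number of lags (50 in the table). The short sketch below is a generic illustration of this standard formula, not the code used to produce the table; the function name is hypothetical and the example series is artificial.

```python
def ljung_box(x, m):
    """Ljung-Box Q statistic based on the first m sample autocorrelations."""
    T = len(x)
    mean = sum(x) / T
    dev = [xi - mean for xi in x]
    gamma0 = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, m + 1):
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, T)) / gamma0
        q += rho_k ** 2 / (T - k)
    return T * (T + 2) * q

# Artificial, strongly autocorrelated series: +1, -1, +1, -1, ...
alternating = [1.0, -1.0] * 5
q_stat = ljung_box(alternating, 1)
```

Applying `ljung_box` to the returns, their absolute values and their squares, each with `m = 50`, and comparing with the critical value 67.5 reproduces the kind of test summarized in the table.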
Having calculated analytically the score, the Hessian and the Information matrix of the model parameters, the maximum likelihood estimates and the corresponding standard errors can be computed by applying the mixed algorithm described.
4 Application to Eight Stocks from the US Market

In this section, we present an empirical application of the proposed full-factor Student-t multivariate GARCH model to financial series. We expect that the use of the Student-t error distribution will be more appropriate for capturing the stylized facts of financial series and will provide better fit to the analyzed returns than other distributional alternatives that are usually used in the literature, i.e., the normal distribution. We illustrate the full-factor multivariate GARCH models using eight stocks from the US market, namely: the Fairchild Corporation (FA), the Black & Decker Corp. (BDK), the International Business Machines Corp. (IBM), the Baxter International Inc. (BAX), the Walt Disney Company (DIS), the AT&T Inc. (ATT), the Kellogg Company (K), and the Exelon Corp. (EXC). The data consist of 2350 daily observations over the 1/1/1990–1/1/1999 period. Let $S_{i,t}$ be the value of the ith stock at time t. We model the rates of return $y_{i,t} = \ln\left( S_{i,t} / S_{i,t-1} \right)$, $i = 1, \ldots, N = 8$, $t = 1, \ldots, T = 2349$.

Fig. 1 The analyzed rates of return for the eight stocks of US market (one panel per stock: FA, BDK, IBM, BAX, DIS, ATT, K, EXC; vertical axis: rates of return)

A brief discussion on the characteristics of the analyzed stock return series follows. Table 1 presents summary statistics for the rates of return of the eight stocks together with the Ljung-Box statistic computed for the rates of return $y_{i,t}$, the absolute rates and the squares of the returns. There are stock return series with relatively high volatilities, namely the FA and the BDK, while the remaining stocks have relatively low standard deviations. There is high kurtosis in stock returns, varying from 6.31 for K to 11.059 for EXC, indicating fat tails in the return distribution. It is possible that a Student-t distribution, which has heavier tails than the Normal, will be more suitable for modelling the return distribution. We also examine if there is evidence for heteroscedasticity in stock returns. The Ljung-Box statistic is computed based on 50 lags on the series as well as the squared and the absolute returns. These statistics show that there is no autocorrelation in the return series, but there is a high level of autocorrelation in the squared values of returns and mainly in the absolute values; for the squared and the absolute returns, the null hypothesis of no autocorrelation is rejected at 5% level of significance for all series. In Fig. 1, we present the rates of return of the analyzed stocks; it is obvious that the volatility of the series changes over time. This is also confirmed by the autocorrelation plot of the squared and the absolute returns, which are depicted in Fig. 2. Furthermore, in Fig. 3 we present the autocorrelation plots of the cross-products of the series which appear to be meaningful in many cases, indicating that the covariances are also time-varying. The above analysis reveals the presence of features, i.e., fat tails and the volatility clustering phenomenon, that multivariate GARCH models are designed to deal with.
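The rates of return and the kurtosis figures discussed above follow from two elementary computations, sketched here under the assumption that kurtosis is measured as the (non-excess) ratio of the fourth central moment to the squared variance, for which the Normal distribution gives 3. The function names and the price series are hypothetical.

```python
import math

def log_returns(prices):
    """Rates of return y_t = ln(S_t / S_{t-1}) from a price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def kurtosis(x):
    """Sample kurtosis m4 / m2^2 (equal to 3 for a Normal distribution)."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((xi - m) ** 2 for xi in x) / n
    m4 = sum((xi - m) ** 4 for xi in x) / n
    return m4 / m2 ** 2

prices = [100.0, 110.0, 121.0, 108.9]  # made-up prices
returns = log_returns(prices)
```

A series of T + 1 prices yields T returns, which matches the 2350 observations and T = 2349 returns quoted in the text.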
Fig. 2 Autocorrelation of the squared and the absolute rates of return (one panel per stock for the squared returns, FA^2 through EXC^2, and one for the absolute returns, abs(FA) through abs(EXC); horizontal axis: Lag, 0 to 50; vertical axis: ACF)
We apply the proposed full-factor multivariate GARCH models based on Normal and Student-t errors presented in the previous sections. The main goal of our empirical analysis is to investigate if the Student-t error distribution is more appropriate than the Normal distribution, i.e., if it better captures the stylized facts of stock returns and provides better fit to the data. To estimate the parameters of the multivariate full-factor GARCH model under conditional Normality we apply the Fisher scoring algorithm of Vrontos et al. (2003a), while to estimate the multivariate model under the Student-t distribution we apply the mixed-gradient approach based on the BFGS algorithm and a Newton–Raphson type algorithm. Tables 2 and 3 present the estimates of the model parameters and their standard errors (based on the information matrix) for the multivariate Normal and Student-t distribution, respectively. Looking at the parameter estimates of Tables 2 and 3 we can draw the following conclusions. The transformed parameters α1∗ , α2∗ , …, α ∗N , b∗ , g ∗ in the variance equations are significant for both the multivariate Normal and Student-t errors, indicating the time-varying volatility phenomenon. However, the parameter estimates under conditional Normality are higher
[Figure 3: sample autocorrelation functions (vertical axis: ACF; horizontal axis: Lag, 0 to 50) of the cross products of the rates of return, one panel for each of the 28 pairs of the eight series FA, BDK, IBM, BAX, DIS, ATT, K, EXC (FA*BDK through K*EXC).]
Fig. 3 Autocorrelation of the cross product of the rates of return
than the respective parameters under the assumption of Student-t distributed errors; this may be attributed to the fact that, under conditional Normality, an assumption that does not account for fat tails, the model tries to capture the fat tails of the series through the GARCH specification. On the other hand, under the Student-t multivariate GARCH model, the degrees of freedom parameter, $v$, is estimated to be 6.4, indicating heavy tails. Note also that almost all of the parameters in matrix $W$, which affect the conditional covariances/correlations, are significant for both Normal and Student-t errors, indicating that the covariances are time-varying. Having estimated the model parameters, one can examine the appropriateness of the assumption of conditional normality. To this end, we apply the Jarque–Bera test of normality to the GARCH standardized residuals ($\hat{H}_t^{-1/2}\hat{\varepsilon}_t$), and present in Table 1 (last column) the Jarque–Bera statistic. This statistic is very high for all the standardized residual series, which shows that the null hypothesis of normality is rejected
Table 2 Estimates and standard errors of the parameters of the full-factor multivariate GARCH model under the assumption of normal errors

Parameter  Estimate  SE        Parameter  Estimate  SE       Parameter  Estimate  SE
µ1           0.0001  0.0003    b∗            −2.45  0.060    w65            0.18  0.018
µ2           0.0007  0.0002    g∗            −0.20  0.013    w71            0.03  0.008
µ3           0.0008  0.0003    w21            0.04  0.013    w72            0.09  0.012
µ4           0.0006  0.0003    w31            0.05  0.010    w73            0.11  0.016
µ5           0.0006  0.0003    w32            0.12  0.015    w74            0.16  0.016
µ6           0.0003  0.0003    w41            0.05  0.010    w75            0.15  0.018
µ7           0.0004  0.0003    w42            0.12  0.015    w76            0.11  0.019
µ8           0.0003  0.0003    w43            0.15  0.020    w81            0.01  0.008
α1∗           −9.17  0.088     w51            0.04  0.010    w82            0.06  0.011
α2∗           −9.98  0.085     w52            0.15  0.014    w83            0.08  0.015
α3∗          −10.45  0.087     w53            0.17  0.018    w84            0.10  0.015
α4∗          −10.42  0.087     w54            0.15  0.018    w85            0.07  0.016
α5∗          −10.63  0.088     w61            0.04  0.008    w86            0.13  0.018
α6∗          −10.87  0.087     w62            0.11  0.013    w87            0.11  0.019
α7∗          −10.94  0.088     w63            0.12  0.016
α8∗          −11.04  0.086     w64            0.15  0.016
Table 3 Estimates and standard errors of the parameters of the full-factor multivariate GARCH model under the assumption of Student-t errors

Parameter  Estimate  SE         Parameter  Estimate  SE       Parameter  Estimate  SE
µ1          −0.0002  0.00003    b∗            −2.97  0.072    w65            0.17  0.018
µ2           0.0004  0.00001    g∗            −0.09  0.007    w71            0.03  0.009
µ3           0.0005  0.00001    w21            0.04  0.013    w72            0.10  0.014
µ4           0.0005  0.00001    w31            0.04  0.011    w73            0.12  0.017
µ5           0.0004  0.00001    w32            0.11  0.016    w74            0.17  0.017
µ6           0.0002  0.00001    w41            0.04  0.011    w75            0.14  0.019
µ7           0.0002  0.00001    w42            0.13  0.017    w76            0.12  0.021
µ8           0.0003  0.00001    w43            0.13  0.021    w81            0.01  0.008
α1∗          −10.07  0.114      w51            0.03  0.011    w82            0.06  0.013
α2∗          −11.06  0.119      w52            0.16  0.016    w83            0.08  0.016
α3∗          −11.38  0.116      w53            0.18  0.020    w84            0.11  0.015
α4∗          −11.29  0.114      w54            0.15  0.019    w85            0.08  0.017
α5∗          −11.46  0.112      w61            0.03  0.009    w86            0.13  0.019
α6∗          −11.80  0.115      w62            0.12  0.014    w87            0.12  0.019
α7∗          −11.77  0.112      w63            0.13  0.017    v              6.40  0.221
α8∗          −11.96  0.113      w64            0.15  0.017
at the 5% level of significance. Thus, the residual series exhibit deviations from normality. We also apply the Lagrange multiplier test for multivariate normality versus the multivariate Student-t distribution proposed by Fiorentini et al. (2003). Under the null hypothesis of normality the test statistic, $\tau_{TI}$, is calculated by

$$\tau_{TI} = \frac{T^{-1/2}\sum_{t} S_t}{\sqrt{N(N+2)/2}},$$

where

$$S_t = \frac{N(N+2)}{4} - \frac{N+2}{2}\,\hat{\varepsilon}_t'\hat{H}_t^{-1}\hat{\varepsilon}_t + \frac{1}{4}\left(\hat{\varepsilon}_t'\hat{H}_t^{-1}\hat{\varepsilon}_t\right)^2.$$

The value of this statistic equals 105.94, which indicates that the hypothesis of normality is rejected using either a one-sided or a two-sided critical value. Therefore, it is worth estimating a multivariate GARCH model based on the Student-t error distribution rather than the Normal.

Furthermore, we examine the adequacy of the multivariate GARCH models by using diagnostics on the innovations. To determine the appropriateness of the multivariate Normal or the Student-t distribution we use a simple visual diagnostic as in Kawakatsu (2006). The idea is to plot the sample quantiles of the standardized residuals against the theoretical quantiles (qq-plots) under each distributional assumption, and the sample cumulative distribution function (cdf) of the standardized residuals against the corresponding cdf of the theoretical distribution (pp-plots). If the points are scattered closely around the 45° line, this indicates a correct distributional assumption. More specifically, under the multivariate Normality assumption for the error process, i.e., $\varepsilon_t \sim N(0, H_t)$, the quadratic form $\varepsilon_t' H_t^{-1} \varepsilon_t$ has a $\chi^2(N)$ distribution, and in this case we check whether $\hat{\varepsilon}_t'\hat{H}_t^{-1}\hat{\varepsilon}_t$ follows a chi-square distribution. Under the multivariate Student-t error distribution, i.e., $\varepsilon_t \sim t(0, H_t, v)$, the scaled quadratic form $(v/(N(v-2)))\,\varepsilon_t' H_t^{-1} \varepsilon_t$ has an $F(N, v)$ distribution, and we plot the sample quantiles and cdf of $(\hat{v}/(N(\hat{v}-2)))\,\hat{\varepsilon}_t'\hat{H}_t^{-1}\hat{\varepsilon}_t$ against those of the $F(N, \hat{v})$ distribution.

In Fig. 4, we illustrate these plots; the left two graphs present the distribution of $\hat{\varepsilon}_t'\hat{H}_t^{-1}\hat{\varepsilon}_t$ taken from the full-factor multivariate GARCH Normal model, while the right two graphs present the distribution of $(\hat{v}/(N(\hat{v}-2)))\,\hat{\varepsilon}_t'\hat{H}_t^{-1}\hat{\varepsilon}_t$ taken from the full-factor multivariate GARCH Student-t model. The top two graphs illustrate the qq-plots, i.e., the sample quantiles against the chi-square quantiles (Normal case) and the sample quantiles against the F quantiles (Student-t case). The bottom two graphs are the pp-plots, i.e., plots of the empirical cdf of $\hat{\varepsilon}_t'\hat{H}_t^{-1}\hat{\varepsilon}_t$ against the $\chi^2(N)$ cdf (Normal case) and of the empirical cdf of $(\hat{v}/(N(\hat{v}-2)))\,\hat{\varepsilon}_t'\hat{H}_t^{-1}\hat{\varepsilon}_t$ against the $F(N, \hat{v})$ cdf (Student-t case). Figure 4 indicates that the Student-t full-factor multivariate GARCH model provides a much better fit than the multivariate Normal model. Combining the above results, we conclude that the multivariate Student-t model explains the stylized facts observed in the series, i.e., heavy tails and time-varying variances/covariances, and therefore provides a more appropriate approach to modelling the underlying dynamics of financial series.
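The distributional diagnostics of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the residuals and the fitted conditional covariance matrices are taken as given NumPy arrays, and the function names (`quadratic_forms`, `lm_normality_stat`, `qq_pp_points`), the array layout and the plotting positions $(i - 0.5)/T$ are hypothetical choices, not part of the authors' implementation.

```python
import numpy as np
from scipy import stats


def quadratic_forms(resid, H):
    """Return q_t = e_t' H_t^{-1} e_t.

    resid: array (T, N) of residuals; H: array (T, N, N) of covariances.
    """
    # Solve H_t x_t = e_t for every t rather than forming explicit inverses.
    x = np.linalg.solve(H, resid[..., None])[..., 0]
    return np.einsum("tn,tn->t", resid, x)


def lm_normality_stat(q, N):
    """LM statistic for Normal versus Student-t errors, built from q_t."""
    # S_t = N(N+2)/4 - (N+2)/2 * q_t + q_t^2 / 4
    s = N * (N + 2) / 4.0 - (N + 2) / 2.0 * q + q**2 / 4.0
    # T^{-1/2} * sum_t S_t equals sqrt(T) * mean(S_t).
    return np.sqrt(len(q)) * s.mean() / np.sqrt(N * (N + 2) / 2.0)


def qq_pp_points(q, N, v=None):
    """Points for the qq-plot and pp-plot of the quadratic forms.

    v=None uses the chi-square(N) reference (Normal case); otherwise q is
    rescaled by v / (N (v - 2)) and compared with F(N, v) (Student-t case).
    Returns (sample quantiles, theoretical quantiles,
             sample cdf values, theoretical cdf values).
    """
    if v is None:
        z, ref = np.sort(q), stats.chi2(df=N)
    else:
        z, ref = np.sort(q * v / (N * (v - 2.0))), stats.f(dfn=N, dfd=v)
    T = len(z)
    probs = (np.arange(1, T + 1) - 0.5) / T  # assumed plotting positions
    return z, ref.ppf(probs), probs, ref.cdf(z)
```

Plotting the first pair of returned arrays against each other gives a qq-plot, and the second pair a pp-plot; in both cases points close to the 45° line support the assumed error distribution.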
[Figure 4: four panels. Top row, qq-plots: X-sq Quantiles against Sample Quantiles (left) and F Quantiles against Sample Quantiles (right). Bottom row, pp-plots: X-sq Cdf against Sample Cdf (left) and F Cdf against Sample Cdf (right).]
Fig. 4 Distribution of $\hat{\varepsilon}_t'\hat{H}_t^{-1}\hat{\varepsilon}_t$ taken from the multivariate GARCH Normal model (left two plots) and of $(\hat{v}/(N(\hat{v}-2)))\,\hat{\varepsilon}_t'\hat{H}_t^{-1}\hat{\varepsilon}_t$ taken from the multivariate GARCH Student-t model (right two plots). The top two plots are the qq-plots, while the bottom two plots are the pp-plots

5 Discussion
In this study, we consider the problem of estimation and inference in multivariate GARCH models. In particular, we extend the full-factor multivariate GARCH model of Vrontos et al. (2003a) and propose the use of a multivariate Student-t distribution for the error process. The estimation of the parameters of the Student-t multivariate GARCH model is done by using classical techniques. Maximum likelihood estimation is implemented by using a mixed-gradient algorithm. We provide analytical expressions for the derivatives of the log-likelihood function, which improves the numerical accuracy of the estimates while speeding up estimation. We apply the proposed Student-t multivariate GARCH model to financial series from the US stock market. The results of our analysis show that the proposed model captures the stylized facts of the return series and provides a more appropriate modelling approach.

Acknowledgments We would like to thank the editor and an anonymous referee for numerous constructive comments that improved the quality of the paper.
References
Alexander, C. (2001). Orthogonal GARCH. In C. Alexander (Ed.), Mastering risk (Vol. 2, pp. 21–38). Financial Times–Prentice Hall.
Bauwens, L., Laurent, S., & Rombouts, J. V. K. (2006). Multivariate GARCH models: A survey. Journal of Applied Econometrics, 21, 79–109.
Berndt, E. K., Hall, B. H., Hall, R. E., & Hausman, J. A. (1974). Estimation and inference in nonlinear structural models. Annals of Economic and Social Measurement, 3/4, 653–665.
Bollerslev, T. (1987). A conditionally heteroskedastic time series model for speculative prices and rates of return. Review of Economics and Statistics, 69, 542–547.
Bollerslev, T. (1990). Modelling the coherence in short-run nominal exchange rates: A multivariate generalized ARCH model. The Review of Economics and Statistics, 72, 498–505.
Bollerslev, T., & Wooldridge, J. M. (1992). Quasi maximum likelihood estimation and inference in dynamic models with time varying covariances. Econometric Reviews, 11, 143–172.
Bollerslev, T., Chou, R. Y., & Kroner, K. F. (1992). ARCH modeling in finance: a review of the theory and empirical evidence. Journal of Econometrics, 52, 5–59.
Brooks, C., Burke, S. P., & Persand, G. (2001). Benchmarks and the accuracy of GARCH model estimation. International Journal of Forecasting, 17, 45–56.
Compte, F., & Lieberman, O. (2003). Asymptotic theory for multivariate GARCH processes. Journal of Multivariate Analysis, 84(1), 61–84.
Engle, R. F. (1982). Autoregressive conditional heteroskedasticity with estimates of the variance of UK inflation. Econometrica, 50, 987–1008.
Engle, R. F. (2002). Dynamic conditional correlation: a simple class of multivariate GARCH models. Journal of Business and Economic Statistics, 20, 339–350.
Fiorentini, G., Calzolari, G., & Panattoni, L. (1996). Analytic derivatives and the computation of GARCH estimates. Journal of Applied Econometrics, 11, 399–417.
Fiorentini, G., Sentana, E., & Calzolari, G. (2003). Maximum likelihood estimation and inference in multivariate conditionally heteroscedastic dynamic regression models with Student-t Innovations. Journal of Business and Economic Statistics, 21, 532–546.
Goldfarb, D. (1970). A family of variable-metric methods derived by variational means. Mathematics of Computation, 24, 23–26.
Hafner, C. M., & Herwartz, H. (2003). Analytic quasi maximum likelihood inference in multivariate volatility models. Econometric Institute Report EI 2003-21. Rotterdam: Erasmus University.
Jeantheau, T. (1998). Strong consistency of estimators for multivariate ARCH models. Econometric Theory, 14, 70–86.
Kawakatsu, H. (2006). Matrix exponential GARCH. Journal of Econometrics, 134, 95–128.
Lanne, M., & Saikkonen, P. (2007). A multivariate generalized orthogonal factor GARCH model. Journal of Business and Economic Statistics, 25, 61–75.
Laurent, S. (2004). Analytic derivatives of the APARCH model. Computational Economics, 24, 51–57.
Ling, S., & McAleer, M. (2003). Asymptotic theory for a vector ARMA-GARCH model. Econometric Theory, 19, 280–310.
Lucchetti, R. (2002). Analytical score for multivariate GARCH models. Computational Economics, 19, 133–143.
Mak, T. K. (1993). Solving non-linear estimation equations. Journal of Royal Statistical Society B, 55, 945–955.
Mak, T. K., Wong, H., & Li, W. K. (1997). Estimation of nonlinear time series with conditional heteroscedastic variances by iteratively weighted least squares. Computational Statistics and Data Analysis, 24, 169–178.
McCullough, B. D., & Renfro, C. G. (2000). Some numerical aspects of nonlinear estimation. Journal of Economic and Social Measurement, 26, 63–77.
McCullough, B. D., & Vinod, H. D. (1999). The numerical reliability of econometric software. Journal of Economic Literature, 37, 633–665.
Nelson, D. (1991). Conditional heteroskedasticity in asset returns: A new approach. Econometrica, 59, 347–370.
Shanno, D. F. (1970). Conditioning on quasi-Newton methods for function minimization. Mathematics of Computation, 24, 647–656.
Tse, Y. K., & Tsui, A. K. C. (2002). A multivariate generalized autoregressive conditional heteroscedasticity model with time-varying correlations. Journal of Business and Economic Statistics, 20, 352–362.
van der Weide, R. (2002). GO-GARCH: A multivariate generalized orthogonal GARCH model. Journal of Applied Econometrics, 17, 549–564.
Vrontos, I. D., Dellaportas, P., & Politis, D. N. (2000). Full Bayesian inference for GARCH and EGARCH models. Journal of Business and Economic Statistics, 18, 187–198.
Vrontos, I. D., Dellaportas, P., & Politis, D. N. (2003a). A full-factor multivariate GARCH model. Econometrics Journal, 6, 312–334.
Vrontos, I. D., Dellaportas, P., & Politis, D. N. (2003b). Inference for some multivariate ARCH and GARCH models. Journal of Forecasting, 22, 427–446.