Advances in Computational Mathematics 6 (1996) 159-189
A subspace preconditioning algorithm for eigenvector/eigenvalue computation*

James H. Bramble and Joseph E. Pasciak
Department of Mathematics, Texas A&M University, College Station, TX 77843, USA
E-mail: [email protected]; [email protected]

Andrew V. Knyazev
Department of Mathematics, University of Colorado at Denver, P.O. Box 173364, Campus Box 170, Denver, CO 80217-3364, USA
E-mail: [email protected]

Received October 1995; revised September 1996
Communicated by D.N. Arnold
We consider the problem of computing a modest number of the smallest eigenvalues along with orthogonal bases for the corresponding eigenspaces of a symmetric positive definite operator $A$ defined on a finite dimensional real Hilbert space $V$. In our applications, the dimension of $V$ is large and the cost of inverting $A$ is prohibitive. In this paper, we shall develop an effective parallelizable technique for computing these eigenvalues and eigenvectors utilizing subspace iteration and preconditioning for $A$. Estimates will be provided which show that the preconditioned method converges linearly when used with a uniform preconditioner under the assumption that the approximating subspace is close enough to the span of the desired eigenvectors.

AMS subject classification: Primary 65N30; Secondary 65F10.
1. Introduction
In this paper, we shall be concerned with computing a modest number of the smallest eigenvalues and their corresponding eigenvectors of a large symmetric ill-conditioned
* This manuscript has been authored under contract number DE-AC02-76CH00016 with the U.S. Department of Energy. Accordingly, the U.S. Government retains a non-exclusive, royalty-free license to publish or reproduce the published form of this contribution, or allow others to do so, for U.S. Government purposes. This work was also supported in part under the National Science Foundation Grants No. DMS-9007185, DMS-9501507 and NSF-CCR-9204255, and by the U.S. Army Research Office through the Mathematical Sciences Institute, Cornell University.
© J.C. Baltzer AG, Science Publishers
system. More explicitly, let $A$ be a symmetric and positive definite linear operator on an $N$-dimensional real vector space $V$ with inner product $(\cdot,\cdot)$ and norm $\|\cdot\| = (\cdot,\cdot)^{1/2}$. The distinct eigenvalues $\{\lambda_i\}$ of $A$ are positive real numbers which, along with their respective eigenvectors $\{v_i\}$, satisfy
$$Av_i = \lambda_i v_i.$$
The eigenvalues $\{\lambda_i\}$ are ordered to be increasing with $i$ and are assumed to be simple. The eigenvectors $\{v_i\}$ are orthogonal with respect to the inner product $(\cdot,\cdot)$ and chosen so that $(Av_i, v_i) = 1$. The algorithms and results of this paper carry over to the case of Hermitian operators on complex inner product spaces without change. We consider the case where the first $s$ eigenvalues are simple and well separated. It seems possible to extend the analysis to the case of eigenvalues of higher multiplicity. The extension of the results to the case of eigenvalues with little separation is somewhat more tedious. If the operator $A$ is a mesh analog of a PDE with multiple eigenvalues, then $A$ has clusters of eigenvalues; this is one of the most interesting practical examples of badly separated eigenvalues. However, in this case the operator $A$ can be viewed as a perturbation of an operator with well separated multiple eigenvalues, see [23]. The analysis of such a perturbation appears possible for the method described below, but will not be addressed in this paper. We shall be interested in iterative schemes for computing $\{\lambda_i\}$ and $\{v_i\}$, for $i = 1, \dots, s$, where $s$ is small compared to $N$. We will only assume that a procedure for evaluating the action of $A$ applied to vectors in $V$ is available. Given a basis for $V$, the corresponding matrix representing $A$ is often large and sparse, and a full computer representation, including the zero elements, is not feasible since its size would be too large to manage. We will also avoid the computation of the action of $A^{-1}$. There are many methods for obtaining eigenvalues and eigenvectors of symmetric positive definite matrices (see [43]). Methods like the QR-algorithm work extremely well for relatively small matrices. Classical iterative methods involve subspaces of vectors which result from applying $A - \lambda I$ or its inverse with a non-negative parameter $\lambda$ which may change from iteration to iteration.
Because $A$ is ill-conditioned and we seek the eigenspaces corresponding to the smaller eigenvalues, the classical methods, to be effective, require inversions of $A - \lambda I$. In this paper, we shall study an iterative eigenvalue scheme which utilizes subspace iteration and preconditioning. A preconditioner $B$ is a symmetric positive definite operator on $V$ which "approximates" the inverse of $A$. For our purposes, we shall assume that $B$ is scaled so that the operator $I - BA$ is a reducer. This means that there is a number $\gamma$ in $(0,1)$ satisfying
$$|(A(I - BA)v, v)| \le \gamma\,(Av, v) \quad\text{for all } v \in V. \tag{1.1}$$
Note that (1.1) implies that
$$(1 - \gamma)(Av, v) \le (BAv, Av) \le (1 + \gamma)(Av, v) \quad\text{for all } v \in V. \tag{1.2}$$
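For intuition, the scaling condition (1.1) is easy to check numerically on small model problems. The sketch below is ours and only illustrative (the function names are not from the paper): it computes the smallest admissible $\gamma$ as the spectral radius of $I - BA$, which is similar to the symmetric matrix $I - A^{1/2}BA^{1/2}$, and rescales a given preconditioner so that $I - BA$ becomes a reducer.

```python
import numpy as np

def reducer_gamma(A, B):
    """Smallest gamma satisfying (1.1): since A^{1/2}(I - BA)A^{-1/2} is
    symmetric, gamma is the spectral radius of I - BA."""
    return max(abs(np.linalg.eigvals(np.eye(len(A)) - B @ A)))

def scale_preconditioner(A, B):
    """Rescale B -> t*B so that I - tBA is a reducer: t = 2/(mu_min+mu_max)
    centers the (positive, real) spectrum mu of BA around one."""
    mu = np.linalg.eigvals(B @ A).real
    return (2.0 / (mu.min() + mu.max())) * B

# toy ill-conditioned A with a Jacobi (diagonal) preconditioner
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))
A = Q @ np.diag(np.linspace(1.0, 1e4, 50)) @ Q.T
B = scale_preconditioner(A, np.diag(1.0 / np.diag(A)))
gamma = reducer_gamma(A, B)
assert 0.0 < gamma < 1.0            # (1.1) holds after scaling
```

Of course, a crude preconditioner such as this one yields a $\gamma$ close to one; the multigrid and domain decomposition preconditioners discussed below give $\gamma$ bounded away from one independently of the problem size.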
There is a vast literature of techniques for developing preconditioners for symmetric positive definite problems, especially in the case when the operator $A$ is a discretization of an elliptic partial differential equation (see, e.g., [1, 2, 15, 16, 26, 27]). The best preconditioners satisfy (1.1) with $\gamma$ bounded away from one (independently of $N$). In addition, a good preconditioner is economical to evaluate. This means that the cost of computing the action of $B$ applied to an arbitrary vector should not be much greater than that of applying $A$. When $A$ corresponds to a discretization of a partial differential equation, low cost preconditioners are often known for which (1.1) holds with $\gamma$ independent of the mesh size and hence of the number of unknowns (see, e.g., [3-6, 12, 20, 52]). Multigrid and domain decomposition are two examples of effective techniques for developing preconditioners for the discrete systems arising from approximations to elliptic boundary value problems (see, e.g., [4-13]). The use of preconditioned iterations for computing the eigenvectors and eigenvalues was first studied by Samokish [49], and then by Petryshyn [44]. Ruhe [45, 46] clarified the asymptotic connection between such methods and similar preconditioned iterative methods for finding a nontrivial solution of the corresponding Helmholtz system. Explicit convergence rate estimates, independent of the number of unknowns, for preconditioned iterative methods in the case of one eigenvector were first obtained in the Russian literature, by Godunov et al. [28] and by D'yakonov et al. [24, 25] (see also [21] and the included references). In particular, the base iteration used in our preconditioned subspace iteration was analyzed in [24]. There are a number of alternative preconditioned schemes which have been proposed to further improve convergence of the base method.
The possibility of using Chebyshev parameters to accelerate the convergence of two stage preconditioned eigenvalue iterations was discovered by Il'in [29]. An analogous idea of using Lanczos' method in the inner stage is due to Scott et al. [41, 50]. Explicit convergence rate estimates, independent of the number of unknowns, for these methods have been established in [31, 32]. The convergence estimates for the two stage method are better than those for the base method when high accuracy is required. The locally optimal preconditioned conjugate gradient method was suggested in [33]. In [33, 35], a new preconditioned variant especially suited for the domain decomposition approach was presented and the corresponding convergence rate estimates were proved. Davidson proposed a diagonally preconditioned subspace method [18] where the dimension of the subspace increases at each step of the iteration. Although the original Davidson method is likely to converge fast with a good preconditioner, the cost per iteration and storage requirements grow with the iteration number. This makes the method unacceptable in many large scale applications. Nevertheless, the method became popular in computational molecular physics and was further developed (see, e.g., [19, 36, 39, 40, 42, 48, 51]). Theoretically, Davidson-like methods are still not well understood; for example, it has not been proved that, with a proper preconditioner, their convergence is independent of the number of unknowns, though this seems to be the case. Generalizations of the preconditioned methods for the simultaneous computation of several leading eigenvalues and the corresponding invariant subspaces by using
subspace iterations were suggested in [14, 37, 38, 49]. The first and, in fact, the only explicit estimates on the convergence behavior, independent of the number of unknowns, were obtained by D'yakonov and Knyazev (DK) in [21-23], where simplified methods with the same iteration operator for all vectors in a subspace were developed and analyzed. This simplification, however, leads to a method which computes only one eigenvalue, the largest in the group, at a time. To find another, smaller, eigenvalue, the method has to be used again with an initial subspace of a smaller dimension and with an orthogonalization to the previously computed eigenvector. For the DK method, the convergence estimates do not depend on the separation of the eigenvalues. In the present paper we analyze a preconditioned subspace iteration technique. This involves a recursively generated sequence of subspaces. Given a subspace in this sequence, we compute the approximate eigenvectors by applying the Rayleigh-Ritz procedure. The next subspace in the sequence is defined as the linear space spanned by the vectors which result from the application of a simple preconditioned eigenvector iteration procedure to the Ritz eigenvectors. In contrast to the DK method, our iteration operator is different for different Ritz eigenvectors. On the other hand, our method differs from that of [37, 38] by using a fixed-shift basic algorithm, whereas [37, 38] suggest a block version of the steepest descent method [49]. Our method could also be thought of as a truncated version of a block Davidson method. We present a theorem which guarantees convergence of the preconditioned subspace method provided that the starting subspace is sufficiently accurate. As in the classical block inverse power method, the convergence rate estimate for the smaller eigenvalues and their corresponding eigenvectors is better than for eigenpairs whose eigenvalues are closer to $\lambda_{s+1}$. The rates only depend on the first $s+1$ eigenvalues, and are independent of the largest eigenvalue $\lambda_N$ and of the number of unknowns. This is crucial for our applications, where we seek the eigenvalues of a discrete second order elliptic operator. The largest eigenvalue of such an operator and the number of unknowns tend to infinity like $h^{-2}$, where $h$ is the mesh parameter. The only disadvantage of our theory is that the accuracy condition on the initial subspace depends on the gap in the first $s+1$ eigenvalues, in contrast to the theory of the classical block inverse power method and of the simplified preconditioned subspace DK method. Numerical experiments suggest that the actual convergence of our method is, in fact, independent of the gap and that its overall performance is much better than that of the DK method. We have chosen the DK method for numerical comparison as it was, before the present paper, the only preconditioned subspace iteration method with proven explicit convergence rate estimates, independent of the number of unknowns. The form of the algorithm proposed was motivated by a need for developing a parallelizable eigenvalue/eigenvector algorithm which would be effective in computing a number of the smallest eigenvalues and their corresponding eigenvectors for large symmetric systems. The scheme is currently being applied to the computation of several hundred eigenfunctions for a problem with several thousand unknowns which arises in first principle electronic structure computations [17]. The outline of the remainder of the paper is as follows. We describe the algorithm in section 2 and give a theorem which bounds its convergence. Section 3 contains
several useful estimates for the Rayleigh-Ritz method, partially based on results of [32, 34, 47]. The proof of the theorem is given in section 4. Finally, the results of numerical experiments involving the preconditioned subspace technique are given in section 5.
2. The subspace preconditioning algorithm
In this section, we describe the subspace preconditioning algorithm which will be studied in this paper. This algorithm involves the development of a sequence of subspaces $V_s^n \subset V$, for $n = 1, 2, \dots$, approximating the eigenspace
$$V_s = \mathrm{span}\{v_1, \dots, v_s\}.$$
The initial approximation subspace $V_s^0$ of dimension $s$ is assumed given. Given an approximation subspace $V_s^n$ of dimension $s$, the eigenvectors and eigenvalues of $A$ are approximated by computing the Ritz eigenvectors $\{v_i^n\} \subset V_s^n$ along with their corresponding eigenvalues $\lambda_i^n$ satisfying
$$(Av_i^n, w) = \lambda_i^n(v_i^n, w) \quad\text{for all } w \in V_s^n. \tag{2.1}$$
Here $\lambda_1^n \le \lambda_2^n \le \cdots \le \lambda_s^n$ and the $\{v_i^n\}$ are mutually orthogonal and normalized so that $(Av_i^n, v_i^n) = 1$, for $i = 1, \dots, s$. The new subspace $V_s^{n+1}$ is generated from a basis which is defined by applying a simple preconditioned eigenvalue iteration scheme to the Ritz vectors of the previous subspace $V_s^n$. Specifically, let
$$\tilde v_i^{n+1} = v_i^n - B(Av_i^n - \lambda_i^n v_i^n) \quad\text{for } i = 1, \dots, s, \tag{2.2}$$
and define
$$V_s^{n+1} = \mathrm{span}\{\tilde v_1^{n+1}, \dots, \tilde v_s^{n+1}\}.$$
The iteration (2.2) has been studied as a stand-alone iterative scheme for computing the smallest eigenvalue $\lambda_1$ and a vector in the corresponding eigenspace [24]. The above method could also be thought of as the simplest truncated version of a block Davidson method, which defines
$$V_s^{n+1} = V_s^n + \mathrm{span}\{\tilde v_1^{n+1}, \dots, \tilde v_s^{n+1}\}.$$
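One sweep of (2.1)-(2.2) can be sketched in a few lines of dense linear algebra. The sketch below is our own and only illustrative (real applications would apply $A$ and $B$ matrix-free); the Rayleigh-Ritz step is reduced to a standard symmetric eigenproblem through a Cholesky factorization of the Gram matrix of the basis.

```python
import numpy as np

def rayleigh_ritz(A, V):
    """Ritz pairs of A on span(V), cf. (2.1).  Returns the Ritz values in
    ascending order and Ritz vectors normalized so that (A v_i, v_i) = 1."""
    L = np.linalg.cholesky(V.T @ V)        # Gram-matrix factor
    Li = np.linalg.inv(L)
    lam, C = np.linalg.eigh(Li @ (V.T @ A @ V) @ Li.T)
    W = V @ (Li.T @ C)                     # orthonormal Ritz vectors
    return lam, W / np.sqrt(lam)           # rescale: (A w_i, w_i) = 1

def subspace_step(A, B, V):
    """One sweep of the preconditioned subspace iteration:
    Rayleigh-Ritz on span(V), then the update (2.2)."""
    lam, W = rayleigh_ritz(A, V)
    return lam, W - B @ (A @ W - W * lam)  # v_i - B(A v_i - lam_i v_i)

# model problem: the three smallest eigenvalues 1, 2, 3 of a diagonal A,
# preconditioned by the scaled identity B = 2/(lam_1+lam_N) * I
A = np.diag(np.arange(1.0, 21.0))
B = (2.0 / 21.0) * np.eye(20)
V = np.random.default_rng(0).standard_normal((20, 3))
for _ in range(600):
    lam, V = subspace_step(A, B, V)
assert np.allclose(lam, [1.0, 2.0, 3.0], atol=1e-3)
```

Note that, in line with remark 2.4 below, the sketch starts from a random subspace and still converges, even though such a start does not satisfy the accuracy condition of the theory.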
We shall need some additional notation to state our convergence estimates. Let $(\cdot,\cdot)_A$ denote the inner product defined by
$$(v, w)_A = (Av, w) \quad\text{for all } v, w \in V.$$
The corresponding norm will be denoted by $\|v\|_A = (v, v)_A^{1/2}$ for all $v \in V$.
Let $Q_s$ and $Q_s^n$ be the $(\cdot,\cdot)_A$-orthogonal projection operators onto $V_s$ and $V_s^n$, respectively. We measure the gap between the subspaces $V_s$ and $V_s^n$ by considering the norm of the difference of these projectors:
$$\theta^n = \|Q_s^n - Q_s\|_A. \tag{2.3}$$
We also define
$$\theta_i^n = \|v_i - Q_s^n v_i\|_A, \quad\text{for } i = 1, \dots, s,$$
and
$$\Delta = \max_{i=1,\dots,s}\frac{\lambda_{i+1} + \lambda_i}{\lambda_{i+1} - \lambda_i}.$$
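These quantities are easy to evaluate for small dense test problems. The following helpers (our own naming, written for matrices rather than abstract operators) compute the $(\cdot,\cdot)_A$-orthogonal projector onto a subspace, operator norms in $\|\cdot\|_A$, and the separation constant $\Delta$.

```python
import numpy as np

def a_projector(A, V):
    """Matrix of the (.,.)_A-orthogonal projector onto span(V):
    Q v = V (V^T A V)^{-1} V^T A v."""
    return V @ np.linalg.solve(V.T @ A @ V, V.T @ A)

def a_opnorm(A, M):
    """Operator norm of M in ||.||_A: the 2-norm of A^{1/2} M A^{-1/2}."""
    w, U = np.linalg.eigh(A)
    Ah = U @ np.diag(w**0.5) @ U.T
    return np.linalg.norm(Ah @ M @ np.linalg.inv(Ah), 2)

def separation(lam, s):
    """Delta = max_{i<=s} (lam_{i+1}+lam_i)/(lam_{i+1}-lam_i), lam ascending."""
    return max((lam[i + 1] + lam[i]) / (lam[i + 1] - lam[i]) for i in range(s))

# gap theta between V_s = span(e_1, e_2) and a perturbed subspace
A = np.diag([1.0, 2.0, 3.0, 4.0])
Vs = np.eye(4)[:, :2]
Vn = Vs + 0.1 * np.eye(4)[:, 2:]      # tilt toward e_3, e_4
theta = a_opnorm(A, a_projector(A, Vn) - a_projector(A, Vs))
assert 0.0 < theta < 1.0
assert abs(separation([1.0, 2.0, 3.0, 4.0], 2) - 5.0) < 1e-12
```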
The main result of this paper is given in the following theorem.

Theorem 2.1. Suppose that the initial approximation subspace $V_s^0$ satisfies
$$\sum_{i=1}^s(\theta_i^0)^2 \le \frac{(1-\gamma)^2(\lambda_1)^4(1 - \lambda_s/\lambda_{s+1})^4}{1999\,\Delta^2(\lambda_s)^4}, \tag{2.4}$$
where $\gamma$ satisfies (1.1). Then the dimension of $V_s^n$ for $n > 0$ is equal to $s$. Moreover, we have
$$\|(I - P_i)v_i^n\|_A^2 \le \frac{1.03}{1 - \lambda_i/\lambda_{s+1}}\,\bar\delta_i^{\,2n}(\theta_i^0)^2 \tag{2.5}$$
and
$$0 \le 1 - \lambda_i/\lambda_i^n \le \frac{1.03}{1 - \lambda_i/\lambda_{s+1}}\,\bar\delta_i^{\,2n}(\theta_i^0)^2, \tag{2.6}$$
where
$$\delta_i = \gamma + (1 - \gamma)\lambda_i/\lambda_{s+1} < 1 \tag{2.7}$$
and
$$\bar\delta_i = \delta_i + \frac{1 - \delta_s}{2}\left(\frac{\lambda_{s+1} - \lambda_s}{\lambda_{s+1}}\right)^{1/2}\frac{\lambda_i}{\lambda_s} < 1. \tag{2.8}$$
In (2.5), $P_i$ denotes the orthogonal projector onto the one-dimensional eigenspace spanned by $v_i$.
Remark 2.1. Note that $\bar\delta_i$ is less than or equal to $\delta_i + (1 - \delta_i)/2$, which is clearly less than one. In addition, the convergence rate estimate $\bar\delta_i$ is closer to $\delta_i$ for the smaller eigenvalues, since $\lambda_i$ is smaller than $\lambda_s$ in that case.

Remark 2.2. Note that $\|v_i^n\|_A = 1$ and hence (2.5) implies that the eigenvector $v_i^n$ converges to the eigenvector $v_i$ exponentially with $n$.
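The asymptotic factor (2.7) is easy to tabulate. A small sketch of our own, with the illustrative choices $\gamma = 1/2$, $\lambda = (1, 2, 3, 4)$ and $s = 3$, shows the ordering of the rates asserted in remark 2.1:

```python
def delta(gamma, lam, i, s):
    """Convergence factor (2.7): delta_i = gamma + (1-gamma) lam_i/lam_{s+1}.
    Indices i, s are 1-based as in the paper; Python lists are shifted."""
    return gamma + (1.0 - gamma) * lam[i - 1] / lam[s]

lam = [1.0, 2.0, 3.0, 4.0]
rates = [delta(0.5, lam, i, 3) for i in (1, 2, 3)]
# smaller eigenvalues are found at faster rates: delta_1 < delta_2 < delta_3 < 1
assert rates == sorted(rates) and rates[-1] < 1.0
assert abs(rates[0] - 0.625) < 1e-12
```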
Remark 2.3. The above theorem can be applied to the problem of approximating relatively few of the lowest eigenvalues and their corresponding eigenvectors of an elliptic boundary value problem defined on a bounded domain, in the case when the corresponding lower eigenvalues are simple. It is well known that these eigenvalues are distinct. Moreover, these eigenvalues and eigenvectors can be approximated by those which result from a finite element discretization. The eigenvalues of the discrete system exhibit behavior similar to those of the continuous problem. They are well separated provided that the mesh parameter $h$ is suitably small, since the discrete eigenvalues converge to those of the continuous problem. If $A = A_h$ is the system corresponding to the discretization with mesh size $h$, then the parameters $\lambda_s$, $\lambda_1$, $\Delta$ can be bounded independently of the mesh parameter $h$. If one uses a preconditioner $B = B_h$ which satisfies (1.1) with $\gamma < 1$ (not depending on $h$), then the above theorem provides asymptotic rates of convergence which do not depend on $h$ and hence on the dimension of the underlying system.
Remark 2.4. With a slightly more complicated proof, it is possible to reduce the constant 1999 to 500 in the above theorem. Of course, even the latter constant is still too large to convince someone theoretically that our method is useful. However, the method appears to be much more robust in practice. In our experience, the method converges with initial subspaces consisting of vectors with random entries (see section 5). Such an initial subspace fails to satisfy the accuracy condition (2.4).

The following corollary shows that the rates of convergence for a given eigenvalue/eigenvector pair tend asymptotically to $\delta_i$ as $V_s^n$ converges to $V_s$.

Corollary 2.1. Assume that the hypothesis of theorem 2.1 holds. Then for any $\epsilon \in (0, 1]$ there exists an $m = m(\epsilon)$ such that for $n \ge m$,
$$\|(I - P_i)v_i^n\|_A^2 \le \frac{1.03}{1 - \lambda_i/\lambda_{s+1}}\left(\delta_i + \epsilon(\bar\delta_i - \delta_i)\right)^{2n-2m}\bar\delta_i^{\,2m}(\theta_i^0)^2$$
and
$$0 \le 1 - \lambda_i/\lambda_i^n \le \frac{1.03}{1 - \lambda_i/\lambda_{s+1}}\left(\delta_i + \epsilon(\bar\delta_i - \delta_i)\right)^{2n-2m}\bar\delta_i^{\,2m}(\theta_i^0)^2,$$
where $\delta_i$ and $\bar\delta_i$ are as in (2.7) and (2.8).
3. Properties of the Ritz approximation
In this section, we give some lemmas which describe the approximation properties of the eigenvalues and eigenvectors resulting from the Ritz subspace method. These lemmas only depend on the distribution of the eigenvalues of $A$ and the approximation properties of the subspace. Thus, we shall state them in terms of an arbitrary approximation subspace $\tilde V \subset V$ of dimension less than or equal to $s$.
The Ritz eigenvectors $\{\tilde v_i\} \subset \tilde V$ along with their corresponding eigenvalues $\tilde\lambda_i$ satisfy
$$(A\tilde v_i, w) = \tilde\lambda_i(\tilde v_i, w) \quad\text{for all } w \in \tilde V.$$
Here $\tilde\lambda_1 \le \tilde\lambda_2 \le \cdots$ and the vectors in $\{\tilde v_i\}$ are mutually orthogonal and normalized so that $(A\tilde v_i, \tilde v_i) = 1$ for each $i$. Let $\tilde Q$ be the $(\cdot,\cdot)_A$-orthogonal projection operator onto $\tilde V$ and set
$$\tilde\theta = \|\tilde Q - Q_s\|_A.$$
It will also be important to measure how well the actual eigenvectors can be approximated in the subspace $\tilde V$. To this end, we define the quantities
$$\tilde\theta_i = \|v_i - \tilde Q v_i\|_A, \quad\text{for } i = 1, \dots, s.$$
The eigenvalue accuracy can be estimated in terms of the above parameters by the following lemma.

Lemma 3.1. If
$$\tilde\theta < 1, \tag{3.1}$$
then the dimension of $\tilde V$ is $s$ and, for $i = 1, \dots, s$,
$$0 \le 1 - \lambda_i/\tilde\lambda_i \le \tilde\theta^2.$$

The proof of the above lemma follows [32] and uses two additional lemmas. The first is essentially given in [30, theorem 6.34].

Lemma 3.2. Let $P$ and $Q$ be two orthogonal projectors with $M = \mathrm{Range}(P)$, $N = \mathrm{Range}(Q)$ such that $\dim(N) \le \dim(M)$ and $\|(I - Q)P\| = \tilde\theta < 1$. Then $\dim(N) = \dim(M)$ and
$$\|P - Q\| = \|(I - Q)P\| = \|(I - P)Q\| = \tilde\theta.$$
Lemma 3.3. Let $V_i$ be the space spanned by the eigenvectors $v_j$, for $j = 1, \dots, i$. We define the additional operators $Q_i$ and $\tilde Q_i$ to be the $(\cdot,\cdot)_A$-orthogonal projectors onto the spaces $V_i$ and $\tilde V_i = \mathrm{Range}(\tilde Q V_i)$. Assume that $\tilde\theta < 1$. Then for $i \le s$, the maps
$$\tilde Q : V_i \to \tilde V_i$$
are isomorphisms. Moreover,
$$\|Q_i - \tilde Q_i\|_A = \|(I - \tilde Q_i)Q_i\|_A \le \tilde\theta. \tag{3.2}$$
Proof. By definition of $\tilde Q_i$,
$$\tilde Q_i v = \tilde Q v \quad\text{for all } v \in V_i.$$
Thus,
$$\|(I - \tilde Q_i)Q_i\|_A = \max_{v \in V_i,\ \|v\|_A = 1}\|v - \tilde Q_i v\|_A = \max_{v \in V_i,\ \|v\|_A = 1}\|v - \tilde Q v\|_A \le \max_{v \in V_s,\ \|v\|_A = 1}\|v - \tilde Q v\|_A \le \tilde\theta.$$
This proves the inequality in (3.2). The lemma now follows from lemma 3.2. □
Proof of lemma 3.1. Let $i$ be less than or equal to $s$ and let $v^* \in \tilde V_i$ be the eigenvector with maximal eigenvalue $\lambda^*$ for the following eigenproblem:
$$(Av, w) = \lambda(v, w) \quad\text{for all } w \in \tilde V_i.$$
By lemma 3.3, the dimension of $\tilde V_i$ is equal to $i$. By the Courant-Fischer theorem,
$$\lambda_i \le \tilde\lambda_i \le \lambda^*, \tag{3.3}$$
from which the lower bound of the lemma follows. For the upper bound, it suffices to bound the quantity $1 - \lambda_i/\lambda^*$. We may assume that $\|v^*\|_A = 1$. We decompose
$$v^* = Q_i v^* + Q_i^\perp v^*.$$
Let $Rq(w) = (Aw, w)/\|w\|^2$ denote the Rayleigh quotient. Then
$$1 - \lambda_i/\lambda^* = (\lambda^* - \lambda_i)\|v^*\|^2.$$
Now it follows from (3.2) that $Q_i v^* \ne 0$. Clearly,
$$1 = Rq(Q_i v^*)\|Q_i v^*\|^2 + \|Q_i^\perp v^*\|_A^2. \tag{3.4}$$
If $Q_i^\perp v^* = 0$ then $v^* \in V_i$. In this case, it follows that $\lambda^* = \lambda_i$ and there is nothing more to prove. If $Q_i^\perp v^* \ne 0$ then (3.4) implies
$$1 \le \lambda_i\|Q_i v^*\|^2 + Rq(Q_i^\perp v^*)\|Q_i^\perp v^*\|^2 = \lambda_i\|v^*\|^2 + \left(Rq(Q_i^\perp v^*) - \lambda_i\right)\|Q_i^\perp v^*\|^2.$$
Consequently,
$$1 - \lambda_i/\lambda^* \le \left(Rq(Q_i^\perp v^*) - \lambda_i\right)\|Q_i^\perp v^*\|^2 = \left[1 - \lambda_i/Rq(Q_i^\perp v^*)\right]\|Q_i^\perp v^*\|_A^2 \le \|Q_i^\perp v^*\|_A^2 \le \|(I - Q_i)\tilde Q_i\|_A^2 \le \tilde\theta^2,$$
where we used lemma 3.3 for the last inequality. This completes the proof of lemma 3.1. □
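Lemma 3.1 is easy to test numerically: perturb the exact invariant subspace, compute the Ritz values, and compare with $\tilde\theta^2$. The sketch below is ours, for a small diagonal example, using (3.6) to evaluate $\tilde\theta$.

```python
import numpy as np

rng = np.random.default_rng(1)
lam = np.array([1.0, 2.0, 3.0, 10.0, 20.0])
A = np.diag(lam)
s = 3
# perturb the exact eigenspace span(e_1, e_2, e_3)
V = np.eye(5)[:, :s] + 0.05 * rng.standard_normal((5, s))

# Ritz values on span(V) via the Gram-matrix Cholesky factor
Li = np.linalg.inv(np.linalg.cholesky(V.T @ V))
ritz = np.linalg.eigvalsh(Li @ (V.T @ A @ V) @ Li.T)

# theta = ||(I - Qtilde) Q_s||_A, computed by an A^{1/2}-similarity
Qs = np.diag([1.0, 1.0, 1.0, 0.0, 0.0])   # A-orthogonal projector onto V_s
Qt = V @ np.linalg.solve(V.T @ A @ V, V.T @ A)
Ah = np.diag(lam**0.5)
theta = np.linalg.norm(Ah @ (np.eye(5) - Qt) @ Qs @ np.linalg.inv(Ah), 2)
assert theta < 1.0

# lemma 3.1: 0 <= 1 - lam_i/ritz_i <= theta^2
err = 1.0 - lam[:s] / ritz
assert np.all(err >= -1e-12) and np.all(err <= theta**2 + 1e-12)
```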
The next lemma provides a relationship between $\tilde\theta$ and the $\tilde\theta_i$.
Lemma 3.4. We have
$$\|(I - \tilde Q)Q_s\|_A \le \left(\sum_{j=1}^s\tilde\theta_j^2\right)^{1/2}.$$
Moreover, if
$$\|(I - \tilde Q)Q_s\|_A < 1, \tag{3.5}$$
then $\dim(\tilde V) = s$ and
$$\tilde\theta = \|\tilde Q - Q_s\|_A = \|(I - \tilde Q)Q_s\|_A = \|(I - Q_s)\tilde Q\|_A. \tag{3.6}$$

Proof. Let $v$ be in $V_s$ and write
$$v = \sum_{j=1}^s c_j v_j.$$
Clearly,
$$\|(I - \tilde Q)v\|_A = \left\|\sum_{j=1}^s c_j(I - \tilde Q)v_j\right\|_A \le \sum_{j=1}^s|c_j|\,\|(I - \tilde Q)v_j\|_A \le \left(\sum_{j=1}^s c_j^2\right)^{1/2}\left(\sum_{j=1}^s\tilde\theta_j^2\right)^{1/2}.$$
Since the $v_j$ are $(\cdot,\cdot)_A$-orthonormal, $\|v\|_A^2 = \sum_{j=1}^s c_j^2$, and the first estimate follows. The rest of the lemma follows from lemma 3.2. □

Lemma 3.1 implies that for sufficiently small $\tilde\theta$, $\tilde\lambda_j \ne \lambda_i$ whenever $i, j \in \{1, \dots, s\}$ and $i \ne j$. Thus, we can define the quantities
$$\tilde\sigma_{ij} = \frac{\lambda_i}{|1 - \lambda_i/\tilde\lambda_j|} \quad\text{for } i \ne j,\ i, j = 1, \dots, s.$$
Let
$$\tilde\sigma_i = \max_{j=1,\dots,s,\ j \ne i}(\tilde\sigma_{ij}). \tag{3.7}$$
The next lemma provides bounds for the eigenvector error in terms of the approximation parameter $\tilde\theta_i$. Its proof is essentially based on the analysis given in [47] and is included for completeness.
Lemma 3.5. Assume that (3.5) holds and that, for a fixed index $i \in [1, \dots, s]$, $\tilde\lambda_j \ne \lambda_i$ for $j \ne i$ and $j \in [1, \dots, s]$. Then,
$$\|(I - P_i)\tilde v_i\|_A^2 \le \left[1 + \frac{\tilde\sigma_i^2\tilde\theta^2}{(\lambda_1)^2}\right]\tilde\theta_i^2 \equiv C_1\tilde\theta_i^2.$$
Here $P_i$ is the projection onto the $i$th eigenvector $v_i$.
Proof. Let $\tilde P_j$ denote the $(\cdot,\cdot)_A$-orthogonal projector onto the subspace generated by $\tilde v_j$. Note that
$$\|(I - P_i)\tilde v_i\|_A = \|(I - P_i)\tilde P_i\|_A = \|(I - \tilde P_i)P_i\|_A = \|(I - \tilde P_i)v_i\|_A,$$
and that
$$\|(I - \tilde P_i)v_i\|_A^2 = \|(I - \tilde Q)v_i\|_A^2 + \|(\tilde Q - \tilde P_i)v_i\|_A^2 = \tilde\theta_i^2 + \|(\tilde Q - \tilde P_i)v_i\|_A^2. \tag{3.8}$$
We estimate the last term above as follows. Note that
$$\tilde Q = \sum_{j=1}^s\tilde P_j \quad\text{and}\quad \tilde Q A^{-1}\tilde Q = \sum_{j=1}^s\frac{\tilde P_j}{\tilde\lambda_j}.$$
Thus,
$$\left(\frac{1}{\lambda_i} - \frac{1}{\tilde\lambda_i}\right)\tilde P_i v_i + \sum_{j=1,\ j \ne i}^s\left(\frac{1}{\lambda_i} - \frac{1}{\tilde\lambda_j}\right)\tilde P_j v_i = \tilde Q A^{-1}(I - \tilde Q)v_i. \tag{3.9}$$
The two terms on the left of (3.9) are $(\cdot,\cdot)_A$-orthogonal and hence
$$\left\|\sum_{j=1,\ j \ne i}^s\left(\frac{1}{\lambda_i} - \frac{1}{\tilde\lambda_j}\right)\tilde P_j v_i\right\|_A \le \|\tilde Q A^{-1}(I - \tilde Q)\|_A\,\|(I - \tilde Q)v_i\|_A. \tag{3.10}$$
In the last product, the second term is $\tilde\theta_i$ by definition, and it remains now to estimate the first term. Since the projector $Q_s$ commutes with $A^{-1}$, lemma 3.4 and the triangle inequality give
$$\|\tilde Q A^{-1}(I - \tilde Q)\|_A = \left\|\tilde Q\left(A^{-1} - \frac{1}{2\lambda_1}I\right)(I - Q_s)(I - \tilde Q) + \tilde Q\left(A^{-1} - \frac{1}{2\lambda_1}I\right)Q_s(I - \tilde Q)\right\|_A \le \frac{\tilde\theta}{\lambda_1}. \tag{3.11}$$
Combining (3.10) and (3.11) gives
$$\|(\tilde Q - \tilde P_i)v_i\|_A^2 \le \frac{\tilde\sigma_i^2\tilde\theta^2}{\lambda_1^2}\tilde\theta_i^2. \tag{3.12}$$
The lemma follows combining (3.8) and (3.12). □
The error in the corresponding eigenvalue is second order in $\tilde\theta_i$ and is bounded by the following lemma.

Lemma 3.6. Assume that the hypotheses of lemma 3.5 hold. Then,
$$0 \le 1 - \lambda_i/\tilde\lambda_i \le C_1\tilde\theta_i^2.$$

Proof. Using the identity $\tilde\lambda_i = \|\tilde v_i\|^{-2}$ gives
$$1 - \lambda_i/\tilde\lambda_i = ((A - \lambda_i)\tilde v_i, \tilde v_i).$$
We clearly have
$$((A - \lambda_i)\tilde v_i, \tilde v_i) = (A(I - P_i)\tilde v_i, (I - P_i)\tilde v_i) - \lambda_i((I - P_i)\tilde v_i, (I - P_i)\tilde v_i) \le \|(I - P_i)\tilde v_i\|_A^2,$$
and the result follows from lemma 3.5. This completes the proof of lemma 3.6. □
Both the eigenvectors $\{v_i\}$ and the corresponding approximate eigenvectors $\{\tilde v_i\}$ are orthonormal with respect to the $(\cdot,\cdot)_A$ inner product. The following lemma, based on [34], shows that for $i \ne j$, $|(v_i, \tilde v_j)_A|$ is small with $\tilde\theta$, since
$$|(v_i, \tilde v_j)_A| = |((Q_s - P_j)v_i, \tilde v_j)_A| \le \|(Q_s - P_j)\tilde v_j\|_A.$$

Lemma 3.7. Assume that (3.5) holds and that, for a fixed index $j \in [1, \dots, s]$, $\tilde\lambda_j \ne \lambda_i$ for $i \ne j$ and $i \in [1, \dots, s]$. Then,
$$\|(Q_s - P_j)\tilde v_j\|_A^2 \le C_2\tilde\theta^2\tilde\theta_j^2,$$
where
$$C_2 = \frac{(\tilde\sigma^*)^2 C_1}{(\lambda_1)^2} \quad\text{and}\quad \tilde\sigma^* = \max_{i=1,\dots,s,\ i \ne j}(\tilde\sigma_{ij}).$$
Proof. We clearly have
$$\frac{1}{\tilde\sigma^*}\|(Q_s - P_j)\tilde v_j\|_A \le \left\|(Q_s - P_j)\left(A^{-1} - \tilde\lambda_j^{-1}I\right)\tilde v_j\right\|_A.$$
Since
$$\left(\left(A^{-1} - \tilde\lambda_j^{-1}I\right)\tilde v_j, w\right)_A = 0 \quad\text{for all } w \in \tilde V, \tag{3.13}$$
it follows that
$$\left\|(Q_s - P_j)\left(A^{-1} - \tilde\lambda_j^{-1}I\right)\tilde v_j\right\|_A = \left\|(Q_s - P_j)(I - \tilde Q)\left(A^{-1} - \tilde\lambda_j^{-1}I\right)\tilde v_j\right\|_A \le \|(Q_s - P_j)(I - \tilde Q)\|_A\left\|\left(A^{-1} - \tilde\lambda_j^{-1}I\right)\tilde v_j\right\|_A.$$
Now by lemma 3.4,
$$\|(Q_s - P_j)(I - \tilde Q)\|_A \le \|Q_s(I - \tilde Q)\|_A = \|(I - \tilde Q)Q_s\|_A = \tilde\theta.$$
In addition, (3.13) implies that
$$\left\|\left(A^{-1} - \tilde\lambda_j^{-1}I\right)\tilde v_j\right\|_A \le \left\|\left(A^{-1} - \lambda_j^{-1}I\right)\tilde v_j\right\|_A = \left\|\left(A^{-1} - \lambda_j^{-1}I\right)(I - P_j)\tilde v_j\right\|_A \le \left\|A^{-1} - \lambda_j^{-1}I\right\|_A\|(I - P_j)\tilde v_j\|_A \le \frac{1}{\lambda_1}\|(I - P_j)\tilde v_j\|_A.$$
The lemma follows combining the above estimates with lemma 3.5. □
The final lemma of this section shows that the approximation parameter $\tilde\theta_i$ can be bounded in terms of the orthogonal component of the approximate eigenvector.

Lemma 3.8. If $C_2\tilde\theta^2 < 1$ then
$$\tilde\theta_i^2 \le (1 - C_2\tilde\theta^2)^{-1}\|Q_s^\perp\tilde v_i\|_A^2.$$

Proof. It follows that
$$\tilde\theta_i \le \|(I - \tilde P_i)v_i\|_A = \|(I - P_i)\tilde v_i\|_A.$$
Thus, since
$$\|(I - P_i)\tilde v_i\|_A^2 = \|Q_s^\perp\tilde v_i\|_A^2 + \|(Q_s - P_i)\tilde v_i\|_A^2,$$
applying lemma 3.7 gives
$$\tilde\theta_i^2 \le \|Q_s^\perp\tilde v_i\|_A^2 + C_2\tilde\theta^2\tilde\theta_i^2.$$
The lemma immediately follows. □
4. Convergence analysis
In this section, we will prove the theorem and corollary stated in section 2. Their proofs are based on three additional lemmas as well as the lemmas for the Ritz eigenpairs. We shall require some additional notation for this section. If $\theta^n$ is sufficiently small, lemma 3.1 implies that $\lambda_j^n \ne \lambda_i$ whenever $i, j \in \{1, \dots, s\}$ and $i \ne j$. Following section 3, we define the quantities
$$\sigma_{ij}(n) = \frac{\lambda_i}{|1 - \lambda_i/\lambda_j^n|} \quad\text{for } i \ne j,\ i, j = 1, \dots, s,$$
and set
$$\sigma_i(n) = \max_{j=1,\dots,s,\ j \ne i}(\sigma_{ij}(n)) \quad\text{and}\quad \sigma_j^*(n) = \max_{i=1,\dots,s,\ i \ne j}(\sigma_{ij}(n)).$$
Let $V_s^\perp$ denote the $A$-orthogonal complement of $V_s$. For each $i$ in $[1, 2, \dots, s]$, let $[\cdot,\cdot]_i$ denote the inner product
$$[v, w]_i = ((A - \lambda_i I)v, w) \quad\text{for all } v, w \in V_s^\perp.$$
The corresponding norm (on $V_s^\perp$) will be denoted by $|||v|||_i = [v, v]_i^{1/2}$. It is equivalent to the original norm:
$$|||v|||_i \le \|v\|_A \le \left(\frac{\lambda_{s+1}}{\lambda_{s+1} - \lambda_i}\right)^{1/2}|||v|||_i, \quad v \in V_s^\perp. \tag{4.1}$$
We start by considering the effect of a related iteration operator on the complement of $V_s$. For each $i$ in $[1, 2, \dots, s]$, let
$$E_i = I - Q_s^\perp B(A - \lambda_i I). \tag{4.2}$$
Note that $E_i$ is symmetric on $V_s^\perp$ with respect to the inner product $[\cdot,\cdot]_i$. Moreover, it is a reducer, as guaranteed by the following lemma.

Lemma 4.1. For all $w$ in $V_s^\perp$,
$$|||E_i w|||_i \le \delta_i|||w|||_i, \tag{4.3}$$
where $\delta_i$ is given by (2.7).
Proof. From (1.2) it follows that
$$[1 - \gamma](A^{-1}w, w) \le (Bw, w) \le [1 + \gamma](A^{-1}w, w) \quad\text{for all } w \in V.$$
For $u \in V_s^\perp$,
$$[1 - \lambda_i/\lambda_{s+1}]((A - \lambda_i)u, u) \le \left(A^{-1}(A - \lambda_i)u, (A - \lambda_i)u\right) = \left((I - \lambda_i A^{-1})u, (A - \lambda_i)u\right) \le ((A - \lambda_i)u, u).$$
Combining the above inequalities gives
$$[1 - \gamma][1 - \lambda_i/\lambda_{s+1}]((A - \lambda_i)u, u) \le (B(A - \lambda_i)u, (A - \lambda_i)u) \le [1 + \gamma]((A - \lambda_i)u, u) \quad\text{for all } u \in V_s^\perp.$$
This implies that
$$\left|\left[(I - Q_s^\perp B(A - \lambda_i I))u, u\right]_i\right| \le \delta_i|||u|||_i^2 \quad\text{for all } u \in V_s^\perp.$$
Inequality (4.3) follows from the symmetry of $I - Q_s^\perp B(A - \lambda_i I)$ with respect to the $[\cdot,\cdot]_i$ inner product. □

The next lemma shows that $\theta^{n+1}$ can be controlled in terms of $\theta^n$.
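As a sanity check, lemma 4.1 can be verified directly for a diagonal model problem, where everything is explicit. In the sketch below (our own construction) $A = \mathrm{diag}(\lambda)$ and $B = tI$ with $t = 2/(\lambda_1 + \lambda_N)$, so that $\gamma = (\lambda_N - \lambda_1)/(\lambda_N + \lambda_1)$ in (1.1); on the invariant complement $V_s^\perp$, $E_i$ scales the eigenvector $v_j$ ($j > s$) by $1 - t(\lambda_j - \lambda_i)$, and since $|||\cdot|||_i$ is a weighted Euclidean norm there, the contraction factor of $E_i$ is the largest of these factors in absolute value.

```python
# A = diag(lam), B = t*I with t = 2/(lam_1 + lam_N), hence
# gamma = (lam_N - lam_1)/(lam_N + lam_1).
lam = [1.0, 2.0, 3.0, 4.0, 50.0, 100.0]
s, N = 3, len(lam)
t = 2.0 / (lam[0] + lam[-1])
gamma = max(abs(1.0 - t * l) for l in lam)
for i in range(s):
    delta_i = gamma + (1.0 - gamma) * lam[i] / lam[s]    # (2.7)
    # contraction of E_i on V_s^perp: max_j>s |1 - t (lam_j - lam_i)|
    factor = max(abs(1.0 - t * (lam[j] - lam[i])) for j in range(s, N))
    assert factor <= delta_i                             # lemma 4.1, (4.3)
```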
Lemma 4.2. Assume that
$$\theta^n < \frac{\lambda_1}{(1 + \gamma)\lambda_s^n}.$$
Define
$$C_0 = \left[1 + \frac{\lambda_s^n(1 + \gamma)}{\lambda_1}\right]^2.$$
Then $\dim V_s^{n+1} = s$ and
$$(\theta^{n+1})^2 \le C_0(\theta^n)^2. \tag{4.4}$$
Proof. By the triangle inequality,
$$\theta^{n+1} = \|Q_s^{n+1} - Q_s\|_A \le \|Q_s^{n+1} - Q_s^n\|_A + \|Q_s^n - Q_s\|_A = \|Q_s^{n+1} - Q_s^n\|_A + \theta^n. \tag{4.5}$$
By construction, $\dim V_s^{n+1} \le \dim V_s^n$. We now estimate
$$\|(I - Q_s^{n+1})Q_s^n\|_A = \max_{v^n \in V_s^n,\ \|v^n\|_A = 1}\|(I - Q_s^{n+1})v^n\|_A.$$
Let $v^n$ be an arbitrary element of $V_s^n$ with $\|v^n\|_A = 1$, $v^n = \sum_{j=1}^s\alpha_j v_j^n$, and consider $w = \sum_{j=1}^s\alpha_j\tilde v_j^{n+1}$. We clearly have
$$\|(I - Q_s^{n+1})v^n\|_A \le \|v^n - w\|_A = \left\|\sum_{j=1}^s\alpha_j(v_j^n - \tilde v_j^{n+1})\right\|_A.$$
By definition,
$$\sum_{j=1}^s\alpha_j(v_j^n - \tilde v_j^{n+1}) = B\sum_{j=1}^s\alpha_j(Av_j^n - \lambda_j^n v_j^n).$$
It follows by (1.2) and the identity
$$\left(\left(A^{-1} - (\lambda_j^n)^{-1}I\right)v_j^n, w\right)_A = 0 \quad\text{for all } w \in V_s^n$$
that
$$\left\|\sum_{j=1}^s\alpha_j(v_j^n - \tilde v_j^{n+1})\right\|_A \le (1 + \gamma)\left\|\sum_{j=1}^s\alpha_j\left(v_j^n - \lambda_j^n A^{-1}v_j^n\right)\right\|_A = (1 + \gamma)\left\|(I - Q_s^n)A^{-1}\sum_{j=1}^s\lambda_j^n\alpha_j v_j^n\right\|_A.$$
Applying (3.11) gives
$$\left\|(I - Q_s^n)A^{-1}\sum_{j=1}^s\lambda_j^n\alpha_j v_j^n\right\|_A \le \lambda_s^n\|(I - Q_s^n)A^{-1}Q_s^n\|_A \le \frac{\lambda_s^n}{\lambda_1}\theta^n,$$
so that
$$\|(I - Q_s^{n+1})Q_s^n\|_A \le (1 + \gamma)\frac{\lambda_s^n}{\lambda_1}\theta^n.$$
Finally, by the assumptions of the lemma, the last value is less than one; therefore, by lemma 3.2, we have $\dim V_s^{n+1} = s$ and
$$\|Q_s^{n+1} - Q_s^n\|_A = \|(I - Q_s^{n+1})Q_s^n\|_A \le (1 + \gamma)\frac{\lambda_s^n}{\lambda_1}\theta^n.$$
The lemma follows combining (4.5) and the last estimate. □
We will need the following technical lemma.

Lemma 4.3. Let $\Delta$ be as in section 2 and suppose that
$$(\theta^n)^2 \le \frac{\alpha}{\Delta}$$
for some positive parameter $\alpha \le 1$. Then lemmas 3.5-3.8 apply with $\tilde V = V_s^n$. Moreover, the quantities $\sigma_{ij}(n)$ are well defined, and
$$\lambda_i^n \le \lambda_i(1 - \alpha)^{-1}, \tag{4.6}$$
$$\lambda_1 \le \sigma_i(n) \le \frac{\lambda_s\Delta}{2 - \alpha}, \tag{4.7}$$
$$\lambda_1 \le \sigma_j^*(n) \le \frac{\lambda_s\Delta}{2 - \alpha}. \tag{4.8}$$
Proof. Lemma 3.1 shows that for $i = 1, \dots, s$,
$$1 - \lambda_i/\lambda_i^n \le \alpha\,\frac{\lambda_{i+1} - \lambda_i}{\lambda_{i+1} + \lambda_i}.$$
A simple manipulation gives
$$\frac{\lambda_i}{\lambda_i^n} \ge 1 - \alpha\,\frac{\lambda_{i+1} - \lambda_i}{\lambda_{i+1} + \lambda_i}. \tag{4.9}$$
Inequality (4.6) immediately follows. The left inequalities in (4.7) and (4.8) are based on the simple estimate $\lambda_1 \le \sigma_{ij}(n)$ for $j > i$. We next prove the right inequalities. Suppose that $j$ is greater than $i$. Then $\lambda_i \le \lambda_{j-1} < \lambda_j \le \lambda_j^n$ and hence
$$\frac{\lambda_i}{|1 - \lambda_i/\lambda_j^n|} \le \frac{\lambda_{j-1}}{1 - \lambda_{j-1}/\lambda_j} = \frac{\lambda_{j-1}\lambda_j}{\lambda_j - \lambda_{j-1}} \le \frac{\lambda_j(\lambda_{j-1} + \lambda_j)}{2(\lambda_j - \lambda_{j-1})}. \tag{4.10}$$
If $j < i \le s$ then $\lambda_i \ge \lambda_{j+1} > \lambda_j^n$ and (4.9) imply
$$\frac{\lambda_i}{|1 - \lambda_i/\lambda_j^n|} \le \frac{\lambda_{j+1}}{\lambda_{j+1}/\lambda_j^n - 1} \le \frac{\lambda_j\lambda_{j+1}}{\lambda_{j+1}\left(1 - \alpha(\lambda_{j+1} - \lambda_j)/(\lambda_{j+1} + \lambda_j)\right) - \lambda_j} \le \frac{\lambda_{j+1}(\lambda_j + \lambda_{j+1})}{(2 - \alpha)(\lambda_{j+1} - \lambda_j)}. \tag{4.11}$$
The inequality (4.7) follows from (4.10) and (4.11). We have also shown (4.8). □
The next lemma provides a perturbation estimate which will be used to develop bounds for convergence of the preconditioned subspace method.

Lemma 4.4. Assume that $\theta^n \le \Delta^{-1/2}$, $\theta^{n+1} \le \Delta^{-1/2}$ and define
$$C_1 = 1 + \frac{\sigma_i^2(n)(\theta^n)^2}{(\lambda_1)^2}, \qquad C_2 = \frac{(\sigma_i^*(n))^2 C_1}{(\lambda_1)^2}, \qquad C_3 = \left[(1 - C_2(\theta^n)^2)(1 - \lambda_i/\lambda_{s+1})\right]^{-1},$$
$$C_4 = 1 + \frac{\sigma_i^2(n+1)(\theta^{n+1})^2}{(\lambda_1)^2}, \qquad C_5 = \frac{\sqrt{C_3 C_1}}{\sqrt{1 - C_1(\theta^n)^2}}\left(1 + \frac{(1 + \gamma)\lambda_s^n}{\lambda_1}\right)\left(1 + \sqrt{C_4}\right),$$
$$C_6 = \frac{(1 + \gamma)\sqrt{C_3}}{\sqrt{1 - C_1(\theta^n)^2}}\,\frac{\lambda_s^n C_1 + \lambda_i\sqrt{C_2}}{\lambda_1}.$$
If $C_2(\theta^n)^2 < 1$ then
$$|||Q_s^\perp v_i^{n+1}|||_i \le \tilde\delta_i|||Q_s^\perp v_i^n|||_i, \tag{4.12}$$
where
$$\tilde\delta_i = (1 - C_1(\theta^n)^2)^{-1/2}\delta_i + \theta^{n+1}C_5 + \theta^n C_6.$$
Proof. Let us first notice that lemma 4.3 guarantees that lemmas 3.5-3.8 can be applied with $\tilde V = V_s^n$ and with $\tilde V = V_s^{n+1}$; the quantities $\sigma_i(n)$, $\sigma_i^*(n)$, and $\sigma_i(n+1)$ are well defined, and the assumption $C_2(\theta^n)^2 < 1$ of the lemma assures $C_1(\theta^n)^2 < 1$ because $C_2 \ge C_1$ by (4.8). Therefore, all constants above are well defined. We start the proof with two technical estimates. By (1.2), it follows that
$$\left\|B(Av_i^n - \lambda_i^n v_i^n)\right\|_A \le (1 + \gamma)\left\|v_i^n - \lambda_i^n A^{-1}v_i^n\right\|_A.$$
For $i = 1, \dots, s$,
$$\left(v_i^n - \lambda_i^n A^{-1}v_i^n, v_i^n\right)_A = 0,$$
and hence
$$\left\|v_i^n - \lambda_i^n A^{-1}v_i^n\right\|_A \le \frac{\lambda_i^n}{\lambda_i}\left\|v_i^n - \lambda_i A^{-1}v_i^n\right\|_A = \frac{\lambda_i^n}{\lambda_i}\left\|(I - \lambda_i A^{-1})(I - P_i)v_i^n\right\|_A \le \frac{\lambda_i^n}{\lambda_i}\left\|I - \lambda_i A^{-1}\right\|_A\|(I - P_i)v_i^n\|_A. \tag{4.13}$$
Consequently, by lemma 3.5,
$$\left\|B(Av_i^n - \lambda_i^n v_i^n)\right\|_A \le \frac{(1 + \gamma)\lambda_i^n\sqrt{C_1}}{\lambda_1}\theta_i^n. \tag{4.14}$$
Using the estimate above, we now prove that $\theta_i^n$ controls $\theta_i^{n+1}$. Indeed,
$$\theta_i^{n+1} \le \left\|v_i - \frac{\tilde v_i^{n+1}}{(v_i^n, v_i)_A}\right\|_A = |(v_i^n, v_i)_A|^{-1}\left\|\tilde v_i^{n+1} - P_i v_i^n\right\|_A.$$
Applying lemma 3.5, we can estimate the first factor by
$$|(v_i^n, v_i)_A|^{-1} = \|P_i v_i^n\|_A^{-1} \le (1 - C_1(\theta_i^n)^2)^{-1/2}. \tag{4.15}$$
We clearly have
$$\tilde v_i^{n+1} - P_i v_i^n = (I - P_i)v_i^n + \tilde v_i^{n+1} - v_i^n = (I - P_i)v_i^n - B(Av_i^n - \lambda_i^n v_i^n), \tag{4.16}$$
and thus (4.15), the triangle inequality, and (4.14) give
$$\theta_i^{n+1} \le \left(1 - C_1(\theta_i^n)^2\right)^{-1/2}\left(1 + \frac{(1 + \gamma)\lambda_i^n}{\lambda_1}\right)\sqrt{C_1}\,\theta_i^n. \tag{4.17}$$
We are now ready to prove the lemma. Define
$$w_i^{n+1} = \bigl[Q_s^\perp - Q_s^\perp B(A - \lambda_i I)\bigr]Q_s^\perp v_i^n.$$
A direct computation using $\tilde v_i^{n+1} = v_i^n - B(Av_i^n - \lambda_i^n v_i^n)$ and $(A - \lambda_i I)P_i v_i^n = 0$ shows that
$$Q_s^\perp \tilde v_i^{n+1} - w_i^{n+1} = (\lambda_i^n - \lambda_i)\,Q_s^\perp B v_i^n - Q_s^\perp B(A - \lambda_i I)(Q_s - P_i)v_i^n.$$
Applying the triangle inequality to the identity
$$Q_s^\perp v_i^{n+1} = w_i^{n+1} + \bigl(Q_s^\perp\tilde v_i^{n+1} - w_i^{n+1}\bigr) + Q_s^\perp\bigl(v_i^{n+1} - \tilde v_i^{n+1}\bigr)$$
and using $|(v_i^n, v_i)_A| \le 1$ gives
$$|||Q_s^\perp v_i^{n+1}||| \le \frac{|||w_i^{n+1}|||}{|(v_i^n, v_i)_A|} + |||Q_s^\perp(v_i^{n+1} - \tilde v_i^{n+1})||| + |||Q_s^\perp\tilde v_i^{n+1} - w_i^{n+1}|||. \tag{4.18}$$
We estimate the three terms in (4.18) separately. For the first, lemma 4.1 with $w = Q_s^\perp v_i^n$ and (4.15) give
$$\frac{|||w_i^{n+1}|||}{|(v_i^n, v_i)_A|} \le \bigl(1 - C_1(\theta_i^n)^2\bigr)^{-1/2}\,\delta_i\,|||Q_s^\perp v_i^n|||. \tag{4.19}$$
By (4.1), the norm part of the second term of (4.18) can be bounded by
$$|||Q_s^\perp(v_i^{n+1} - \tilde v_i^{n+1})||| \le \theta^{n+1}\,\|\tilde v_i^{n+1} - v_i^{n+1}\|_A, \tag{4.20}$$
where both vectors are normalized so that their components in the direction of $v_i$ agree. By (4.16), the fact that $P_i v_i^n = (v_i^n, v_i)_A\,v_i$, and the triangle inequality,
$$\Bigl\|\frac{\tilde v_i^{n+1}}{(v_i^n, v_i)_A} - \frac{v_i^{n+1}}{(v_i^{n+1}, v_i)_A}\Bigr\|_A \le \frac{\|(I - P_i)v_i^{n+1}\|_A}{|(v_i^{n+1}, v_i)_A|} + \frac{\|(I - P_i)v_i^n\|_A}{|(v_i^n, v_i)_A|} + \frac{\|B(Av_i^n - \lambda_i^n v_i^n)\|_A}{|(v_i^n, v_i)_A|}.$$
Using $|(v_i^{n+1}, v_i)_A| \le 1$, (4.15), lemma 3.5 with $V = V_s^{n+1}$ and $V = V_s^n$, and (4.14) gives
$$\Bigl\|\frac{\tilde v_i^{n+1}}{(v_i^n, v_i)_A} - \frac{v_i^{n+1}}{(v_i^{n+1}, v_i)_A}\Bigr\|_A \le \sqrt{C_4}\,\theta_i^{n+1} + \Bigl(1 + \frac{(1+\gamma)\lambda_s}{\lambda_1}\Bigr)\frac{\sqrt{C_1}}{\sqrt{1 - C_1(\theta_i^n)^2}}\,\theta_i^n. \tag{4.21}$$
By lemma 3.8 and (4.1),
$$(\theta_i^n)^2 \le \bigl[1 - C_2(\theta^n)^2\bigr]^{-1}\,|||Q_s^\perp v_i^n|||^2. \tag{4.22}$$
Combining (4.20), (4.21), (4.17) and (4.22) gives
$$|||Q_s^\perp(v_i^{n+1} - \tilde v_i^{n+1})||| \le C_5\,\theta^{n+1}\,|||Q_s^\perp v_i^n|||. \tag{4.23}$$
For the last term in (4.18), we use the identity above for $Q_s^\perp\tilde v_i^{n+1} - w_i^{n+1}$, and hence by (4.1), (1.2), lemmas 3.6, 3.4 and 3.7, and (4.15),
$$|||Q_s^\perp\tilde v_i^{n+1} - w_i^{n+1}||| \le (1+\gamma)\,|(v_i^n, v_i)_A|^{-1}\Bigl\{\lambda_i^n\bigl(1 - \lambda_i/\lambda_i^n\bigr)\|A^{-1}v_i^n\|_A + \|(I - \lambda_i A^{-1})(Q_s - P_i)v_i^n\|_A\Bigr\} \le (1+\gamma)\,\frac{\lambda_s C_1 + \lambda_i\sqrt{C_2}}{\lambda_1\sqrt{1 - C_1(\theta^n)^2}}\,\theta^n\theta_i^n \le C_6\,\theta^n\,|||Q_s^\perp v_i^n|||. \tag{4.24}$$
We used (4.22) for the last inequality above. Combining (4.18), (4.19), (4.23) and (4.24) verifies (4.12). This completes the proof of the lemma. □
Proof of theorem 2.1. Assume that (2.4) holds. Let $\beta$ be defined by
$$\beta = \frac{(1-\gamma)^2(\lambda_1)^2\bigl(1 - \lambda_s/\lambda_{s+1}\bigr)^3}{1941\,(\lambda_s)^2}. \tag{4.25}$$
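Note that, since $\gamma < 1$, $\lambda_1 \le \lambda_s$ and $0 < 1 - \lambda_s/\lambda_{s+1} < 1$, the quantity defined in (4.25) is uniformly small; this is the source of the decimal constants appearing in the estimates below:

```latex
\beta \le \frac{(1-\gamma)^2 (\lambda_1)^2}{1941\,(\lambda_s)^2}
      \le \frac{1}{1941} < 5.2 \times 10^{-4},
\qquad \sqrt{\beta} < 0.0228 .
```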
We use mathematical induction to show that for $k \ge 0$,
$$(\theta^k)^2 \le \beta\,\frac{(\lambda_1)^2}{\Delta^2(\lambda_s)^2} \tag{4.26}$$
and
$$|||Q_s^\perp v_i^k||| \le \bar\delta_i^{\,k}\,|||Q_s^\perp v_i^0|||, \tag{4.27}$$
where $\bar\delta_i$ is given by (2.8). It follows from (2.4) and lemma 3.4 that (4.26) holds for $k = 0$. The inequality (4.27) holds trivially for $k = 0$. Assume that (4.26) and (4.27) hold for $k = n$; we show that they also hold for $k = n + 1$. The induction assumption (4.26) with $k = n$, (4.6), (4.7), $\gamma < 1$, (4.25) and elementary manipulations give
$$C_0 \le 9.0062\,(\lambda_s)^2/(\lambda_1)^2, \qquad C_1 \le 1 + \beta/(2 - \beta)^2 \le 1.00013. \tag{4.28}$$
Thus by (4.4) and (4.26) with $k = n$,
$$(\theta^{n+1})^2 \le 9.0062\,(\lambda_s)^2/(\lambda_1)^2\,(\theta^n)^2 \le 9.0062\,\Delta^{-2}\beta.$$
We also have, using (4.8),
$$C_1(\theta^n)^2 \le \beta\bigl(1 + \beta/(2 - \beta)^2\bigr) \le 0.00052, \qquad C_2(\theta^n)^2 \le \beta\bigl(1 + \beta/(2 - \beta)^2\bigr)/(2 - \beta)^2 \le 0.00013,$$
$$\bigl(1 - C_2(\theta^n)^2\bigr)^{-1} \le 1.00013, \qquad C_3 \le \frac{1.00013}{1 - \lambda_i/\lambda_{s+1}}, \qquad C_4 \le 1 + \frac{9.0062\,(\lambda_s)^2\beta}{(2 - 9.0062\beta)^2(\lambda_1)^2} \le 1.00117,$$
and consequently
$$\theta^{n+1}C_5 \le 18.025\,\sqrt{\beta}, \qquad \theta^n C_6 \le 3.0031\Bigl(\frac{\beta}{1 - \lambda_i/\lambda_{s+1}}\Bigr)^{1/2}\frac{\lambda_i}{\lambda_1}. \tag{4.29}$$
We used the inequality $1 - \lambda_i/\lambda_{s+1} \le \Delta$ for the next to last inequality above and (4.6) in the last two inequalities above. Applying lemma 4.4 gives that (4.12) holds with
$$\tilde\delta_i \le \bigl(1 + 1.00026\sqrt{\beta}\bigr)\delta_i + 18.025\,\sqrt{\beta} + 3.0031\Bigl(\frac{\beta}{1 - \lambda_i/\lambda_{s+1}}\Bigr)^{1/2}\frac{\lambda_i}{\lambda_1} \le \delta_i + 22.028\,\sqrt{\beta} + 3.0031\Bigl(\frac{\beta}{1 - \lambda_i/\lambda_{s+1}}\Bigr)^{1/2}\frac{\lambda_i}{\lambda_1}. \tag{4.30}$$
It then follows from (4.25) that
$$\tilde\delta_i \le \delta_i + (1-\gamma)\bigl(1 - \lambda_s/\lambda_{s+1}\bigr)\Bigl(\frac{\lambda_{s+1} - \lambda_s}{\lambda_{s+1}}\Bigr)^{1/2}\frac{\lambda_i}{\lambda_1} = \bar\delta_i,$$
and hence (4.27) holds for $k = n + 1$. The bound (4.29) is not strong enough to imply (4.26) for $k = n + 1$. We get a better bound as follows. The above inequalities imply that
$$(\theta^{n+1})^2 \le 9.0062\,\Delta^{-2}\beta \le 0.00464\,\frac{(\lambda_1)^2}{(\lambda_s)^2\Delta^2},$$
and hence
$$C_4(\theta^{n+1})^2 \le 0.00117.$$
Applying lemma 3.4, (4.27) and the inequality analogous to (4.22) gives
$$(\theta^{n+1})^2 \le \sum_{i=1}^s(\theta_i^{n+1})^2 \le \frac{1.00117}{1 - \lambda_s/\lambda_{s+1}}\sum_{i=1}^s|||Q_s^\perp v_i^{n+1}|||^2 \le \frac{1.00117}{1 - \lambda_s/\lambda_{s+1}}\sum_{i=1}^s\|Q_s^\perp v_i^0\|_A^2.$$
Now
$$Q_s^\perp v_i^0 = (I - P_i)v_i^0 - (Q_s - P_i)v_i^0, \tag{4.31}$$
and hence applying lemmas 3.5 and 3.7 gives
$$(\theta^{n+1})^2 \le \frac{1.00117}{1 - \lambda_s/\lambda_{s+1}}\sum_{i=1}^s\Bigl(\bigl(1 + \sqrt{\beta}\bigr)C_1 + \bigl(1 + 1/\sqrt{\beta}\bigr)C_2(\theta^0)^2\Bigr)(\theta_i^0)^2 \le \frac{1.02984}{1 - \lambda_s/\lambda_{s+1}}\sum_{i=1}^s(\theta_i^0)^2.$$
In the above inequality, the constants $C_1$ and $C_2$ correspond to $n = 0$. The inequality (4.26) for $k = n + 1$ then follows from (2.4). This completes the induction.
We now show that (4.26) and (4.27) imply (2.5) and (2.6). Lemmas 3.5 and 3.7, (4.22) and (4.31) give
$$\|(I - P_i)v_i^n\|_A^2 \le C_1C_3\,|||Q_s^\perp v_i^n|||^2 \le C_1C_3\,\bar\delta_i^{\,2n}\,|||Q_s^\perp v_i^0|||^2 \le C_1C_3\Bigl(\bigl(1 + \sqrt{\beta}\bigr)C_1 + \bigl(1 + 1/\sqrt{\beta}\bigr)C_2(\theta^0)^2\Bigr)\bar\delta_i^{\,2n}(\theta_i^0)^2 \le \frac{1.03}{1 - \lambda_i/\lambda_{s+1}}\,\bar\delta_i^{\,2n}\,(\theta_i^0)^2.$$
A similar argument using lemma 3.6 proves (2.6). This completes the proof of the theorem. □
Proof of corollary 2.1. It follows from (3.6), (4.27) and lemma 3.8 that for any $m > 0$,
$$(\theta^m)^2 \le C_3\,\bar\delta_s^{\,2m}\sum_{i=1}^s|||Q_s^\perp v_i^0|||^2.$$
Let $M = \epsilon^{-2}$ and choose $m$ so large that
$$\sum_{i=1}^s(\theta_i^m)^2 \le \frac{(1-\gamma)^2(\lambda_1)^4\bigl(1 - \lambda_s/\lambda_{s+1}\bigr)^4}{1999\,M\,\Delta^2(\lambda_s)^4}.$$
Following the proof of the theorem, we prove by mathematical induction that, for $k \ge m$,
$$(\theta^k)^2 \le \beta\,\frac{(\lambda_1)^2}{M\,\Delta^2(\lambda_s)^2}$$
and
$$|||Q_s^\perp v_i^k||| \le \bigl(\delta_i + \epsilon(\bar\delta_i - \delta_i)\bigr)^{k-m}\,|||Q_s^\perp v_i^m|||.$$
It then follows that
$$\|(I - P_i)v_i^n\|_A^2 \le \frac{1.03}{1 - \lambda_i/\lambda_{s+1}}\bigl(\delta_i + \epsilon(\bar\delta_i - \delta_i)\bigr)^{2n-2m}\,\bar\delta_i^{\,2m}\,(\theta_i^0)^2$$
and
$$0 \le 1 - \lambda_i/\lambda_i^n \le \frac{1.03}{1 - \lambda_i/\lambda_{s+1}}\bigl(\delta_i + \epsilon(\bar\delta_i - \delta_i)\bigr)^{2n-2m}\,\bar\delta_i^{\,2m}\,(\theta_i^0)^2.$$
This completes the proof of the corollary. □
5. Numerical experiments
The results of numerical experiments involving the preconditioned subspace iteration technique are given in this section. We first give results for the case when the eigenvalues are well separated. We also consider an example where a group of eigenvalues are distinct but much closer together. For comparison, we include the results of numerical experiments involving a block preconditioned method analyzed in [21-23]. All of our experiments include eigenspaces of dimension greater than one. The model problem which we shall consider is the one-dimensional eigenvalue problem
$$-\frac{\partial^2 u}{\partial x^2} + u = \lambda u \quad \text{on } (0,1), \qquad u(0) = u(1), \qquad \frac{\partial u}{\partial x}(0) = \frac{\partial u}{\partial x}(1). \tag{5.1}$$
The smallest eigenvalue for (5.1) is 1 and its eigenspace is equal to the space of constants. The larger eigenvalues are given by $\lambda_j = 1 + 4\pi^2 j^2$ for $j = 1, 2, \ldots$. The corresponding eigenspace has dimension 2 and is spanned by the functions
$$\psi_j^+ = \cos(2\pi jx), \qquad \psi_j^- = \sin(2\pi jx).$$
We approximate (5.1) by using a spectral approximation. Let $n$ be given and define $h = (2n)^{-1}$ and $x_i = ih$. $V$ is the space of $2n$-dimensional vectors with inner product
$$(v, w) = h\sum_{i=1}^{2n} v_i w_i.$$
The $i$th component of a vector $v \in V$ corresponds to the nodal value at $x_i$. Consider the functions
$$\{\tilde\psi_i\} = \bigl\{1,\ \psi_1^+,\ \psi_1^-,\ \ldots,\ \psi_{n-1}^+,\ \psi_{n-1}^-,\ \psi_n^+\bigr\}, \tag{5.2}$$
and let $\tilde V$ be the corresponding linear span. Given a vector $v \in V$ there exists a unique function $\tilde v \in \tilde V$ such that $\tilde v(x_i) = v_i$ for $i = 1, \ldots, 2n$. In fact, the transformation from $v \in V$ to the coefficients of $\tilde v$ in the basis (5.2) can be rapidly computed by use of the fast Fourier transform. The operator $A : V \to V$ is defined by
$$(Av, w) = \int_0^1 \bigl(\tilde v'\tilde w' + \tilde v\tilde w\bigr)\,dx.$$
The eigenvalues of $A$ coincide with those of (5.1). In addition, the eigenvectors of $A$ are of the form $(1, \ldots, 1)$, $(v_j^+)_i = \psi_j^+(x_i)$ and $(v_j^-)_i = \psi_j^-(x_i)$.
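Because the basis (5.2) diagonalizes $A$, the action of $A$ on nodal values can be sketched with the FFT: transform, multiply mode $j$ by its eigenvalue $1 + (2\pi j)^2$, and transform back. The following is a minimal illustration of this structure (the grid size is arbitrary), not the code used for the reported experiments:

```python
import numpy as np

N = 32                               # number of grid points (2n in the text)
x = np.arange(N) / N                 # x_i = i h with h = 1/N

def apply_A(v):
    """Apply the spectral operator A: interpolate by trigonometric
    polynomials and apply -d^2/dx^2 + 1, i.e. multiply Fourier mode j
    by its eigenvalue 1 + (2*pi*j)**2."""
    j = np.fft.fftfreq(N, d=1.0/N)   # integer frequencies 0, 1, ..., -1
    return np.fft.ifft((1.0 + (2*np.pi*j)**2) * np.fft.fft(v)).real

v = np.cos(2*np.pi*x)                # eigenvector for lambda_1 = 1 + 4 pi^2
print(np.allclose(apply_A(v), (1.0 + 4*np.pi**2) * v))   # True
```

The constant vector is reproduced with eigenvalue 1, in agreement with the eigenspace of constants noted above.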
Figure 1. Eigenvalue error: the case of well separated eigenvalues.
It is well known that different numerical approximations can often be used as preconditioners for each other. Accordingly, we consider the finite difference approximation
$$(\tilde A v)_i = v_i + h^{-2}\bigl(2v_i - v_{i-1} - v_{i+1}\bigr). \tag{5.3}$$
We interpret $i - 1$ and $i + 1$ modulo $2n$ above. The use of $\tilde A^{-1}$ as a preconditioner for $A$ would be a bit too trivial since $A$ and $\tilde A$ share the same eigenvectors. Instead, we define $B$ by applying an additive overlapping domain decomposition preconditioner for (5.3) [12, 20]. This is a more realistic choice since the resulting preconditioner $B$ does not commute with $A$. For all of the numerical results reported, we use eight overlapping subdomains (of size 1/4) and a coarse grid of size $H = 1/8$. For this problem, the additive preconditioner was scaled so that (1.1) holds with $\gamma = 2/3$ (independently of $n$). Such a value of $\gamma$ is not unusual in many applications. For example, a well designed multigrid V-cycle gives rise to $\gamma \approx 1/10$. In all of our runs, we took $n = 256$. The results for other $n$ were qualitatively the same. Figure 1 illustrates the eigenvalue convergence for the eigenvalues
$$\lambda_0 = 1, \quad \lambda_1 = 40.48, \quad \lambda_2 = 158.9, \quad \lambda_3 = 356.3, \quad \lambda_4 = 632.7, \quad \lambda_5 = 988.0.$$
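A preconditioner of this type can be sketched as follows. This is an illustrative additive Schwarz construction for the periodic finite-difference matrix (5.3) on a deliberately small grid; the subdomain layout and the piecewise-linear coarse space are assumptions patterned on the description above, and no scaling is applied, so it is not the authors' code:

```python
import numpy as np

N = 64                                   # fine grid points (2n in the text; kept small here)
h = 1.0 / N
I = np.eye(N)
# periodic finite difference operator (5.3)
A = I + (2.0*I - np.roll(I, 1, axis=1) - np.roll(I, -1, axis=1)) / h**2

# coarse space: periodic piecewise-linear interpolation from nc coarse nodes
nc = 8                                   # coarse grid of size H = 1/8
P0 = np.zeros((N, nc))
for j in range(nc):
    d = np.minimum((np.arange(N) - j*N//nc) % N, (j*N//nc - np.arange(N)) % N)
    P0[:, j] = np.maximum(0.0, 1.0 - d/(N//nc))

B = P0 @ np.linalg.inv(P0.T @ A @ P0) @ P0.T       # coarse-grid solve
for k in range(8):                                 # eight overlapping subdomains of size 1/4
    idx = (k*N//8 + np.arange(N//4)) % N
    Rk = np.zeros((N//4, N))
    Rk[np.arange(N//4), idx] = 1.0
    Ak = Rk @ A @ Rk.T                             # local restriction of A
    B += Rk.T @ np.linalg.inv(Ak) @ Rk             # local solves, summed (additive)

ev = np.sort(np.linalg.eigvals(B @ A).real)        # BA has real positive spectrum
print(ev[-1]/ev[0], np.linalg.cond(A))
```

The spectrum of $BA$ is dramatically better conditioned than that of $A$; after multiplying $B$ by a suitable scaling constant, a bound of the form (1.1) follows from the spectral interval of $BA$.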
Figure 2. Eigenvector error: the case of well separated eigenvalues.
We applied the above algorithm with an 11-dimensional subspace. The initial subspace was chosen to be the space spanned by 11 vectors with entries consisting of random numbers uniformly distributed in the interval [0, 1]. No attempt was made to generate a sufficiently accurate starting subspace as required by the theory. Nevertheless, the method converged extremely well. Note the faster rate of convergence for the smaller eigenvalues. Figure 1 reports the actual error. Thus after eight iterations the error in the fifth eigenvalue was less than 1.5 percent. We did not report the eigenvalue error for the initial subspace. The reason for this is that these errors are very large and depend on the largest eigenvalue of the system. For the reported example ($n = 256$), the eigenvalue error in the initial subspace was on the order of $10^6$. This is to be expected since the initial subspace is chosen at random and is dominated by high frequency components. It is somewhat surprising that the method reduces these errors so rapidly. The eigenvector convergence behavior is illustrated in figure 2. Here we report the eigenvector error as measured by
$$e_i = \|(I - Q_i^n)v_i\|,$$
where $Q_i^n$ is the $(\cdot,\cdot)$ orthogonal projector onto the space $V_i^n$ spanned by the approximate eigenvectors. The first eigenvalue is simple and hence $V_0^n$ is one-dimensional. The remaining eigenvalues are of multiplicity 2 and $V_i^n$ involves the span of subsequent pairs of approximate eigenvectors. For the multiplicity 2 case, we report the larger of the two values of $e_i$. As predicted by the theory, more rapid convergence is observed for the eigenvectors associated with the smaller eigenvalues. The estimates developed earlier in this manuscript deteriorate in the case of eigenvalues with little separation. The next example suggests that the preconditioned subspace method still works well even when the eigenvalues cluster. For this example, we no longer use (5.1) but still preserve the same eigenspaces. We simply move the second and fourth eigenvalues so that they are significantly closer to the
Figure 3. Eigenvalue error: the case of clustered eigenvalues.
third. Specifically, we apply the preconditioned subspace method to the problem with eigenvalues
$$\lambda_0 = 1, \quad \lambda_1 = 157.7, \quad \lambda_2 = 158.9, \quad \lambda_3 = 160.9, \quad \lambda_4 = 632.7, \quad \lambda_5 = 988.0.$$
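Operators with modified eigenvalues but unchanged eigenspaces are easy to manufacture from a spectral decomposition. The check below uses a generic symmetric matrix and illustrative values (it is not the construction used for the reported experiments):

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.standard_normal((8, 8))
A = S + S.T                         # any symmetric operator stands in for A
w, V = np.linalg.eigh(A)            # A = V diag(w) V^T, w ascending

w2 = w.copy()
w2[1] = 0.5*(w[0] + w[2])           # move one eigenvalue to form a cluster ...
A2 = (V * w2) @ V.T                 # ... while keeping every eigenvector

w3, V3 = np.linalg.eigh(A2)
print(np.allclose(w3, w2))          # modified spectrum, same eigenvectors
```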
For this problem, there is a six-dimensional invariant subspace corresponding to eigenvalues which are separated by less than two percent. We used the same preconditioner as in the first example but in this case (since $A$ has changed) we have $\gamma \le 0.7$. Figure 3 illustrates that we still get rapid eigenvalue convergence. In fact, there is not much difference in the performance when compared to the well separated case in figure 1. Figure 4 gives the eigenvector convergence measured the same way as reported in the first example (figure 2). Note that the eigenspaces corresponding to the three clustered eigenvalues still converge quite rapidly. As our final example, we consider the DK block iterative method studied in [21-23]. We choose this method because it was the only other method available with a fully developed convergence theory. This method uses the shift corresponding to the largest eigenvalue in the subspace for all vectors in the space. Thus,
$$V_s^{n+1} = \mathrm{span}\bigl\{\tilde v_1^{n+1}, \ldots, \tilde v_s^{n+1}\bigr\}, \tag{5.4}$$
Figure 4. Eigenvector error: the case of clustered eigenvalues.
Figure 5. Eigenvalue error: DK method, clustered eigenvalues.
where
$$\tilde v_i^{n+1} = v_i^n - B\bigl(Av_i^n - \lambda_s^n v_i^n\bigr) \qquad \text{for } i = 1, \ldots, s. \tag{5.5}$$
The above scheme is somewhat easier to implement than that studied in this paper. This is because the Ritz eigenvectors $\{v_i^n\}$ in (5.5) can be replaced by any spanning set for the subspace $V_s^n$, e.g., $\{\tilde v_i^n\}$. It is shown in [21-23] that convergence is achieved for the eigenspace corresponding to the largest eigenvalue in the subspace $V_s^n$. In contrast to the theory of the present paper, the convergence estimates for the DK method do not depend on the clustering of eigenvalues.
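For orientation only, both updates can be sketched on a toy diagonal operator: `shift="ritz"` uses a per-vector Ritz shift in the spirit of the method analyzed in this paper, while `shift="dk"` uses the single shift $\lambda_s^n$ of (5.5). The operator, preconditioner and iteration count below are illustrative assumptions, and since everything in this toy example commutes it does not reproduce the DK stalling seen in figure 6:

```python
import numpy as np

def subspace_iteration(A, B, V, steps, shift="ritz"):
    """Preconditioned subspace iteration with Rayleigh-Ritz:
    v_i <- v_i - B(A v_i - mu_i v_i), where mu_i is the i-th Ritz value
    (shift='ritz') or the largest Ritz value for every i (shift='dk', eq. (5.5))."""
    for _ in range(steps):
        Q, _ = np.linalg.qr(V)                  # orthonormal basis of span(V)
        mu, U = np.linalg.eigh(Q.T @ A @ Q)     # Ritz values (ascending) and vectors
        V = Q @ U                               # Ritz vectors
        shifts = mu if shift == "ritz" else np.full_like(mu, mu[-1])
        V = V - B @ (A @ V - V * shifts)        # column i is shifted by shifts[i]
    Q, _ = np.linalg.qr(V)
    return np.linalg.eigvalsh(Q.T @ A @ Q)

lam = np.arange(1.0, 51.0)                      # toy spectrum 1, 2, ..., 50
A, B = np.diag(lam), np.diag(0.9/lam)           # crude preconditioner: I - BA = 0.1 I
V0 = np.random.default_rng(0).standard_normal((50, 6))
ritz = subspace_iteration(A, B, V0, 40, "ritz")
print(ritz[:4])                                 # approaches the four smallest: 1, 2, 3, 4
```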
Figure 6. Eigenvector error: DK method, clustered eigenvalues.

To compare the method of the paper with the DK method, we ran the DK method for the clustered eigenvalue example. The eigenvalue convergence is illustrated in figure 5. The figure shows that the DK method does exhibit convergence for the largest eigenvalue $\lambda_5$. However, it should be noted that the accuracy achieved by the DK method for this eigenvalue using 18 iterations was not as good as that achieved by the method of this paper using 8 iterations. Note that the approximations for the lower eigenvalues are not improving. This is what one would expect, as the theory does not suggest any convergence for these eigenvalues. The initial steps of the DK method show good convergence on the smaller eigenvalues and eigenvectors. The eigenvector convergence is illustrated in figure 6. Although a uniform convergence rate is achieved for the eigenspace corresponding to the largest eigenvalue, the method ceases to improve the approximations to the remaining eigenspaces. The method of this paper gives rise to a more rapidly convergent approximation to the eigenspace corresponding to the largest eigenvalue $\lambda_5$ while also converging to the smaller eigenspaces at much better rates.
References
[1] O. Axelsson, A generalized conjugate gradient, least squares method, Numer. Math. 51 (1987) 209-228.
[2] G. Birkhoff and A. Schoenstadt (eds.), Elliptic Problem Solvers II (Academic Press, New York, 1984).
[3] P. E. Bjørstad and O. B. Widlund, Solving elliptic problems on regions partitioned into substructures, in: Elliptic Problem Solvers II, eds. G. Birkhoff and A. Schoenstadt (Academic Press, New York, 1984) pp. 245-256.
[4] J. H. Bramble, Multigrid Methods, Pitman Research Notes in Mathematics Series (Longman Sci. Tech., London, 1993). Copublished with Wiley, New York.
[5] J. H. Bramble and J. E. Pasciak, New convergence estimates for multigrid algorithms, Math. Comp. 49 (1987) 311-329.
[6] J. H. Bramble and J. E. Pasciak, New estimates for multigrid algorithms including the V-cycle, Math. Comp. 60 (1993) 447-471.
[7] J. H. Bramble, J. E. Pasciak and A. H. Schatz, An iterative method for elliptic problems on regions partitioned into substructures, Math. Comp. 46 (1986) 361-369.
[8] J. H. Bramble, J. E. Pasciak and A. H. Schatz, The construction of preconditioners for elliptic problems by substructuring I, Math. Comp. 47 (1986) 103-134.
[9] J. H. Bramble, J. E. Pasciak and A. H. Schatz, The construction of preconditioners for elliptic problems by substructuring II, Math. Comp. 49 (1987) 1-16.
[10] J. H. Bramble, J. E. Pasciak and A. H. Schatz, The construction of preconditioners for elliptic problems by substructuring III, Math. Comp. 51 (1988) 415-430.
[11] J. H. Bramble, J. E. Pasciak and A. H. Schatz, The construction of preconditioners for elliptic problems by substructuring IV, Math. Comp. 53 (1989) 1-24.
[12] J. H. Bramble, J. E. Pasciak, J. Wang and J. Xu, Convergence estimates for product iterative methods with applications to domain decomposition, Math. Comp. 57 (1991) 1-21.
[13] J. H. Bramble, J. E. Pasciak and J. Xu, Parallel multilevel preconditioners, Math. Comp. 55 (1990) 1-22.
[14] Z. Cao, Generalized Rayleigh quotient matrix and block algorithm for solving large sparse symmetric generalized eigenvalue problems, Numerical Math. J. Chinese Univ. 5 (1983) 342-348.
[15] T. F. Chan, R. Glowinski, J. Periaux and O. B. Widlund (eds.), Domain Decomposition Methods (SIAM, Philadelphia, PA, 1989).
[16] T. F. Chan, R. Glowinski, J. Periaux and O. B. Widlund (eds.), 3rd Int. Symp. on Domain Decomposition Methods for Partial Differential Equations (SIAM, Philadelphia, PA, 1990).
[17] N. Chetty, M. Weinert, T. S. Rahman and J. W. Davenport, Vacancies and impurities in aluminum and magnesium, Phys. Rev. B (1995) 6313-6326.
[18] E. R. Davidson, The iterative calculation of a few of the lowest eigenvalues and corresponding eigenvectors of large real-symmetric matrices, J. Comput. Phys. 17 (1975) 87-94.
[19] E. R. Davidson, Matrix eigenvector methods, in: Methods in Computational Molecular Physics (Reidel, Boston, 1983) pp. 95-113.
[20] M. Dryja and O. Widlund, An additive variant of the Schwarz alternating method for the case of many subregions, Technical Report 339, Courant Institute of Mathematical Sciences (1987).
[21] E. G. D'yakonov, Optimization in Solving Elliptic Problems (CRC, Boca Raton, 1995).
[22] E. G. D'yakonov and A. V. Knyazev, Group iterative method for finding lower-order eigenvalues, Moscow Univ. Comput. Math. Cybern. 2 (1982) 32-40.
[23] E. G. D'yakonov and A. V. Knyazev, On an iterative method for finding lower eigenvalues, Russian J. Numer. Anal. Math. Modelling 7 (1992) 473-486.
[24] E. G. D'yakonov and M. Yu. Orekhov, Minimization of the computational labor in determining the first eigenvalues of differential operators, Math. Notes 27 (1980) 382-391.
[25] E. G. D'yakonov, Iteration methods in eigenvalue problems, Math. Notes 34 (1983) 945-953.
[26] R. Glowinski, G. H. Golub, G. A. Meurant and J. Periaux (eds.), 1st Int. Symp. on Domain Decomposition Methods for Partial Differential Equations (SIAM, Philadelphia, PA, 1988) pp. 144-172.
[27] R. Glowinski, Y. A. Kuznetsov, G. Meurant, J. Periaux and O. B. Widlund (eds.), 4th Int. Symp. on Domain Decomposition Methods for Partial Differential Equations (SIAM, Philadelphia, PA, 1991) pp. 263-289.
[28] S. K. Godunov, V. V. Ogneva and G. P. Prokopov, On the convergence of the modified steepest descent method in application to eigenvalue problems, Trans. Amer. Math. Soc. 2 (1976) 105.
[29] V. P. Il'in, Numerical Methods of Solving Electrophysical Problems (Nauka, Moscow, 1985) (in Russian).
[30] T. Kato, Perturbation Theory for Linear Operators (Springer, New York, 1976).
[31] A. V. Knyazev, Convergence rate estimates for iterative methods for a mesh symmetric eigenvalue problem, Russian J. Numer. Anal. Math. Modelling 2 (1987) 371-396.
[32] A. V. Knyazev, Computation of Eigenvalues and Eigenvectors for Mesh Problems: The Algorithms and Error Estimates (Dept. Numer. Math., USSR Acad. Sci., Moscow, 1986) (in Russian).
[33] A. V. Knyazev, A preconditioned conjugate gradient method for eigenvalue problems and its implementation in a subspace, in: Eigenwertaufgaben in Natur- und Ingenieurwissenschaften und ihre numerische Behandlung, Oberwolfach (1990), Int. Ser. Numer. Math., Vol. 96 (Birkhäuser, Basel, 1991) pp. 143-154.
[34] A. V. Knyazev, New estimates for Ritz vectors, CIMS NYU 677 (New York, 1994). Also Math. Comp., to appear.
[35] A. V. Knyazev and A. L. Skorokhodov, The preconditioned gradient-type iterative methods in a subspace for partial generalized symmetric eigenvalue problems, SIAM J. Numer. Anal. 31 (1994) 1226.
[36] N. Kosugi, Modification of the Liu-Davidson method for obtaining one or simultaneously several eigensolutions of a large real-symmetric matrix, J. Comput. Phys. 55 (1984) 426-436.
[37] D. E. Longsine and S. F. McCormick, Simultaneous Rayleigh-quotient minimization methods for Ax = λBx, Linear Algebra Appl. 34 (1980) 195-234.
[38] S. F. McCormick and T. Noe, Simultaneous iteration for the matrix eigenvalue problem, Linear Algebra Appl. 16 (1977) 43-56.
[39] R. B. Morgan, Davidson's method and preconditioning for generalized eigenvalue problems, J. Comput. Phys. 89 (1990) 241-245.
[40] R. B. Morgan and D. S. Scott, Generalizations of Davidson's method for computing eigenvalues of sparse symmetric matrices, SIAM J. Sci. Statist. Comput. 7 (1986) 817-825.
[41] R. B. Morgan and D. S. Scott, Preconditioning the Lanczos algorithm for sparse symmetric eigenvalue problems, SIAM J. Sci. Comput. 14 (1993) 585-593.
[42] C. W. Murray, S. C. Racine and E. R. Davidson, Improved algorithms for the lowest few eigenvalues and associated eigenvectors of large matrices, J. Comput. Phys. 103 (1992) 382-389.
[43] B. N. Parlett, The Symmetric Eigenvalue Problem (Prentice-Hall, Englewood Cliffs, NJ, 1980).
[44] W. V. Petryshyn, On the eigenvalue problem Tu − λSu = 0 with unbounded and non-symmetric operators T and S, Philos. Trans. R. Soc. Math. Phys. Sci. 262 (1968) 413-458.
[45] A. Ruhe, SOR-methods for the eigenvalue problem with large sparse matrices, Math. Comp. 28 (1974) 695-710.
[46] A. Ruhe, Iterative eigenvalue algorithms based on convergent splittings, J. Comput. Phys. 19 (1975) 110-120.
[47] Y. Saad, Numerical Methods for Large Eigenvalue Problems (Halsted Press, New York, 1992).
[48] M. Sadkane, Block-Arnoldi and Davidson methods for unsymmetric large eigenvalue problems, Numer. Math. 64 (1993) 195-211.
[49] B. A. Samokish, The steepest descent method for an eigenvalue problem with semi-bounded operators, Izv. Vyssh. Uchebn. Zaved. Mat. 5 (1958) 105-114 (in Russian).
[50] D. S. Scott, Solving sparse symmetric generalized eigenvalue problems without factorization, SIAM J. Numer. Anal. 18 (1981) 102-110.
[51] A. Stathopoulos, Y. Saad and C. F. Fischer, Robust preconditioning of large, sparse, symmetric eigenvalue problems, J. Comput. Appl. Math. (to appear). Also report TR-93-093, AHPCRC, University of Minnesota (1993).
[52] P. Vassilevski, Hybrid V-cycle algebraic multilevel preconditioners, Preprint, Bulgarian Academy of Sciences, Sofia, Bulgaria (1987).