Circuits Syst Signal Process
DOI 10.1007/s00034-015-0221-3

SHORT PAPER

Off-Grid DOA Estimation Via Real-Valued Sparse Bayesian Method in Compressed Sensing

Weijian Si1 · Xinggen Qu1 · Zhiyu Qu1 · Pinjiao Zhao1
Received: 3 December 2014 / Revised: 8 December 2015 / Accepted: 9 December 2015 © Springer Science+Business Media New York 2016
Abstract A novel real-valued sparse Bayesian method for off-grid direction-of-arrival (DOA) estimation is proposed in compressed sensing (CS). The off-grid model is reformulated by the second-order Taylor expansion to reduce the modeling error caused by mismatch. To apply the Bayesian perspective in CS conveniently, the complex-valued data are converted into a real-valued problem by a unitary transformation. By assuming that the sources among snapshots are independent and share the same sparse prior, joint sparsity is exploited for DOA estimation. Specifically, a full posterior density function can be provided in the Bayesian framework. The convergence rate and convergence stability of the proposed method are guaranteed in the iterative procedure. Simulation results show the superior performance of the proposed method compared with existing methods.

Keywords Compressed sensing · Direction-of-arrival (DOA) estimation · Off-grid model · Real-valued sparse Bayesian
Zhiyu Qu (corresponding author)
[email protected]

Weijian Si
[email protected]

Xinggen Qu
[email protected]

Pinjiao Zhao
[email protected]

1 Department of Information and Communication Engineering, Harbin Engineering University, Harbin, People's Republic of China
1 Introduction

Direction-of-arrival (DOA) estimation is a topic of great interest in signal processing. It has been widely applied in many fields ranging from radar and communication systems to medical imaging [1,6]. Many conventional DOA estimation methods have been proposed in recent decades; they can be divided into two main categories: parametric methods and nonparametric (spectral-based) methods. Parametric methods, including maximum likelihood (ML) [19] and stochastic ML (SML) [10], require high computational complexity for precise estimation and good initialization for global convergence. Nonparametric methods such as multiple signal classification (MUSIC) [18] and estimation of signal parameters via rotational invariance techniques (ESPRIT) [17] exploit the locations of the highest peaks of the spatial spectrum for DOA estimation. However, these methods suffer serious performance degradation when the signal-to-noise ratio (SNR) is low, the number of snapshots is small or the sources are coherent, conditions that arise easily in practical situations. To avoid such drawbacks, DOA estimation involving compressed sensing (CS) [2-4] has attracted tremendous interest, owing to the development of CS-based methods [15,16]. In [13], Malioutov et al. used the singular value decomposition (SVD) of the data matrix and proposed the l1-SVD method, which first formulated DOA estimation as a sparse signal reconstruction (SSR) problem. By combining subspace-based methods with CS, compressive MUSIC (CS-MUSIC) [9] and subspace-augmented MUSIC (SA-MUSIC) [11] were proposed to obtain better estimation performance than MUSIC. Liu and Zhou [12] formulated a unified framework and proposed sparse Bayesian array calibration (SBAC) to realize array calibration and DOA estimation by exploiting the sparsity of the sources.
Despite the significant improvements brought by CS-based methods, their performance is limited in practical situations where the true DOAs do not lie on the sampling grid. This implies an unavoidable increase in modeling error caused by mismatch, even if recursive grid refinement is used to alleviate the limitation. Furthermore, an overly dense sampling grid results in a highly coherent dictionary matrix, which violates the so-called restricted isometry property (RIP) [5,21] requiring high incoherence among the columns of the matrix. Zhu et al. [23] proposed the sparse total least squares (STLS) method to address off-grid DOA estimation, in which the true DOAs no longer belong to the sampling grid set and the perturbation matrix is assumed to be Gaussian. In [22], Yang et al. proposed off-grid sparse Bayesian inference based on the singular value decomposition (OGSBI-SVD), which is suitable for both the single measurement vector (SMV) and multiple measurement vector (MMV) cases. Zhang et al. [24] proposed off-grid block sparse Bayesian learning (OGBSBL) for off-grid DOA estimation, which can work without knowledge of the number of sources. In this paper, a novel DOA estimation method is proposed based on the off-grid model from a real-valued Bayesian perspective in CS. Using the second-order Taylor expansion, the off-grid model is reformulated to reduce the modeling error. The grid error, defined as the distance from the true direction to the nearest grid point, is assumed to be uniformly distributed. The complex-valued problem is transformed into a real-valued one by a unitary transformation, which facilitates applying the Bayesian perspective in CS. Joint sparsity is exploited for DOA estimation by assuming
the same sparse prior for the sources among snapshots. Specifically, a full posterior density function is provided in the Bayesian framework. We refer to the proposed method as real-valued modified Aitken expectation conditional maximization (R-MA-ECM), an iterative method. The iteration of R-MA-ECM switches from ECM to MA for the convergence rate and from MA back to ECM for the convergence stability, until the stopping criterion is satisfied. Simulation results show that the proposed method has superior performance in comparison with existing methods. Note that OGSBI-SVD in [22] also uses sparse Bayesian inference for off-grid DOA estimation, but the proposed method differs from OGSBI-SVD: it utilizes MA and ECM to perform off-grid DOA estimation in the real field, while OGSBI-SVD uses expectation-maximization (EM) in the complex field. Moreover, the off-grid model of this paper is more accurate than that of [22], and the hyperparameter is assumed to follow a Chi-square distribution rather than a gamma distribution as in [22]. The remainder of the paper is organized as follows. In Sect. 2, we formulate the DOA estimation problem, in which the off-grid model is reformulated by the second-order Taylor expansion and a unitary transformation is exploited to yield a real-valued problem. The real-valued Bayesian method is described in detail in Sect. 3. The superior performance of the proposed method is illustrated in Sect. 4, and conclusions are provided in Sect. 5.
2 Problem Formulation

Consider a uniform linear array (ULA) of M omnidirectional sensors with interelement spacing d, impinged on by K narrow-band far-field sources s_k(t), k = 1, 2, ..., K. The received superposition of the sources at the t-th snapshot is

\[ x(t) = \sum_{k=1}^{K} a(\theta_k) s_k(t) + n(t), \quad t = 1, 2, \ldots, T \tag{1} \]

where T is the number of snapshots, θ_k is the unknown direction of the kth source, a(θ_k) is the steering vector, a column vector with M complex entries containing the delay information of the kth source, and n(t) ∈ C^{M×1} denotes the noise term. For simplicity, the origin of the sensor array is set at the middle of the ULA, as shown in Fig. 1. The goal is to estimate the unknown directions θ = [θ_1, θ_2, ..., θ_K]^T from the given x(t), where (·)^T denotes the transpose.

Let θ̃ = [θ̃_1, θ̃_2, ..., θ̃_L]^T be a fine sampling grid covering the entire DOA space, where L (L ≫ K) denotes the number of grid points. To reduce the modeling error, we reformulate the off-grid model, which is closely related to the on-grid model in (1). Let θ̃ be uniformly spaced, so that the grid interval τ = θ̃_{l+1} − θ̃_l ∝ L^{-1}. Suppose θ_k ∉ {θ̃_1, θ̃_2, ..., θ̃_L} for some k ∈ {1, 2, ..., K}, and denote by θ̃_{l_k} ∈ {θ̃_1, θ̃_2, ..., θ̃_L} the grid point nearest to the true direction θ_k. By exploiting the second-order Taylor expansion, the steering vector a(θ_k) can be approximated as the following form
Fig. 1 Geometry of the ULA. (a) M is odd. (b) M is even
\[ a(\theta_k) \approx a(\tilde\theta_{l_k}) + b(\tilde\theta_{l_k})(\theta_k - \tilde\theta_{l_k}) + c(\tilde\theta_{l_k})(\theta_k - \tilde\theta_{l_k})^2 \tag{2} \]

The mth entries of b(θ̃_{l_k}) and c(θ̃_{l_k}) are, respectively, \left[-j\frac{2\pi d}{\lambda}\left(m - \frac{M+1}{2}\right)\cos\tilde\theta_{l_k}\right]\cdot a_m(\tilde\theta_{l_k}) and \frac{1}{2}\left[-j\frac{2\pi d}{\lambda}\left(m - \frac{M+1}{2}\right)\cos\tilde\theta_{l_k}\right]^2\cdot a_m(\tilde\theta_{l_k}), where λ is the wavelength of the sources and j = √−1. Assume that θ̃_{l_1}, θ̃_{l_2}, ..., θ̃_{l_K} are the grid points nearest to the true directions θ_1, θ_2, ..., θ_K, respectively, so that we have

\[ \begin{cases} g_l(t) = s_k(t),\ \varepsilon_l = \theta_k - \tilde\theta_{l_k}, & \text{if } l = l_k\ (k = 1, 2, \ldots, K) \\ g_l(t) = 0,\ \varepsilon_l = 0, & \text{elsewhere} \end{cases} \tag{3} \]

where ε_l ∈ [−τ/2, τ/2] is the grid error. By absorbing the approximation error into the noise, the received superposition of the sources can be written in the sparse form

\[ x(t) = \left[A + B\,\mathrm{diag}(\varepsilon) + C\,\mathrm{diag}^2(\varepsilon)\right] g(t) + e(t), \quad t = 1, 2, \ldots, T \tag{4} \]

where ε = [ε_1, ε_2, ..., ε_L]^T, B = [b(θ̃_1), b(θ̃_2), ..., b(θ̃_L)], C = [c(θ̃_1), c(θ̃_2), ..., c(θ̃_L)] and A = [a(θ̃_1), a(θ̃_2), ..., a(θ̃_L)] is the array manifold matrix over all potential directions, which serves as an overcomplete dictionary in CS. In addition, diag(ε) is a diagonal matrix with ε on its diagonal. Obviously, the on-grid model is a special case obtained by simply setting ε = 0 in
(4). Here g(t) = [g_1(t), g_2(t), ..., g_L(t)]^T is a K-sparse vector, since it has only K nonzero elements among its L entries, and K is referred to as the sparsity. Hence, the matrix G = [g(1), g(2), ..., g(T)] has K nonzero rows, i.e., it is row K-sparse, since {g(t)}_{t=1}^{T} share the same support by joint sparsity. To apply the Bayesian perspective in CS conveniently, a unitary transformation is exploited to transform (4) into a real-valued problem. As a matter of fact, although the dimensions of the vectors and matrices are doubled, the computational complexity can be reduced, since a complex multiplication costs four times as much as a real multiplication [7]. Following [7], a unitary transformation matrix V_M is defined as
\[
V_M = \begin{cases}
\dfrac{1}{\sqrt{2}} \begin{bmatrix} I_{M/2} & J_{M/2} \\ jJ_{M/2} & -jI_{M/2} \end{bmatrix}, & \text{if } M \text{ is even} \\[3ex]
\dfrac{1}{\sqrt{2}} \begin{bmatrix} I_{(M-1)/2} & 0_{\frac{M-1}{2}\times 1} & J_{(M-1)/2} \\ 0_{1\times\frac{M-1}{2}} & \sqrt{2} & 0_{1\times\frac{M-1}{2}} \\ jJ_{(M-1)/2} & 0_{\frac{M-1}{2}\times 1} & -jI_{(M-1)/2} \end{bmatrix}, & \text{if } M \text{ is odd}
\end{cases} \tag{5}
\]

where I_{M/2} denotes the M/2 × M/2 identity matrix and J_{M/2} denotes the M/2 × M/2 exchange matrix with ones on its anti-diagonal and zeros elsewhere. Since V_M V_M^H = V_M^H V_M = I_M, V_M is a unitary matrix. Then, by using the unitary transformation V_M, (4) can be
written in matrix form as

\[ V_M X = V_M \beta(\varepsilon) G + V_M E = \bar\beta(\varepsilon) G + V_M E \tag{6} \]

where β(ε) = A + B diag(ε) + C diag²(ε) ∈ C^{M×L} and β̄(ε) = V_M β(ε) ∈ R^{M×L} is a real matrix. Due to (6), V_M X can be partitioned into real and imaginary parts, which can be further rewritten as

\[ \overline{V_M X} = \left[\mathrm{Re}\{V_M X\}\ \ \mathrm{Im}\{V_M X\}\right] = \bar\beta(\varepsilon)\bar G + \overline{V_M E} \tag{7} \]

where Re{·} and Im{·} denote the real and imaginary parts, respectively, the overline (·)‾ denotes the real matrix [Re{·} Im{·}], and Ḡ = [Re{G} Im{G}] = [ĝ(1), ĝ(2), ..., ĝ(2T)]. A real-valued problem is obtained since all matrices in (7) are real. To find the true directions, the support of the sparse sources and ε must be jointly estimated from the matrix Y, given by

\[ Y = [y(1), y(2), \ldots, y(2T)] = \Phi\bar\beta(\varepsilon)\bar G + \Phi\overline{V_M E} = \Psi\bar G + \hat E \tag{8} \]

with the sensing matrix Ψ = Φβ̄(ε), the noise matrix Ê = [ê(1), ê(2), ..., ê(2T)] and the N × M measurement matrix Φ with N < M, where N is the number of nonadaptive linear projection measurements.
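As a quick sanity check on the model of this section, the following minimal Python sketch builds the centered-ULA steering vector, verifies that the second-order Taylor model (2) is tighter than the first-order one for a small grid error, and confirms that the even-M branch of V_M in (5) is unitary and maps the steering vector to a real vector. The array size, angles and the steering-vector sign convention are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Sketch under the assumptions: ULA centered at the origin, d = lambda/2, and
# a_m(theta) = exp(-j*2*pi*(d/lambda)*(m-(M+1)/2)*sin(theta)).
M = 8
idx = np.arange(1, M + 1) - (M + 1) / 2          # centered indices m - (M+1)/2

def a(theta):                                     # steering vector a(theta)
    return np.exp(-2j * np.pi * 0.5 * idx * np.sin(theta))

def b(theta):                                     # first-order coefficient in Eq. (2)
    return (-2j * np.pi * 0.5 * idx * np.cos(theta)) * a(theta)

def c(theta):                                     # second-order coefficient in Eq. (2)
    return 0.5 * (-2j * np.pi * 0.5 * idx * np.cos(theta)) ** 2 * a(theta)

# second-order Taylor model (2) beats the first-order one for a small grid error
grid, eps = np.deg2rad(10.0), np.deg2rad(0.4)
err1 = np.linalg.norm(a(grid + eps) - (a(grid) + b(grid) * eps))
err2 = np.linalg.norm(a(grid + eps) - (a(grid) + b(grid) * eps + c(grid) * eps**2))
assert err2 < err1

# unitary transformation V_M of Eq. (5), even-M branch: it is unitary and maps
# the conjugate-symmetric steering vector of the centered ULA to a real vector
h = M // 2
I, J = np.eye(h), np.fliplr(np.eye(h))
V = np.vstack([np.hstack([I, J]), np.hstack([1j * J, -1j * I])]) / np.sqrt(2)
assert np.allclose(V @ V.conj().T, np.eye(M))    # V_M V_M^H = I_M
assert np.allclose((V @ a(grid)).imag, 0)        # V_M a(theta) is real
print("model checks passed")
```

The real-valued property above is exactly what makes β̄(ε) = V_M β(ε) real in (6) for a ULA whose origin is at the array center.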
3 DOA Estimation

In this section, the proposed method is elaborated based on the above off-grid model in the real field. By exploiting MA and ECM to estimate the DOAs, the proposed method can guarantee both the convergence rate and the convergence stability. Moreover, the complexity, the convergence and its uniqueness are discussed at the end of this section.

3.1 Real-Valued Sparse Bayesian Model

In this subsection, the real-valued sparse model is developed from the Bayesian perspective in the CS scenario. Without loss of generality, the model is derived for the multiple-snapshot case. By the central limit theorem, the columns ê(t), t = 1, 2, ..., 2T, of Ê are modeled as independent Gaussian noise vectors with zero mean and covariance α_0^{-1} I_N, i.e.,

\[ p(\hat E \mid \alpha_0) = \prod_{t=1}^{2T} \mathcal{N}\left(\hat e(t) \mid 0, \alpha_0^{-1} I_N\right) \tag{9} \]
where α_0 = σ^{-2} denotes the precision (inverse variance) of the noise. For convenience of analysis, the product of a Chi-square distribution and an exponential function is placed on the hyperparameter α_0:

\[ p(\alpha_0; q) = e^{\alpha_0/2} \cdot \chi^2(\alpha_0 \mid q) \tag{10} \]

where q is the degrees of freedom, \chi^2(\alpha_0 \mid q) = \left[\Gamma\left(\tfrac{q}{2}\right)\right]^{-1} \cdot 2^{-q/2} \cdot \alpha_0^{q/2-1} e^{-\alpha_0/2}, and Γ(q/2) is the gamma function. We let q → 0 to make the hyperprior non-informative. Due to (9), we have

\[ p(Y \mid \alpha_0, \varepsilon, \bar G) = \prod_{t=1}^{2T} \mathcal{N}\left(y(t) \mid \Psi\hat g(t), \alpha_0^{-1} I_N\right) \tag{11} \]
In a Bayesian formulation, a sparse prior is placed on the matrix Ḡ of interest to avoid over-fitting [20]. Under the assumption that the sources among snapshots are independent, we employ the two-stage hierarchical model for the hierarchical prior of Ḡ, p(\bar G; \nu) = \int \prod_{t=1}^{2T} p(\hat g(t) \mid \alpha)\, p(\alpha; \nu)\, d\alpha, where ν > 0. Firstly, we define a zero-mean Gaussian prior

\[ p(\bar G \mid \alpha) = \prod_{t=1}^{2T} \mathcal{N}\left(\hat g(t) \mid 0, \Lambda\right) \tag{12} \]
with α = [α1 , α2 , . . . α L ]T and Λ = diag(α). To complete the specification of the hierarchical prior, we must define the hyperprior over α that is conjugate to the
Gaussian likelihood function. Hence, a Chi-square hyperprior with ν = L degrees of freedom is considered over α, so that the associated Bayesian inference can be performed in closed form:

\[ p(\alpha; \nu) = \prod_{l=1}^{L} \chi^2(\alpha_l \mid \nu) \tag{13} \]

The prior defined by (12) and (13) implies that all columns of Ḡ share the same hierarchical prior, which is strongly peaked at the origin. As a result, the hierarchical prior is a sparse prior that concentrates most rows of Ḡ at zero. For the grid error, since there is usually no further prior information on ε, ε is assumed to be uniformly distributed,

\[ p(\varepsilon) = \mathcal{U}\left(\left[-\frac{\tau}{2}, \frac{\tau}{2}\right]^L\right) \tag{14} \]
so that the prior is non-informative.

3.2 R-MA-ECM

Having defined the priors on all unknowns, we propose the Bayesian method for DOA estimation in the real field by computing the posterior

\[ p(\bar G, \alpha_0, \alpha, \varepsilon \mid Y) = \frac{p(Y \mid \bar G, \alpha_0, \alpha, \varepsilon) \cdot p(\bar G, \alpha_0, \alpha, \varepsilon)}{p(Y)} \tag{15} \]

Since its analytical expression cannot be given [20], the posterior p(Ḡ, α_0, α, ε | Y) in (15) cannot be computed directly. Instead, it is decomposed as

\[ p(\bar G, \alpha_0, \alpha, \varepsilon \mid Y) = p(\bar G \mid \alpha_0, \alpha, \varepsilon, Y) \cdot p(\alpha_0, \alpha, \varepsilon \mid Y) \tag{16} \]

Note that Ḡ is analyzed through the posterior p(Ḡ | α_0, α, ε, Y), with all parameters except Ḡ held fixed, and the hyperparameters α_0, α and ε are estimated by maximizing p(α_0, α, ε | Y) according to the maximum a posteriori (MAP) criterion. Fortunately, with the hyperparameters α_0, α and ε assumed known and the matrix Y given, the posterior distribution of Ḡ is multivariate Gaussian with mean and covariance [8]

\[ \mu(t) = \alpha_0 \Sigma \Psi^H y(t), \quad t = 1, 2, \ldots, 2T \tag{17} \]
\[ \Sigma = \left(\alpha_0 \Psi^H \Psi + \Lambda^{-1}\right)^{-1} \tag{18} \]

where (·)^H denotes the conjugate transpose. Since Σ and μ(t), t = 1, 2, ..., 2T, are functions of α_0, α and ε, the hyperparameters must also be estimated in an iterative procedure. Then, we have p(α_0, α, ε | Y) ∝ p(Y | α_0, α, ε) p(α_0) p(α) p(ε), since Y
is independent of the hyperparameters. Due to p(Y \mid \alpha_0, \alpha, \varepsilon) = \int p(Y \mid \alpha_0, \varepsilon, \bar G)\, p(\bar G \mid \alpha)\, d\bar G, we have

\[ p(\alpha_0, \alpha, \varepsilon \mid Y) \propto p(Y \mid \alpha_0, \varepsilon, \bar G)\, p(\bar G \mid \alpha)\, p(\alpha_0)\, p(\alpha)\, p(\varepsilon) \tag{19} \]
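Under the assumptions of (17)-(18), the Gaussian posterior moments can be sketched numerically as follows; the dimensions and the random real-valued sensing matrix are illustrative placeholders, not values from the paper.

```python
import numpy as np

# Sketch of the posterior moments of Eqs. (17)-(18) for Y = Psi*G + noise.
rng = np.random.default_rng(1)
N, L, T2 = 6, 20, 10                       # measurements, grid size, 2T snapshots
Psi = rng.standard_normal((N, L))          # real sensing matrix Psi
alpha0 = 4.0                               # noise precision alpha_0
alpha = np.full(L, 0.5)                    # prior variances, Lambda = diag(alpha)
Y = rng.standard_normal((N, T2))           # observed snapshots y(t)

Sigma = np.linalg.inv(alpha0 * Psi.T @ Psi + np.diag(1.0 / alpha))  # Eq. (18)
U = alpha0 * Sigma @ Psi.T @ Y             # columns are mu(t), Eq. (17)
assert np.allclose(Sigma, Sigma.T)         # a covariance matrix is symmetric
print(Sigma.shape, U.shape)                # (20, 20) (20, 10)
```

Since the model is real-valued, the conjugate transpose Ψ^H in (17)-(18) reduces to the plain transpose here.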
where the distributions on the right-hand side are given by (10)-(14), respectively. In ECM, we make full use of the simplicity of conditional maximization (CM) by replacing the complicated M-step of EM with several CM-steps. Each CM-step maximizes a conditional expectation. A CM-step may be in closed form or may itself require iteration, but the CM-steps are faster and more stable than the corresponding M-step, since the conditional maximizations are performed over spaces of small dimension, especially when iteration is required [14]. In this paper, we decompose the M-step into 2L+1 CM-steps for estimating α_0, α and ε. At the kth iteration, α_0^{(k)} is obtained by maximizing E{log p(Y | α_0, ε, Ḡ) p(Ḡ | α) p(α_0) p(α) p(ε)} with respect to α_0, where E{·} denotes the expectation with respect to the posterior distribution of Ḡ given α_1^{(k-1)}, ..., α_L^{(k-1)}, ε_1^{(k-1)}, ..., ε_L^{(k-1)}. The other 2L conditional maximizations are performed analogously to obtain the hyperparameters α_1^{(k)}, ..., α_L^{(k)}, ε_1^{(k)}, ..., ε_L^{(k)}. Based on the above analysis, a detailed derivation is given for estimating the hyperparameters. For α_0^{(k)}, by (19) and retaining only the terms involving α_0, we equivalently maximize

\[ E\left\{\log p\left(Y \mid \alpha_0^{(k-1)}, \varepsilon, \bar G\right) p\left(\alpha_0^{(k-1)}\right)\right\} = \left(2TN + \frac{q}{2} - 1\right)\log\alpha_0 - E\left\{\sum_{t=1}^{2T}\left[y(t) - \Psi\hat g(t)\right]^H \alpha_0 I_N \left[y(t) - \Psi\hat g(t)\right]\right\} \tag{20} \]
By setting the partial derivative of (20) with respect to α_0 to zero, the update of α_0^{(k)} is expressed as

\[ \alpha_0^{(k)} = \frac{2TN + \frac{q}{2} - 1}{E\left\{\left\|Y - \Psi\bar G\right\|_F^2\right\}} \tag{21} \]

where ‖·‖_F denotes the Frobenius norm for matrices, E\{\|Y - \Psi\bar G\|_F^2\} = \|Y - \Psi\bar U^{(k)}\|_F^2 + (\alpha_0^{-1})^{(k-1)}\sum_{l=1}^{L}\left[1 - (\alpha_l^{-1})^{(k-1)}\Sigma_{ll}^{(k)}\right], Ū^{(k)} = [μ(1)^{(k)}, μ(2)^{(k)}, ..., μ(2T)^{(k)}] and Σ_{ll}^{(k)} is the lth diagonal element of Σ^{(k)}. Similarly, we can obtain the update of α^{(k)}:
\[ \alpha_l^{(k)} = \frac{L}{2} + \frac{1}{2}\left[\sqrt{1 + 2E\left\{\left\|\bar G(l,:)\right\|_2^2\right\}} - (1 + 2T)\right], \quad l = 1, 2, \ldots, L \tag{22} \]
where ‖·‖_2 denotes the Euclidean norm for vectors and E\{\|\bar G(l,:)\|_2^2\} = \|\bar U(l,:)\|_2^2 + \Sigma_{ll}^{(k)}. Subsequently, we focus on the estimation of ε^{(k)}. Denote Ã = ΦA, B̃ = ΦB, C̃ = ΦC, Δ^{(k)} = diag(ε^{(k)}) and (Δ²)^{(k)} = diag(ε^{(k)} ⊙ ε^{(k)}), where ⊙ denotes the Schur-Hadamard product. For ε^{(k)}, by (19) and ignoring the independent terms, maximizing E{log p(Y | α_0, ε^{(k)}, Ḡ) p(ε^{(k)})} amounts to minimizing

\[ E\left\{\sum_{t=1}^{2T}\left\|y(t) - \Psi\hat g(t)\right\|_2^2\right\} = \sum_{t=1}^{2T}\left\|y(t) - \left[\tilde A + \tilde B\Delta^{(k)} + \tilde C(\Delta^2)^{(k)}\right]\mu(t)^{(k)}\right\|_2^2 + \mathrm{Tr}\left\{\left[\tilde A + \tilde B\Delta^{(k)} + \tilde C(\Delta^2)^{(k)}\right]\Sigma^{(k)}\left[\tilde A + \tilde B\Delta^{(k)} + \tilde C(\Delta^2)^{(k)}\right]^H\right\} \tag{23} \]

where Tr{·} denotes the matrix trace. Based on the following identity

\[ \mathrm{Tr}\left\{M\,\Gamma^H(d)\,N\,\Gamma(d)\right\} = d^H\left(M^T \odot N\right)d \tag{24} \]

where Γ(d) = diag(d), (23) can be further written as

\[ (\varepsilon^T)^{(k)} D^{(k)}\varepsilon^{(k)} + \left[(\varepsilon\odot\varepsilon)^T\right]^{(k)} F^{(k)}(\varepsilon\odot\varepsilon)^{(k)} + (\varepsilon^T)^{(k)}\Pi^{(k)}(\varepsilon\odot\varepsilon)^{(k)} - 2(P^T)^{(k)}\varepsilon^{(k)} - 2(Q^T)^{(k)}(\varepsilon\odot\varepsilon)^{(k)} \tag{25} \]

where

\[ D^{(k)} = \left(\tilde B^H\tilde B\right)\odot\left[\bar U\bar U^H + \Sigma^{(k)}\right]^T \]
\[ F^{(k)} = \left(\tilde C^H\tilde C\right)\odot\left[\bar U\bar U^H + \Sigma^{(k)}\right]^T \]
\[ \Pi^{(k)} = \left(\tilde C^H\tilde B\right)\odot\left(\Sigma^{(k)}\right)^T + \left(\tilde B^H\tilde C\right)\odot\left(\Sigma^{(k)}\right)^T + 2\sum_{t=1}^{2T}\mathrm{diag}\left\{\mu(t)^{(k)}\right\}^H\tilde B^H\tilde C\,\mathrm{diag}\left\{\mu(t)^{(k)}\right\} \]
\[ P^{(k)} = \sum_{t=1}^{2T}\mathrm{diag}\left\{\mu(t)^{(k)}\right\}^H\tilde B^H\left[y(t) - \Psi\mu(t)^{(k)}\right] - \mathrm{diag}\left\{\tilde B^H\tilde A\Sigma^{(k)}\right\} \]
\[ Q^{(k)} = \sum_{t=1}^{2T}\mathrm{diag}\left\{\mu(t)^{(k)}\right\}^H\tilde C^H\left[y(t) - \Psi\mu(t)^{(k)}\right] - \mathrm{diag}\left\{\tilde C^H\tilde A\Sigma^{(k)}\right\} \]

where diag{B̃^H ÃΣ^{(k)}} and diag{C̃^H ÃΣ^{(k)}} denote the column vectors composed of the diagonal elements of the matrices B̃^H ÃΣ^{(k)} and C̃^H ÃΣ^{(k)}, respectively.
Therefore, ε is updated by

\[ \varepsilon = \arg\min_{\varepsilon\in\left[-\frac{\tau}{2},\frac{\tau}{2}\right]^L}\left\{\varepsilon^T D\varepsilon + (\varepsilon\odot\varepsilon)^T F(\varepsilon\odot\varepsilon) + \varepsilon^T\Pi(\varepsilon\odot\varepsilon) - 2P^T\varepsilon - 2Q^T(\varepsilon\odot\varepsilon)\right\} \tag{26} \]
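The trace identity (24), which turns the trace term of (23) into the quadratic form (25), can be checked numerically; the matrices below are random placeholders.

```python
import numpy as np

# Numerical check of Eq. (24): Tr{M Gamma^H(d) N Gamma(d)} = d^H (M^T ⊙ N) d,
# with Gamma(d) = diag(d) and ⊙ the Schur-Hadamard (elementwise) product.
rng = np.random.default_rng(4)
L = 6
M = rng.standard_normal((L, L)) + 1j * rng.standard_normal((L, L))
N = rng.standard_normal((L, L)) + 1j * rng.standard_normal((L, L))
d = rng.standard_normal(L) + 1j * rng.standard_normal(L)
G = np.diag(d)                                   # Gamma(d)

lhs = np.trace(M @ G.conj().T @ N @ G)
rhs = d.conj() @ ((M.T * N) @ d)                 # M^T ⊙ N via elementwise *
print(np.isclose(lhs, rhs))                      # True
```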
As one may note, the mean and covariance are computed from the current values of α_0, α and ε according to (17) and (18), and the updates of α_0, α and ε are then given by (21), (22) and (26), respectively. In ECM, an iterative procedure is performed between (17)-(18) and (21), (22), (26).

In MA, denote lim_{k→∞} α_0^{(k)} = α̂_0 and α_0^{(k+1)} = f(α_0^{(k)}), and let Υ(α_0^{(k)}) be the gradient of f(α_0) at α_0^{(k)}. By the Taylor expansion at α̂_0, we have the following approximate formulas at the kth iteration:

\[ \alpha_0^{(k+1)} - \hat\alpha_0 \approx \Upsilon\left(\alpha_0^{(k)} - \hat\alpha_0\right) \tag{27} \]
\[ \kappa^{(k+1)} = \alpha_0^{(k+1)} - \alpha_0^{(k)} \approx \Upsilon\kappa^{(k)} = \Upsilon\left(\alpha_0^{(k)} - \alpha_0^{(k-1)}\right) \tag{28} \]

It can be deduced from (27) and (28) that

\[ \hat\alpha_0 = \alpha_0^{(k)} + \sum_{l=1}^{\infty}\kappa^{(k+l)} \approx \alpha_0^{(k)} + \sum_{l=1}^{\infty}\Upsilon^l\left(\alpha_0^{(k)}\right)\kappa^{(k)} \tag{29} \]

Then, we have

\[ \alpha_0^{(k+1)} = \alpha_0^{(k)} + \Upsilon(\hat\alpha_0)\left[\alpha_{0,\mathrm{ECM}}^{(k)} - \alpha_0^{(k)}\right] \tag{30} \]

where α_0^{(k)} is the kth-iteration estimate of MA and α_{0,ECM}^{(k)} is the kth-iteration estimate of ECM. Note that the estimate of α_0 is obtained by using the ECM estimate at each iteration; consequently, ECM is the foundation on which MA is implemented. Following a similar procedure, it is easy to obtain the updates of α and ε in MA.

ECM has a slow convergence rate in the late iterations, while MA converges fast. ECM converges to a stable value, but MA cannot guarantee stable convergence. Hence, MA and ECM form a "perfect" combination for guaranteeing both the convergence rate and the convergence stability. To take advantage of both, we adopt ECM in the early iterations and switch to MA when the convergence rate becomes slow. Since MA cannot guarantee the convergence stability, ECM is employed again to restore stability. Two main issues must be addressed for the implementation of the proposed method. The first is to decide when the convergence rate has become slow. Due to the property of MA, we replace ECM with MA once the following criterion is satisfied
\[ \frac{\left\|\upsilon^{(k+1)} - \upsilon^{(k)}\right\|_2}{\left\|\upsilon^{(k)}\right\|_2} \le \gamma \tag{31} \]

where υ^{(k)} = [α_0^{(k)}; α^{(k)}; ε^{(k)}]. The second issue arises from the fact that the ML function cannot increase stably in MA: we switch from MA back to ECM for the convergence stability when MA exhibits a small-scale fluctuation. The specific steps of R-MA-ECM are as follows.

(1) Initialize k = 0, γ = 10^{-3}, q = 10^{-4} and the hyperparameters.
(2) Calculate μ(t), t = 1, 2, ..., 2T, and Σ according to (17) and (18).
(3) Update α_0, α and ε according to (21), (22) and (26), respectively.
(4) If ‖υ^{(k+1)} − υ^{(k)}‖_2 / ‖υ^{(k)}‖_2 ≤ γ, perform the MA iteration. Otherwise, go to step (6).
(5) If the ML function increases stably, update α_0, α and ε and return to step (4).
(6) If ‖α^{(k+1)} − α^{(k)}‖_2 / ‖α^{(k)}‖_2 ≤ γ, stop the iteration. Otherwise, set k = k + 1 and return to step (2).
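The MA acceleration can be illustrated on a toy fixed-point iteration: Aitken-type extrapolation sums the geometrically shrinking increments of (27)-(29) in closed form. The contraction f below is an illustrative stand-in, not the actual ECM update.

```python
import numpy as np

# Toy illustration of Aitken-type acceleration of a linearly convergent
# fixed-point iteration x_{k+1} = f(x_k), in the spirit of Eqs. (27)-(30).
f = lambda x: 0.9 * x + 0.1 * np.cos(x)        # contraction; fixed point solves x = cos(x)

# reference fixed point by long plain iteration
xstar = 1.0
for _ in range(2000):
    xstar = f(xstar)

# Aitken extrapolation from three successive iterates x0, x1, x2
x0 = 1.0
x1, x2 = f(x0), f(f(x0))
aitken = x0 - (x1 - x0) ** 2 / (x2 - 2 * x1 + x0)

# the extrapolated value is far closer to x* than the plain iterate x2
print(abs(aitken - xstar) < abs(x2 - xstar))   # True
```

This is why the MA step accelerates the late, slowly converging ECM iterations, while the ECM steps around it supply the stable updates that the extrapolation needs.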
3.3 Discussion

At each iteration, an L × L matrix inversion is required to update Σ, so the computational complexity of solving (18) is O(L³). To reduce this cost, the Woodbury inversion identity is exploited: (α_0 Ψ^H Ψ + Λ^{-1})^{-1} = Λ − ΛΨ^H W^{-1}ΨΛ, where W = α_0^{-1} I_N + ΨΛΨ^H. Since at most K entries of ε are nonzero, corresponding to the locations of the K sources, the dimensions of ε, D, F, Π, P and Q can be reduced to K and K × K to alleviate the computational burden. By setting the partial derivative of (26) with respect to ε to zero, we have ε ≈ ε̃ = 2[2D + 4F + 3Π − 4diag(Q)]^{-1}P. It is worth pointing out that ε must satisfy the constraint ε ∈ [−τ/2, τ/2]^L, so that we have

\[ \varepsilon_l = \begin{cases} -\frac{\tau}{2}, & \text{if } \tilde\varepsilon_l < -\frac{\tau}{2} \\ \tilde\varepsilon_l, & \text{if } -\frac{\tau}{2} \le \tilde\varepsilon_l \le \frac{\tau}{2} \\ \frac{\tau}{2}, & \text{if } \tilde\varepsilon_l > \frac{\tau}{2} \end{cases} \tag{32} \]
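The Woodbury identity and the clipping in (32) can be sketched as follows; the dimensions and the unconstrained solution values are illustrative placeholders.

```python
import numpy as np

# Check of the Woodbury identity used in Sect. 3.3 for a real-valued Psi:
# (alpha0 Psi^T Psi + Lambda^{-1})^{-1} = Lambda - Lambda Psi^T W^{-1} Psi Lambda,
# with W = alpha0^{-1} I_N + Psi Lambda Psi^T (an N x N inverse instead of L x L).
rng = np.random.default_rng(2)
N, L = 5, 12
Psi = rng.standard_normal((N, L))
alpha0 = 3.0
alpha = rng.uniform(0.1, 1.0, L)
Lam = np.diag(alpha)

direct = np.linalg.inv(alpha0 * Psi.T @ Psi + np.diag(1.0 / alpha))
W = np.eye(N) / alpha0 + Psi @ Lam @ Psi.T
woodbury = Lam - Lam @ Psi.T @ np.linalg.inv(W) @ Psi @ Lam
print(np.allclose(direct, woodbury))          # True

# clipping of the unconstrained solution onto [-tau/2, tau/2], Eq. (32)
tau = 1.0                                     # grid interval (illustrative)
eps_tilde = np.array([-0.9, 0.2, 0.7])        # hypothetical unconstrained values
eps = np.clip(eps_tilde, -tau / 2, tau / 2)
print(eps)                                    # [-0.5  0.2  0.5]
```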
It is clear that the optimal estimate of R-MA-ECM is ultimately obtained in ECM. Since the function p(Ḡ, α_0, α, ε | Y) increases monotonically at each iteration by the property of ECM, R-MA-ECM guarantees convergence. Furthermore, the log-likelihood L(α_0, α, ε) can be given analytically as

\[ L(\alpha_0, \alpha, \varepsilon) = \log p(Y \mid \alpha_0, \alpha, \varepsilon) = \log\int p(Y \mid w, \varepsilon)\, p(w \mid \alpha_0, \alpha)\, dw = \mathrm{const} - \frac{1}{2}\sum_{t=1}^{2T}\left[\log|W| + y^H(t) W^{-1} y(t)\right] \tag{33} \]
and it is a convex function. All conditional maximizations are unique, and all conditional functions are space-filling. As a result, R-MA-ECM guarantees the uniqueness of the convergence. Regarding the complexity of R-MA-ECM, since MA has a faster convergence rate and much lower complexity than ECM, we only consider the complexity of ECM. By utilizing the Woodbury inversion identity and the constrained ε_l in (32), the complexity of ECM can be significantly reduced. Instead of a complicated M-step of EM, several CM-steps of ECM are exploited to avoid high complexity, especially when the M-step requires iteration. Hence, the complexity of R-MA-ECM is quite satisfactory, which is verified in the simulations.
4 Simulation

In this section, we present simulation results to illustrate the superior performance of the proposed method in comparison with OGSBI-SVD, l1-SVD and CS-MUSIC in the scenario with known source number. For all simulations, a ULA of M = 8 sensors with interelement spacing d = λ/2 is employed. In R-MA-ECM, the hyperparameters α_0, α and ε are initialized as 2T/\sum_{t=1}^{2T}\mathrm{var}\{y(t)\}, \frac{1}{2T}\sum_{t=1}^{2T}\Psi^H y(t) and 0, respectively. Specifically, the proposed method is insensitive to the initializations of the hyperparameters. The DOA space [−90°, 90°] is uniformly discretized with grid interval τ = 1° by default. All simulation results are obtained from 100 Monte Carlo runs. In the simulations, the root-mean-squared error (RMSE) is the key performance index, defined as

\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{100}\sum_{k=1}^{K}\left(\bar\theta_{k,i} - \theta_k\right)^2}{100K}} \tag{34} \]
where θ̄_{k,i} is the estimate of θ_k in the ith Monte Carlo run.

In the first simulation, we show the probabilities of success versus SNR and versus the number of snapshots for the four methods and for MUSIC, which serves as a representative non-CS estimation method. Since the requirements of DOA estimation depend on the application, it is appropriate to compare the probabilities of success with a non-CS estimation method. Consider three sources impinging on the ULA from directions −40.2°, 3.3° and 60.7°, where the latter two sources are coherent and the first is independent of the others. Figure 2 presents the probabilities of success versus SNR with the number of snapshots fixed at 100. The probabilities of success versus the number of snapshots are shown in Fig. 3 with the SNR fixed at 5 dB. The following facts can be drawn from Figs. 2 and 3: although MUSIC has the lowest probability of success due to the grid error, all methods can estimate the DOAs accurately in every Monte Carlo run at high SNR or with a large number of snapshots, and R-MA-ECM outperforms OGSBI-SVD, l1-SVD, CS-MUSIC and MUSIC in terms of the probability of success at low SNR or with a small number of snapshots.

The RMSE of the four methods is analyzed in the second simulation. One independent source from −83.6° and two coherent sources from 10.3° and 85.7° impinge on the
Fig. 2 Probability of success versus SNR for R-MA-ECM, OGSBI-SVD, l1-SVD, CS-MUSIC and MUSIC with number of snapshots 100

Fig. 3 Probability of success versus number of snapshots for R-MA-ECM, OGSBI-SVD, l1-SVD, CS-MUSIC and MUSIC with SNR 5 dB
ULA. Figures 4 and 5 illustrate the RMSE of the four methods versus SNR and versus the number of snapshots, respectively. We see from Figs. 4 and 5 that R-MA-ECM gives more precise estimates than OGSBI-SVD, l1-SVD and CS-MUSIC. More importantly, this advantage is most significant at low SNR or with a small number of snapshots.
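For reference, the RMSE of (34) can be computed from a matrix of per-run estimates as in the following sketch, where the estimates are synthetic stand-ins rather than outputs of any of the compared methods.

```python
import numpy as np

# Sketch of the RMSE in Eq. (34) over 100 Monte Carlo runs; the per-run
# estimates below are synthetic (true DOAs perturbed by small Gaussian errors).
true_doas = np.array([-83.6, 10.3, 85.7])      # K true directions (deg)
runs, K = 100, true_doas.size
rng = np.random.default_rng(3)
estimates = true_doas + 0.1 * rng.standard_normal((runs, K))  # theta_bar_{k,i}

rmse = np.sqrt(np.sum((estimates - true_doas) ** 2) / (runs * K))
print(f"RMSE = {rmse:.3f} deg")
```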
Fig. 4 RMSE versus SNR for R-MA-ECM, OGSBI-SVD, l1-SVD and CS-MUSIC with number of snapshots 100

Fig. 5 RMSE versus number of snapshots for R-MA-ECM, OGSBI-SVD, l1-SVD and CS-MUSIC with SNR 10 dB
Subsequently, the relation between the RMSE of the four methods and the grid interval is shown in Fig. 6. It is well known that the lower bound can be easily calculated as τ²/12 in the case of a large number of snapshots by assuming that the true DOA is uniformly distributed, and this lower bound is shared by the on-grid methods. As can be seen from Fig. 6, the RMSE of R-MA-ECM and OGSBI-SVD falls below the lower bound for l1-SVD as long as the grid interval is no less than 1°. Moreover, the RMSE of CS-MUSIC and l1-SVD grows as the grid interval increases, whereas the RMSE of the other two methods remains almost unchanged. This is because the off-grid model proposed in this paper overcomes the modeling error, which may cause disastrous consequences in the on-grid model.

Fig. 6 RMSE versus grid interval for R-MA-ECM, OGSBI-SVD, CS-MUSIC, l1-SVD and the lower bound for l1-SVD

In order to compare the complexity of the four methods, the computation time versus the grid interval is shown in Table 1 with the SNR fixed at 5 dB and the number of snapshots at 100, whereas Table 2 gives the computation time versus the number of snapshots with the SNR fixed at 5 dB. The computation time is obtained with MATLAB 7.8 (2009a) on a 2.8 GHz PC with 4 GB of RAM.

Table 1 Computation time (sec) versus the grid interval

Grid interval (deg)   l1-SVD   CS-MUSIC   OGSBI-SVD   R-MA-ECM
1                     0.415    0.248      0.673       0.146
2                     0.264    0.129      0.196       0.083
4                     0.187    0.057      0.054       0.017

It is easy to see from Tables 1 and 2 that the computation time of R-MA-ECM is shorter than that of the other three methods, so the complexity of R-MA-ECM is the lowest of the four. Moreover, the computation time decreases as the grid becomes coarser and as the number of snapshots becomes smaller.

The convergence behavior is analyzed in the last simulation. Since CS-MUSIC is a subspace-based method, we only compare the convergence behavior of R-MA-ECM, OGSBI-SVD and l1-SVD. Figure 7 shows the RMSE as a function of the number of iterations with the SNR fixed at 10 dB and the number of snapshots at 100. Figure 7 indicates that all three methods tend toward a stable convergence as the number of iterations increases. Moreover, R-MA-ECM exhibits a sharp fall at the fifth iteration and converges faster than the other two methods.
Table 2 Computation time (sec) versus number of snapshots

Number of snapshots   l1-SVD   CS-MUSIC   OGSBI-SVD   R-MA-ECM
100                   0.439    0.104      0.146       0.098
150                   0.443    0.137      0.152       0.118
200                   0.451    0.162      0.157       0.131

Fig. 7 RMSE versus number of iterations for R-MA-ECM, OGSBI-SVD and l1-SVD with SNR 10 dB and number of snapshots 100
5 Conclusion

In this paper, a novel real-valued sparse Bayesian method based on the off-grid model is proposed for DOA estimation in CS. To reduce the modeling error, the off-grid model is reformulated in terms of the second-order Taylor expansion. A unitary transformation is utilized to yield a real-valued problem so that the Bayesian perspective can be applied conveniently in CS. Joint sparsity is exploited for DOA estimation under the condition that the sources are independent and share the same sparse prior among different snapshots. Specifically, a full posterior density function is provided in the Bayesian framework. The proposed method guarantees both the convergence rate and the convergence stability in the iterative procedure. Simulation results show that the proposed method converges stably and achieves better estimation performance in terms of probability of success and RMSE, as well as lower computational complexity. Although our work bears some relevance to Bayesian CS [8], in which CS is considered from a Bayesian perspective, it differs from existing works in two main aspects. First, current works for off-grid DOA estimation only consider a single iteration scheme, e.g., STLS [23] and EM [22], while our work establishes a clear connection between MA and ECM. Second, our work performs off-grid DOA estimation in
the real field by a unitary transformation. Our future research topics are to implement R-MA-ECM for wideband sources and to develop a fast version of R-MA-ECM.

Acknowledgments
This work was supported by Aviation Science Foundation of China (201401P6001).
References

1. L. Bai, C.Y. Peng, S. Biswas, Association of DOA estimation from two ULAs. IEEE Trans. Instrum. Meas. 57(6), 1094-1101 (2008)
2. E. Candès, J. Romberg, T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52(2), 489-509 (2006)
3. E. Candès, T. Tao, Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Trans. Inf. Theory 52(12), 5406-5425 (2006)
4. D.L. Donoho, Compressed sensing. IEEE Trans. Inf. Theory 52(4), 1289-1306 (2006)
5. M.A. Davenport, M.B. Wakin, Analysis of orthogonal matching pursuit using the restricted isometry property. IEEE Trans. Inf. Theory 56(9), 4395-4401 (2010)
6. J.D. Dai, W.C. Xu, D. Zhao, Real-valued DOA estimation for uniform linear array with unknown coupling. Signal Process. 92(9), 2056-2065 (2012)
7. K.C. Huarng, C.C. Yeh, A unitary transformation method for angle-of-arrival estimation. IEEE Trans. Signal Process. 39(4), 975-977 (1991)
8. S. Ji, Y. Xue, L. Carin, Bayesian compressive sensing. IEEE Trans. Signal Process. 56(6), 2346-2355 (2008)
9. J.M. Kim, O.K. Lee, J.C. Ye, Compressive MUSIC: revisiting the link between compressive sensing and array signal processing. IEEE Trans. Inf. Theory 58(1), 278-301 (2012)
10. H. Krim, M. Viberg, Two decades of array signal processing research: the parametric approach. IEEE Signal Process. Mag. 13(4), 67-94 (1996)
11. K. Lee, Y. Bresler, M. Junge, Subspace methods for joint sparse recovery. IEEE Trans. Inf. Theory 58(6), 3613-3640 (2012)
12. Z.M. Liu, Y.Y. Zhou, A unified framework and sparse Bayesian perspective for direction-of-arrival estimation in the presence of array imperfections. IEEE Trans. Signal Process. 61(15), 3786-3798 (2013)
13. D. Malioutov, M. Cetin, A.S. Willsky, A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans. Signal Process. 53(8), 3010-3022 (2005)
14. G.J. McLachlan, T. Krishnan, The EM Algorithm and Extensions (Wiley, New Jersey, 2008)
15. E.T. Northardt, I. Bilik, Y.I. Abramovich, Spatial compressive sensing for direction-of-arrival estimation with bias mitigation via expected likelihood. IEEE Trans. Antennas Propag. 61(7), 3828-3838 (2013)
16. S. Pazos, M. Hurtado, C. Muravchik, DOA estimation using random linear arrays via compressive sensing. IEEE Latin Am. Trans. 12(5), 859-863 (2014)
17. R. Roy, T. Kailath, ESPRIT-estimation of signal parameters via rotational invariance techniques. IEEE Trans. Acoust. Speech Signal Process. 37(7), 984-995 (1989)
18. R.O. Schmidt, Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas Propag. 34(3), 276-280 (1986)
19. Z. Sun, Z. Yang, Study of nonlinear parameter identification using UKF and maximum likelihood method, in Proceedings of the IEEE International Conference on Control Applications (2010), pp. 671-676
20. M.E. Tipping, Sparse Bayesian learning and the relevance vector machine. J. Mach. Learn. Res. 1, 211-244 (2001)
21. J.A. Tropp, A.C. Gilbert, M.J. Strauss, Algorithms for simultaneous sparse approximation. Part I: greedy pursuit. Signal Process. 86(3), 572-588 (2006)
22. Z. Yang, L. Xie, C. Zhang, Off-grid direction of arrival estimation using sparse Bayesian inference. IEEE Trans. Signal Process. 61(1), 38-43 (2013)
23. H. Zhu, G. Leus, G.B. Giannakis, Sparsity-cognizant total least-squares for perturbed compressive sampling. IEEE Trans. Signal Process. 59(5), 2002-2016 (2011)
24. Y. Zhang, Z.F. Ye, X. Xu, N. Lu, Off-grid DOA estimation using array covariance matrix and block-sparse Bayesian learning. Signal Process. 98, 197-201 (2014)