Circuits Syst Signal Process, DOI 10.1007/s00034-017-0633-3 (Short Paper)

Line Spectral Estimation Based on Compressed Sensing with Deterministic Sub-Nyquist Sampling

Shan Huang · Hong Sun · Haijian Zhang · Lei Yu

Received: 4 January 2017 / Revised: 6 August 2017 / Accepted: 7 August 2017
© Springer Science+Business Media, LLC 2017
Abstract As an alternative to traditional sampling theory, compressed sensing makes it possible to acquire a much smaller amount of data while still estimating the spectra of frequency-sparse signals accurately. However, previous methods mostly focus on the single measurement vector case and rely by default on random sampling, which is impractical. In this paper, we employ a deterministic and simple sampling scheme: three sub-Nyquist channels with pairwise coprime undersampling ratios. A multi-task model is utilized to unite the three-channel samples and address the resulting multiple measurement vector problem. The model is solved by the proposed complex-valued multi-task algorithm based on variational Bayesian inference. Simulations show that this method is feasible and robust at quite low sampling rates.

Keywords Line spectral estimation · Compressed sensing · Sub-Nyquist sampling
This work was supported by the National Natural Science Foundation of China under Grant Number 61501335.

Corresponding author: Haijian Zhang ([email protected])
Shan Huang ([email protected]) · Hong Sun ([email protected]) · Lei Yu ([email protected])

School of Electronic Information, Wuhan University, Wuhan, China
1 Introduction

Line spectral estimation has numerous applications in sonar, radar, underwater surveillance, communications, geophysical exploration, speech analysis, nuclear physics and other fields [12,14,30,31]. In general, the sampling rate for a signal must be greater than twice the highest frequency (i.e., the Nyquist rate). However, when the bandwidth of the signal is large, the cost of the sampling hardware increases greatly. The emerging compressed sensing (CS) theory runs counter to this common knowledge in data acquisition and asserts that one can recover sparse signals from far fewer samples or measurements than traditional methods require [5,7]. Many researchers have utilized CS to estimate the spectra of frequency-sparse signals [10,26,32]. A source localization method based on a sparse representation of sensor measurements with a redundant basis was proposed in [15]. The authors in [2] addressed the problem of estimating spectral lines from irregularly sampled data in a sparse signal representation framework, and analyzed the uniqueness conditions of the sparse solution for different patterns of samples. In [6], the effect of "basis mismatch" caused by grid discretization was discussed. To deal with basis mismatch, some articles used grid refinement to approximate the true frequencies [8,9,19,29]. Methods based on the atomic norm transform line spectral estimation into a convex semidefinite program, which deals with continuous-valued frequencies and completely eliminates basis mismatch [3,20,28]. However, these methods usually require random sampling, which is difficult to implement; for example, a new type of data acquisition system called a random demodulator was studied in [22] to ensure the randomness of sampling. Meanwhile, most researchers focus on the single measurement vector (SMV) case or on direction of arrival (DOA) estimation in the spatial domain.
In fact, multiple measurement vectors (MMV) in the time domain are very common in practical applications, such as speech and audio signal processing and communications. On the other hand, a deterministic sampling scheme called coprime sampling has been proposed and analyzed [16,25]. As a simple and effective deterministic undersampling scheme, coprime sampling allows us to sample a signal sparsely and estimate its parameters at high resolution [17]. It is therefore natural to introduce coprime sampling into CS theory. In this paper, we focus on line spectral estimation with deterministic sub-Nyquist sampling. Unlike [25], we use three undersampled channels instead of two. The MMV case is considered to make full use of the sample data: three-channel samples are integrated through a multi-task model, and an algorithm based on variational Bayesian inference is then employed to solve it. The proposed method requires only three sub-Nyquist channels with pairwise coprime undersampling ratios, so the hardware is convenient to implement. The paper is organized as follows: Sect. 2 gives the sampling scheme and the corresponding model. Section 3 presents the proposed algorithm. Simulation results are shown in Sect. 4. The last section draws conclusions.
2 Sampling Strategy

In the line spectral estimation problem, the observed signal can be considered as a summation of K complex sinusoids:
Fig. 1 Process diagram of the proposed sampling scheme
y(t) = \sum_{k=1}^{K} c_k e^{j\omega_k t},    (1)
where j = \sqrt{-1}, and \omega_k \in [0, 2\pi) and c_k denote the angular frequency and the complex amplitude of the kth component, respectively. When t = 1, 2, \ldots, M, this corresponds to normal (Nyquist-rate) sampling, as studied in conventional methods such as the MUSIC algorithm [18]. In CS-based methods, t is selected at random from the index set [N] \triangleq \{1, 2, \ldots, N\} (N is related to the resolution). However, this pattern of sampling often leads to complex hardware. The proposed deterministic scheme samples through three sub-Nyquist channels with coprime undersampling ratios p, q, r. In other words, we need the samples with indices

T = \{p, 2p, \ldots\} \cup \{q, 2q, \ldots\} \cup \{r, 2r, \ldots\}.    (2)
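As a concrete illustration (our own sketch; the function name is hypothetical), the merged sample stream of (2) can be generated by taking the union of the three channels' sample times:

```python
def sample_indices(p, q, r, horizon):
    """Sorted union of the three channels' sample times (Eq. (2)) up to `horizon`."""
    T = set()
    for m in (p, q, r):
        T |= set(range(m, horizon + 1, m))
    return sorted(T)

# With p, q, r = 7, 8, 9 the merged stream begins 7, 8, 9, 14, 16, 18, ...
idx = sample_indices(7, 8, 9, 100)
```

Note that the stream is nonuniform but fully deterministic, which is what makes the hardware simple.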
It is worth mentioning that sampling through two channels with coprime undersampling ratios sometimes also yields correct results, but three channels guarantee a high probability of success; this is consistent with the conclusion of [11]. The process diagram of sampling is shown in Fig. 1. After sampling, the samples are in chronological order, and we select M sequential samples with indices t_1, t_2, \ldots, t_M to constitute a column vector

y = [y(t_1), y(t_2), \ldots, y(t_M)]^T,    (3)
where [\cdot]^T denotes the transpose operation. Assume that the frequencies are aligned with a uniform grid, i.e.,
\omega_n = 2\pi n / N, \quad n = 1, 2, \ldots, N.    (4)
The observation model can be written more compactly as

y = \Phi s,    (5)
where

\Phi = \begin{bmatrix}
e^{j\omega_1 t_1} & \cdots & e^{j\omega_N t_1} \\
e^{j\omega_1 t_2} & \cdots & e^{j\omega_N t_2} \\
\vdots & \ddots & \vdots \\
e^{j\omega_1 t_M} & \cdots & e^{j\omega_N t_M}
\end{bmatrix},    (6)
and s is a K-sparse vector composed of the c_k and zeros. In general, M < N, and (5) is solved as a sparse recovery problem. To improve the probability of success, we utilize more samples to form multiple tasks and combine the effects of these tasks, namely

y_l = \Phi_l s_l, \quad l = 1, 2, \ldots, L,    (7)
where

y_l = [y(t_l), y(t_{l+1}), \ldots, y(t_{l+M-1})]^T,    (8)

\Phi_l = \begin{bmatrix}
1 & \cdots & 1 \\
e^{j\omega_1 (t_{l+1} - t_l)} & \cdots & e^{j\omega_N (t_{l+1} - t_l)} \\
\vdots & \ddots & \vdots \\
e^{j\omega_1 (t_{l+M-1} - t_l)} & \cdots & e^{j\omega_N (t_{l+M-1} - t_l)}
\end{bmatrix},    (9)
and s_l = \mathrm{diag}(e^{j\omega_1 t_l}, e^{j\omega_2 t_l}, \ldots, e^{j\omega_N t_l}) \cdot s. The total number of samples is L + M - 1. Note that all the s_l share the same sparsity profile, and the \Phi_l repeat after a certain period. M should be as large as possible, but an appropriate value of M must ensure that no \Phi_l contains duplicate rows. The joint estimation can achieve satisfactory results even though some \Phi_l may not have good properties. Next, we give an example to clearly illustrate our sampling scheme. Given the values of p, q, r, N and L, the initial value of M can be selected by inclusion-exclusion as

M = \left\lfloor \frac{N}{p} \right\rfloor + \left\lfloor \frac{N}{q} \right\rfloor + \left\lfloor \frac{N}{r} \right\rfloor - \left\lfloor \frac{N}{pq} \right\rfloor - \left\lfloor \frac{N}{pr} \right\rfloor - \left\lfloor \frac{N}{qr} \right\rfloor + \left\lfloor \frac{N}{pqr} \right\rfloor,

and one then checks whether each sensing matrix \Phi_l contains duplicate rows; if one of them does, the value of M is reduced. Suppose that the three undersampling ratios are p = 7, q = 8 and r = 9, and the number of discrete grid points is N = 100; the configuration of the samples is
[y_1 \; y_2 \; \cdots \; y_L] = \begin{bmatrix}
y(7) & y(8) & y(9) & \cdots \\
y(8) & y(9) & y(14) & \cdots \\
y(9) & y(14) & y(16) & \cdots \\
y(14) & y(16) & y(18) & \cdots \\
y(16) & y(18) & y(21) & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix},    (10)

where each column contains M consecutive samples.
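The initialization and shrinking of M just described can be sketched in code (our own illustration; `sample_times`, `initial_M` and `choose_M` are hypothetical names, and tasks are indexed from zero):

```python
def sample_times(p, q, r, horizon):
    """Chronologically ordered sample times from the three coprime channels."""
    T = set()
    for m in (p, q, r):
        T |= set(range(m, horizon + 1, m))
    return sorted(T)

def initial_M(p, q, r, N):
    """Initial M: inclusion-exclusion count of channel samples in {1, ..., N}."""
    return (N // p + N // q + N // r
            - N // (p * q) - N // (p * r) - N // (q * r)
            + N // (p * q * r))

def choose_M(p, q, r, N, L):
    """Largest M <= initial M such that no task's sensing matrix has duplicate
    rows; rows of Phi_l coincide iff two offsets t[l+m] - t[l] agree modulo N."""
    t = sample_times(p, q, r, 3 * N)
    M = initial_M(p, q, r, N)
    while M > 1:
        if all(len({(t[l + m] - t[l]) % N for m in range(M)}) == M
               for l in range(L)):
            return M
        M -= 1
    return 1
```

For p, q, r = 7, 8, 9, N = 100 and L = 10 this starts from M = 34 and settles at M = 33, matching the example.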
When 10 tasks are used, i.e., L = 10, it is proper to choose M = 33 to prevent the corresponding sensing matrices from having duplicate rows. In this case, the measurement vector and the sensing matrix of the tenth task are

y_{10} = [y(28), y(32), \ldots, y(126)]^T,    (11)
and

\Phi_{10} = \begin{bmatrix}
1 & \cdots & 1 \\
e^{j\omega_1 \cdot 4} & \cdots & e^{j\omega_N \cdot 4} \\
\vdots & \ddots & \vdots \\
e^{j\omega_1 \cdot 98} & \cdots & e^{j\omega_N \cdot 98}
\end{bmatrix}.    (12)
Obviously, if M = 34, the last row of \Phi_{10} is the same as the first row. The sensing matrix \Phi_l amounts to picking a subset of rows from the N × N Fourier matrix. The sensing properties of this deterministic partial Fourier matrix approximate those of a random partial Fourier matrix, which has been proved to be suitable as a CS matrix [4].
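As a sanity check on this example, the sketch below (our own illustration; zero-based indexing, hypothetical names) builds \Phi_l as rows picked from the N-column Fourier matrix and confirms the tenth task's sample times:

```python
import numpy as np

def sensing_matrix(t, l, M, N):
    """Phi_l from Eq. (9): row m holds exp(j * omega_n * (t[l+m] - t[l])) with
    omega_n = 2*pi*n/N, i.e., rows of the N x N Fourier matrix indexed by the
    offsets modulo N. Zero-based task index l here."""
    offsets = np.array([t[l + m] - t[l] for m in range(M)])
    n = np.arange(1, N + 1)
    return np.exp(1j * 2 * np.pi * np.outer(offsets, n) / N)

# Example configuration from Sect. 2: p, q, r = 7, 8, 9 and N = 100.
t = sorted(set(range(7, 200, 7)) | set(range(8, 200, 8)) | set(range(9, 200, 9)))
Phi10 = sensing_matrix(t, 9, 33, 100)   # tenth task (l = 10 in 1-based indexing)
```

The tenth task's times run from t_10 = 28 to t_42 = 126, matching (11), and the first row of \Phi_{10} is all ones, matching (12); taking M = 34 instead would add the offset 128 - 28 = 100 ≡ 0 (mod 100), duplicating the first row.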
3 Proposed Algorithm

In this section, a complex-valued multi-task algorithm based on variational Bayesian inference is proposed to solve the above model. In [13], the multi-task Bayesian CS algorithm utilized empirical Bayesian analysis to recover multiple real-valued sparse solutions. The authors of [27] proposed a complex multi-task Bayesian CS algorithm that jointly treats the real and imaginary parts. We directly address the problem within the complex hierarchical Bayesian framework. Assuming the measurement noise to be complex Gaussian with zero mean and variance \beta^{-1}, the model (7) can be rewritten as

y_l = \Phi_l s_l + \epsilon_l, \quad l = 1, 2, \ldots, L.    (13)
The likelihood function for the parameters s_l and \beta may be expressed as

p(y_l | s_l, \beta) = (\pi/\beta)^{-M} \exp\left(-\beta \| y_l - \Phi_l s_l \|_2^2\right).    (14)
The hierarchical Gaussian prior is typically imposed on s_l in sparse Bayesian learning to induce sparsity. Denote the prior variance of the ith element of s_l by \alpha_i^{-1}; the prior distribution of s_l is then

p(s_l | \alpha) = \frac{1}{\pi^N |A|^{-1}} \exp\left(-s_l^H A s_l\right),    (15)
where \alpha = [\alpha_1, \ldots, \alpha_N]^T and A = \mathrm{diag}(\alpha). Gamma priors are placed on the unknown hyperparameters \alpha and \beta, i.e.,

p(\alpha | a, b) = \prod_{i=1}^{N} \mathrm{Gamma}(\alpha_i | a, b),    (16)

p(\beta | c, d) = \mathrm{Gamma}(\beta | c, d).    (17)
The parameters a, b, c and d are generally set to very small values (e.g., a = b = c = d = 10^{-6}), which amounts to assuming uninformative priors for \alpha and \beta [21]. Define Y = [y_1, \ldots, y_L] and S = [s_1, \ldots, s_L]; the joint probability of measurements, parameters and hyperparameters is

p(Y, S, \alpha, \beta) = p(Y | S, \beta) \cdot p(S | \alpha) \cdot p(\alpha | a, b) \cdot p(\beta | c, d)
= \prod_{l=1}^{L} p(y_l | s_l, \beta) \cdot \prod_{l=1}^{L} p(s_l | \alpha) \cdot p(\alpha | a, b) \cdot p(\beta | c, d).    (18)
Bayesian inference is based on Bayes' theorem, that is, the posterior probability

p(\theta | Y) = \frac{p(Y, \theta)}{p(Y)},    (19)
where \theta denotes the set of all unknowns, \theta = \{S, \alpha, \beta\}, and p(Y) = \int p(Y, \theta)\, d\theta. However, in most cases the computation of the marginal likelihood p(Y) is intractable. Thus, we resort to the variational inference methodology, which provides a family of distributions q(\theta) that approximate the posterior p(\theta | Y) [24]. The log-likelihood \ln p(Y) can be decomposed as

\ln p(Y) = \mathcal{L}(\theta) + \mathrm{KL}(q(\theta) \,\|\, p(\theta | Y)),    (20)

where

\mathcal{L}(\theta) = \int q(\theta) \ln \frac{p(Y, \theta)}{q(\theta)}\, d\theta    (21)

is a rigorous lower bound on \ln p(Y), and

\mathrm{KL}(q(\theta) \,\|\, p(\theta | Y)) = -\int q(\theta) \ln \frac{p(\theta | Y)}{q(\theta)}\, d\theta    (22)
is the Kullback–Leibler divergence between q(\theta) and the posterior distribution p(\theta | Y) [1]. The best approximation of the posterior distribution can be found by maximizing \mathcal{L}(\theta) or, equivalently, minimizing \mathrm{KL}(q(\theta) \,\|\, p(\theta | Y)), i.e.,

q(\theta) = \arg\max_{q(\theta)} \mathcal{L}(\theta) = \arg\min_{q(\theta)} \mathrm{KL}(q(\theta) \,\|\, p(\theta | Y)).    (23)
To solve this optimization problem, we use the mean-field approximation [23]

q(S, \alpha, \beta) = q(S)\, q(\alpha)\, q(\beta).    (24)
By applying this factorization to (23), the approximate posterior distributions of S, \alpha and \beta can be computed as

\ln q(S) = \langle \ln p(Y, S, \alpha, \beta) \rangle_{q(\alpha) q(\beta)} + \mathrm{const},    (25)

\ln q(\alpha) = \langle \ln p(Y, S, \alpha, \beta) \rangle_{q(S) q(\beta)} + \mathrm{const},    (26)

\ln q(\beta) = \langle \ln p(Y, S, \alpha, \beta) \rangle_{q(S) q(\alpha)} + \mathrm{const},    (27)
where \langle f(x) \rangle_{g(x)} = \int f(x) g(x)\, dx denotes the expected value of f(x) with respect to g(x). Substituting (18) into (25), after some rearrangement we find that the vector s_l obeys a complex Gaussian distribution, i.e.,

q(s_l) = \mathcal{CN}(s_l | \mu_l, \Sigma_l), \quad l = 1, 2, \ldots, L.    (28)

The mean \mu_l and covariance matrix \Sigma_l are given by

\mu_l = \langle \beta \rangle \Sigma_l \Phi_l^H y_l,    (29)

\Sigma_l = \left( \langle \beta \rangle \Phi_l^H \Phi_l + A \right)^{-1}.    (30)
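A minimal sketch of the update (29)-(30) (our own code; names are illustrative, and `alpha`, `beta` stand for the current expected hyperparameters):

```python
import numpy as np

def update_posterior_sl(Phi_l, y_l, alpha, beta):
    """One mean-field update of q(s_l) per Eqs. (29)-(30), given the current
    expected hyperparameters alpha (length-N vector) and beta (scalar)."""
    A = np.diag(alpha)
    Sigma_l = np.linalg.inv(beta * Phi_l.conj().T @ Phi_l + A)
    mu_l = beta * Sigma_l @ Phi_l.conj().T @ y_l
    return mu_l, Sigma_l
```

Here `Phi_l.conj().T` is the Hermitian transpose \Phi_l^H; the update is run once per task l in each iteration.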
According to (18) and (26), it can be shown that the posterior probability density of \alpha is

q(\alpha) = \prod_{i=1}^{N} \mathrm{Gamma}(\alpha_i | \tilde{a}, \tilde{b}_i),    (31)

where

\tilde{a} = a + L,    (32)

\tilde{b}_i = b + \sum_{l=1}^{L} \left\langle |s_{l,i}|^2 \right\rangle,    (33)
and s_{l,i} is the ith element of s_l. Similarly, we obtain

q(\beta) = \mathrm{Gamma}(\beta | \tilde{c}, \tilde{d}),    (34)

where

\tilde{c} = c + LM,    (35)

\tilde{d} = d + \sum_{l=1}^{L} \left\langle \| y_l - \Phi_l s_l \|_2^2 \right\rangle.    (36)

Utilizing the properties of the Gamma distribution, the required expected values of the hyperparameters can be calculated as

\langle \alpha_i \rangle = \tilde{a} / \tilde{b}_i,    (37)

\langle \beta \rangle = \tilde{c} / \tilde{d}.    (38)
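The hyperparameter updates (31)-(38) can be sketched as follows (our own code; the term \langle |s_{l,i}|^2 \rangle is evaluated with the Gaussian posterior as |\mu_{l,i}|^2 + \Sigma_l[i,i], and for \langle \| y_l - \Phi_l s_l \|_2^2 \rangle we keep only the mean term and drop the covariance trace, as a simplifying assumption):

```python
import numpy as np

def update_hyperparameters(mu, Sigma, Y, Phi, a=1e-6, b=1e-6, c=1e-6, d=1e-6):
    """Updates per Eqs. (31)-(38). `mu`, `Sigma`, `Y`, `Phi` are per-task lists.
    Returns the expected hyperparameters <alpha_i> and <beta>."""
    L = len(Y)
    M, N = Phi[0].shape
    a_tilde = a + L
    # <|s_{l,i}|^2> = |mu_{l,i}|^2 + Sigma_l[i, i]   (Eq. (33))
    b_tilde = b + sum(np.abs(m) ** 2 + np.real(np.diag(S))
                      for m, S in zip(mu, Sigma))
    c_tilde = c + L * M
    # Mean-only approximation of <||y_l - Phi_l s_l||^2>   (Eq. (36))
    d_tilde = d + sum(np.linalg.norm(y - P @ m) ** 2
                      for y, P, m in zip(Y, Phi, mu))
    return a_tilde / b_tilde, c_tilde / d_tilde   # Eqs. (37)-(38)
```

Alternating this with the s_l update of (29)-(30) yields the iterative procedure summarized below.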
Based on the above results, the procedure of the proposed algorithm can be summarized as follows:
1) Initialize the variables \mu_l, \Sigma_l, \alpha and \beta. Set the iteration count to 0.
2) According to (31)–(38) and the current estimates of \mu_l and \Sigma_l, update the posterior distributions of \alpha and \beta.
3) According to (28)–(30) and the current posterior distributions of \alpha and \beta, update the estimates of \mu_l and \Sigma_l.
4) Return to Step 2) until the iteration count reaches a specified value or the iteration has converged.

The lower bound \mathcal{L}(\theta) increases continuously as the iterative procedure is repeated, until it reaches a steady value. Hence, convergence can be judged by calculating

\mathcal{L} = \langle \ln p(Y | S, \beta) \rangle + \langle \ln p(S | \alpha) \rangle + \langle \ln p(\alpha) \rangle + \langle \ln p(\beta) \rangle - \langle \ln q(S) \rangle - \langle \ln q(\alpha) \rangle - \langle \ln q(\beta) \rangle.    (39)
After running the algorithm, the deepest several valleys of \langle \alpha \rangle indicate the positions of the frequencies contained in the signal. If a true frequency does not fall on the grid, the algorithm usually finds the nearest grid point. In the first step, \alpha may be initialized to a uniform nonzero value, and \beta is related to the noise level; \mu_l and \Sigma_l can then be obtained from \alpha and \beta. By optimizing some parts of the algorithm we can speed it up considerably. From the above steps, it can be seen that the computational burden of the proposed algorithm is concentrated in the matrix inversion in (30), which has a computational complexity of O(L N^3). A simple and crude approximation is to invert only the diagonal
elements of \Sigma_l and calculate only the trace terms. With this approximation the matrices \Sigma_l become identical, and the computational complexity is significantly reduced to O(N). Another way to reduce the amount of computation is to remove some columns from the matrices \Phi_l according to the \alpha_i during the iteration. The dimensions of the corresponding matrices and vectors gradually decrease as the pruning proceeds.
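A possible pruning step is sketched below (our own illustration; the paper does not fix a threshold rule, so the ratio-based criterion here is an assumption): grid columns whose expected \alpha_i has grown far beyond the smallest one correspond to components the posterior has effectively zeroed out, and can be dropped.

```python
import numpy as np

def prune_columns(Phi_list, alpha, ratio=1e4):
    """Keep only grid columns whose <alpha_i> is within `ratio` times the
    smallest value; returns the pruned matrices, alphas, and kept indices.
    (Threshold rule is our own assumption, not specified in the paper.)"""
    keep = np.flatnonzero(alpha < ratio * alpha.min())
    return [P[:, keep] for P in Phi_list], alpha[keep], keep
```

The surviving indices must be remembered so that the final frequency estimates can be mapped back onto the original grid.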
4 Simulation Results

We conduct experiments to verify the effect of multiple tasks relative to a single task. The three undersampling ratios are set to p = 7, q = 8 and r = 9. According to the analysis in Sect. 2, M = 33 and N = 100 are fixed. Additive complex white Gaussian noise at SNR = 20 dB is added to the measurements. The (pseudo)power spectra for L = 1 and L = 20 tasks are plotted. Meanwhile, despite the impracticality of random sampling, we construct measurement vectors from random samples in software for comparison with our method. In Fig. 2a, the frequencies are set to 0.13, 0.15 and 0.17, which lie exactly on the grid; the corresponding amplitudes are 0.4, 0.6 and 0.8, respectively. In Fig. 2b, the frequencies are set to 0.133, 0.152 and 0.169 with uniform amplitudes of 0.6. When only one task is utilized, the three frequencies
Fig. 2 Estimated power spectra with respect to different numbers of tasks (P versus f for L = 1 and L = 20; curves: proposed method, random sampling, true frequencies). a f_k = 0.13, 0.15, 0.17. b f_k = 0.133, 0.152, 0.169
Fig. 3 Probabilities of success for different SNRs (probability of success versus SNR, 10–30 dB; curves: proposed method, random sampling, MUSIC)
are not well distinguished. When L = 20, the performance is significantly improved. The result of random sampling is slightly better than that of the proposed method when L = 1. Next, we test the performance of the proposed method in different noisy environments. M = 33, N = 100 and L = 50 are fixed, and the SNR varies from 10 to 30 dB. The signals contain K = 3 frequency components with random amplitudes and random phase angles. The frequencies are randomly distributed in the interval (0, 1], and the amplitudes are randomly distributed in [0.5, 1]. For simplicity, we assume K is known, so the K frequencies corresponding to the K largest peaks in the power spectrum are taken as the estimates. If the deviations of all estimated frequencies from the true frequencies are within 0.5/N, we declare the trial successful. The probabilities of success are obtained from 500 trials for each SNR. The MUSIC algorithm using conventional Nyquist sampling is also considered for comparison; the same number of samples is used for our method and MUSIC. As shown in Fig. 3, there is little difference in the probabilities of success among the three methods. This illustrates that the proposed method achieves approximately the same performance as random sampling and MUSIC.
5 Conclusion

In this paper, we propose a deterministic sampling scheme to replace impractical random sampling. The sample data are collected by three sub-Nyquist channels with pairwise coprime undersampling ratios and are united by a multi-task model. An algorithm based on variational Bayesian inference is proposed to solve the multi-task model. Simulations show that the proposed method performs as well as MUSIC with conventional Nyquist sampling. We believe that this method can improve the practicality of CS in line spectral estimation.
References

1. C.M. Bishop, M.E. Tipping, Variational relevance vector machines, in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (Morgan Kaufmann Publishers Inc., 2000), pp. 46–53
2. S. Bourguignon, H. Carfantan, J. Idier, A sparsity-based method for the estimation of spectral lines from irregularly sampled data. IEEE J. Sel. Top. Signal Process. 1(4), 575–585 (2007)
3. E.J. Candès, C. Fernandez-Granda, Towards a mathematical theory of super-resolution. Commun. Pure Appl. Math. 67(6), 906–956 (2014)
4. E.J. Candès, J. Romberg, T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52(2), 489–509 (2006)
5. E.J. Candès, M.B. Wakin, An introduction to compressive sampling. IEEE Signal Process. Mag. 25(2), 21–30 (2008)
6. Y. Chi, L.L. Scharf, A. Pezeshki, A.R. Calderbank, Sensitivity to basis mismatch in compressed sensing. IEEE Trans. Signal Process. 59(5), 2182–2195 (2011)
7. D.L. Donoho, Compressed sensing. IEEE Trans. Inf. Theory 52(4), 1289–1306 (2006)
8. J. Fang, J. Li, Y. Shen, H. Li, S. Li, Super-resolution compressed sensing: an iterative reweighted algorithm for joint parameter learning and sparse signal recovery. IEEE Signal Process. Lett. 21(6), 761–765 (2014)
9. L. Hu, Z. Shi, J. Zhou, Q. Fu, Compressed sensing of complex sinusoids: an approach based on dictionary refinement. IEEE Trans. Signal Process. 60(7), 3809–3822 (2012)
10. S. Huang, H. Sun, L. Yu, H. Zhang, A class of deterministic sensing matrices and their application in harmonic detection. Circuits Syst. Signal Process. 35(11), 4183–4194 (2016)
11. S. Huang, H. Zhang, H. Sun, L. Yu, L. Chen, Frequency estimation of multiple sinusoids with three sub-Nyquist channels. Signal Process. 139, 96–101 (2017)
12. A. Jakobsson, P. Stoica, Combining Capon and APES for estimation of spectral lines. Circuits Syst. Signal Process. 19(2), 159–169 (2000)
13. S. Ji, D. Dunson, L. Carin, Multitask compressive sensing. IEEE Trans. Signal Process. 57(1), 92–106 (2009)
14. Y.C. Liang, Adaptive frequency estimation of sinusoidal signals in colored non-Gaussian noises. Circuits Syst. Signal Process. 19(6), 517–533 (2000)
15. D. Malioutov, M. Çetin, A.S. Willsky, A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans. Signal Process. 53(8), 3010–3022 (2005)
16. P. Pal, P.P. Vaidyanathan, Coprime sampling and the MUSIC algorithm, in Digital Signal Processing Workshop and IEEE Signal Processing Education Workshop (2011), pp. 289–294
17. S. Qin, Y.D. Zhang, M.G. Amin, Generalized coprime array configurations for direction-of-arrival estimation. IEEE Trans. Signal Process. 63(6), 1–1 (2015)
18. R.O. Schmidt, Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas Propag. 34(3), 276–280 (1986)
19. W. Si, X. Qu, Z. Qu, P. Zhao, Off-grid DOA estimation via real-valued sparse Bayesian method in compressed sensing. Circuits Syst. Signal Process. 35(10), 1–17 (2016)
20. G. Tang, B.N. Bhaskar, P. Shah, B. Recht, Compressed sensing off the grid. IEEE Trans. Inf. Theory 59(11), 7465–7490 (2013)
21. M.E. Tipping, Sparse Bayesian learning and the relevance vector machine. J. Mach. Learn. Res. 1, 211–244 (2001)
22. J.A. Tropp, J.N. Laska, M.F. Duarte, J.K. Romberg, R.G. Baraniuk, Beyond Nyquist: efficient sampling of sparse bandlimited signals. IEEE Trans. Inf. Theory 56(1), 520–544 (2010)
23. D.G. Tzikas, A.C. Likas, N.P. Galatsanos, The variational approximation for Bayesian inference. IEEE Signal Process. Mag. 25(6), 131–146 (2008)
24. D.G. Tzikas, A.C. Likas, N.P. Galatsanos, Variational Bayesian sparse kernel-based blind image deconvolution with Student's-t priors. IEEE Trans. Image Process. 18(4), 753–764 (2009)
25. P.P. Vaidyanathan, P. Pal, Sparse sensing with co-prime samplers and arrays. IEEE Trans. Signal Process. 59(2), 573–586 (2011)
26. L. Wang, L. Zhao, G. Bi, C. Wan, Novel wideband DOA estimation based on sparse Bayesian learning with Dirichlet process priors. IEEE Trans. Signal Process. 64(2), 1–1 (2016)
27. Q. Wu, Y.D. Zhang, M.G. Amin, B. Himed, Complex multitask Bayesian compressive sensing, in IEEE International Conference on Acoustics, Speech and Signal Processing (2014), pp. 3375–3379
28. Z. Yang, L. Xie, On gridless sparse methods for multi-snapshot direction of arrival estimation. Circuits Syst. Signal Process. 36(8), 3370–3384 (2017)
29. Z. Yang, C. Zhang, L. Xie, Robustly stable signal recovery in compressed sensing with structured matrix perturbation. IEEE Trans. Signal Process. 60(9), 4658–4671 (2012)
30. H. Zhang, G. Bi, W. Yang, S.G. Razul, C.M.S. See, IF estimation of FM signals based on time-frequency image. IEEE Trans. Aerosp. Electron. Syst. 51(1), 326–343 (2015)
31. H. Zhang, L. Yu, G.S. Xia, Iterative time-frequency filtering of sinusoidal signals with updated frequency estimation. IEEE Signal Process. Lett. 23(1), 139–143 (2016)
32. H. Zhu, G. Leus, G.B. Giannakis, Sparsity-cognizant total least-squares for perturbed compressive sampling. IEEE Trans. Signal Process. 59(5), 2002–2016 (2011)