J Sci Comput DOI 10.1007/s10915-016-0350-2
Structure Preserving Preconditioners for Image Deblurring

Pietro Dell'Acqua¹ · Marco Donatelli² · Claudio Estatico³ · Mariarosa Mazza⁴
Received: 31 March 2016 / Revised: 10 November 2016 / Accepted: 23 December 2016 © Springer Science+Business Media New York 2017
Abstract Regularizing preconditioners for accelerating the convergence of iterative regularization methods without spoiling the quality of the approximated solution have been extensively investigated over the last twenty years. Several strategies have been proposed for defining proper preconditioners. Usually, in methods for image restoration, the preconditioner is chosen to be Block Circulant with Circulant Blocks (BCCB), because this structure can be efficiently exploited by the Fast Fourier Transform (FFT). Nevertheless, it is well known that, for ill-conditioned problems, BCCB preconditioners cannot provide a strong clustering of the eigenvalues. Moreover, in order to get an effective preconditioner, it is crucial to preserve the structure of the coefficient matrix, which, in the case of the image deblurring problem, depends on the boundary conditions imposed on the imaging model. Therefore, we propose a technique to construct a preconditioner with the same structure as the blurring matrix related to the restoration problem at hand. Like the BCCB preconditioner, its construction requires only two FFTs. The presented preconditioning strategy represents a generalization and an improvement with respect to both
Pietro Dell'Acqua (corresponding author): [email protected]
Marco Donatelli: [email protected]
Claudio Estatico: [email protected]
Mariarosa Mazza: [email protected]
1. Dipartimento di Ingegneria, Scienze Informatiche e Matematica, Università degli Studi dell'Aquila, Via Vetoio, Loc. Coppito, 67010 L'Aquila, Italy
2. Dipartimento di Scienza e Alta Tecnologia, Università degli Studi dell'Insubria, Via Valleggio 11, 22100 Como, Italy
3. Dipartimento di Matematica, Università degli Studi di Genova, Via Dodecaneso 35, 16146 Genova, Italy
4. Max-Planck Institute for Plasma Physics, Boltzmannstraße 2, 85748 Garching, Germany
circulant and structured preconditioning available in the literature. The technique is further extended to provide a non-stationary preconditioning in the same spirit as a recent proposal for BCCB matrices. Numerical results show the importance of preserving the matrix structure from the point of view of both restoration quality and robustness with respect to the choice of the regularization parameter.

Keywords Regularization · Preconditioning · Toeplitz matrices
1 Introduction

Let us consider the linear system

$$Ax = b, \qquad (1)$$
where A ∈ R^{N×N} and x, b ∈ R^N. In the image deblurring context, the matrix A represents the blurring operator, built according to the Point Spread Function (PSF) and the Boundary Conditions (BCs), x is an approximation of the true image of an unknown object, and b is the detected image, affected by blur and corrupted by noise η ∈ R^N, i.e. b = Ax + η, where x here denotes the true image in R^N. If the true image and the blurred image are of n × n pixels, then N = n² and the vectors x and b represent the corresponding vectorizations [21]. Image deblurring consists in computing an approximation of the true image by means of an appropriate solution of (1). Since the singular values of A gradually approach zero without a significant gap, A is very ill-conditioned and may be singular. Linear systems of equations with a matrix of this kind are commonly referred to as linear discrete ill-posed problems and require regularization [20].

The special structure of the square blurring matrix A depends on the properties of the basic blurring model, i.e. the PSF and the BCs. In this work we assume that the blurring model is space-invariant and that the BCs are defined as affine relations between the unknowns inside the Field Of View (FOV). BCs try to capture, and to include in the deblurring model, the unknown behaviour of the signal outside the FOV in which the detection is made [21]. Indeed, the information contained in the detected image b inside the FOV is incomplete and does not allow one to restore the true image even in the (unrealistic) noiseless case. Among the BCs present in the literature, we consider the following ones. In the Zero (Dirichlet) BCs model, the image outside the FOV is supposed to be null, i.e. zero pixel-valued. In the Periodic BCs model, the image inside the FOV is periodically repeated outside the FOV. In the Reflective BCs model, the image inside the FOV is reflected outside the FOV, as if there were a vertical mirror along each edge. That way, the pixel values across the boundary are extended so that the continuity of the image is preserved at the boundary. In the Anti-Reflective BCs model, the image inside the FOV is anti-reflected outside the FOV. That way, the pixel values across the boundary are extended so that the continuity of the image and the continuity of the normal derivatives are both preserved at the boundary.

Moving along the BCs listed here, the structure of the blurring matrix A changes, becoming more involved. As we will see in the rest of the paper, Zero BCs yield a Block Toeplitz with Toeplitz Blocks (BTTB) matrix and Periodic BCs yield a Block Circulant with Circulant Blocks (BCCB) matrix, while Reflective and Anti-Reflective BCs give rise to more complex matrix structures [29]. The Zero BCs can be useful for some applications in astronomy, where an empty dark space surrounds a well-located object. On the other hand, they give rise to strong ringing effects close to the boundary of the restored image in other classical imaging applications, where
the background is not uniform. Periodic BCs are computationally favourable, since the matrix A can be easily diagonalized by the Fast Fourier Transform (FFT), but in the restoration process they may also generate artifacts along the boundaries. The other two BCs are able to reduce these ringing effects. In particular, Anti-Reflective BCs can be considered the most precise, since no discontinuity in either the image or its normal derivatives is artificially introduced by the imposed BCs. In all these cases, the matrix-vector product can be computed in O(N log N) operations by FFT, using a proper padding of the vector in agreement with the imposed BCs and then performing a circular convolution of double size. In some cases, Reflective and Anti-Reflective BCs are even cheaper, since they require only real operations instead of complex ones [2], without needing any padding.

Regardless of the imposed BCs, if we need to deal with positive (semi-)definite matrices, instead of the system (1) we can solve the system of the normal equations

$$A^H A x = A^H b, \qquad (2)$$
where A^H is the conjugate transpose of A. This choice allows us to use many iterative methods, such as the Conjugate Gradient (CG) and its generalizations. Moreover, all iterative methods, when applied to the normal equations (2), become more stable, i.e. less sensitive to the noise in the data. Unfortunately, in solving (2) instead of (1), the rate of convergence slows down. In this respect, the conventional technique to speed up the convergence is to consider the preconditioned system

$$D A^H A x = D A^H b, \qquad (3)$$
where D ∈ R^{N×N} is the so-called preconditioner, whose role is to suitably approximate the (generalized) inverse of the normal matrix A^H A [26]. At a closer look, the classical preconditioning scenario is somewhat curious, since the preconditioner D has to speed up the slowdown produced by A^H. On these grounds, in [8] we proposed a new technique that uses a single preconditioning operator directly applied to the original system (1). The new preconditioner, called the reblurring matrix Z according to the terminology introduced in [12], leads to the new preconditioned system

$$Z A x = Z b. \qquad (4)$$
As pointed out in [8], the aim of the preconditioner Z ∈ R^{N×N} is to allow iterative methods to become more stable (as is usually obtained through the normal equations involving A^H) without slowing down the convergence (so that no subsequent accelerating operator D is needed), especially in the so-called signal space, i.e. the subspace less sensitive to the data noise. We introduced two different mathematical techniques to build the preconditioner Z. The first is based on a coarse version of the PSF, the second on a regularized (i.e. filtered) approximation of the conjugate of the inverse of the eigenvalues of A, approximated by using the generating function of the system matrix A. In both cases, in [8] we apply these techniques to compute a BCCB matrix Z, regardless of the BCs used in the model to define the system matrix A. Basically, the two techniques of [8] allow the preconditioner Z to inherit some information on the spectrum of A, but no information about its structure is used. Moreover, it is well known from the seminal results of [30] that a BCCB preconditioner of a BTTB matrix A cannot be optimal (i.e. such that the action of the preconditioner on A gives rise to a proper cluster at 1 of the singular values), even for an optimal (i.e. "almost exact") approximation of its generating function; this implies that the BCCB preconditioner Z used in [8] will never be optimal. On the other hand, a BTTB preconditioner of a BTTB
matrix A (i.e. a preconditioner with the same structure as the system matrix) is optimal even when the spectrum of the preconditioner is a poor approximation of the spectrum of the system matrix [28] (in particular, it is enough that all the zeros of the generating function are exactly located with the same multiplicity, regardless of the other values of the generating function). Based on these considerations, in this paper we improve the reblurring technique based on the approximation of the conjugate of the eigenvalues of A by defining a class of preconditioners Z endowed with the same structure as the system matrix A. We call this strategy structure preserving reblurring preconditioning. To the best of our knowledge, the only papers in which the structure of A is preserved in the preconditioner are [9,18,25], for Anti-Reflective, Zero and Reflective BCs, respectively. In all three cases, the reported results heavily depend on the symmetry properties of the PSF. In particular, [18] considers a symmetric PSF and analyses preconditioned MR-II (a variant of the minimal residual method, sometimes also called conjugate residual), building the preconditioner by the well-known circulant embedding of Toeplitz matrices. On the other hand, both [9,25] show that the optimal (in the sense of the Frobenius norm) preconditioner in a proper algebra diagonalized by a fixed real transform is associated with the symmetrized version of the original PSF. Such a preconditioning technique works well when the PSF is close to symmetric, while it performs poorly for strongly non-symmetric PSFs. In this paper we propose a general structure preserving preconditioning technique which can be used for any type of space-invariant blur, including strongly non-symmetric PSFs, and any kind of BCs, also those not based on affine relations, such as the Synthetic BCs proposed in [16].
Our proposal belongs to the class of reblurring preconditioning (4), which was shown in [8] to be superior to the standard preconditioning (3) studied in [26], and it does not require any padding. We introduce the idea by giving a simple and complete scheme to compute an approximate and regularized version of the inverse of the generating function of the system matrix A, which requires only the application of two FFTs, in O(N log N) operations, where N = n² is the dimension of the image. Moreover, in the special case of Zero BCs, some relationships with the preconditioning strategy proposed in [18] are studied in detail.

All regularizing preconditioners require the estimation of a further filtering of the spectrum [14,15]. A possible alternative is a non-stationary sequence of thresholding parameters, as proposed in [11]. Since the proposal in [11] can be interpreted as a non-stationary variant of a specific reblurring preconditioner among those investigated in [8], in this paper we also study a possible non-stationary generalization of our structured preconditioner. In particular, similarly to the strategy adopted in [6] for low-rank approximations, we estimate the thresholding parameter at every iteration by resorting to a BCCB approximation, which allows a simple and computationally cheap solution of the corresponding non-linear problem. Finally, according to our numerical tests, the use of a structured preconditioner is useful both for obtaining a better restoration and for a more robust choice of the regularization parameters with respect to classical BCCB preconditioners.

The paper is organized as follows. Section 2 reviews the structure of the blurring matrix in the case of space-invariant blur. Our structured preconditioner is introduced in Sect. 3, where similarities and differences with the approach in [18] are studied in the case of symmetric Toeplitz matrices and the Tikhonov filter.
A non-stationary version of the structured preconditioner, inspired by the proposal in [11], is described in Sect. 4. Section 5 collects some numerical results and related comments. Finally, Sect. 6 is devoted to conclusions and a discussion of future work.
2 Structure of the Blurring Matrix

The construction of the blurring matrix A ∈ R^{n²×n²} of the image deblurring problem (1) is based on two ingredients: the PSF and the BCs enforced in the discretization. As already sketched in the Introduction, the latter choice gives rise to different types of structured matrices. For notational simplicity we consider a square PSF H ∈ R^{n×n}. We suppose that the position of the PSF centre is known. Thus H can be depicted as

$$
H=\begin{pmatrix}
h_{-m_1,-m_2} & \cdots & h_{-m_1,0} & \cdots & h_{-m_1,n_2}\\
\vdots & \ddots & \vdots & & \vdots\\
 & h_{-1,-1} & h_{-1,0} & h_{-1,1} & \\
h_{0,-m_2}\ \cdots & h_{0,-1} & h_{0,0} & h_{0,1} & \cdots\ h_{0,n_2}\\
 & h_{1,-1} & h_{1,0} & h_{1,1} & \\
\vdots & & \vdots & \ddots & \vdots\\
h_{n_1,-m_2} & \cdots & h_{n_1,0} & \cdots & h_{n_1,n_2}
\end{pmatrix}_{n\times n},
$$
where h_{0,0} is the central coefficient and m_1 + n_1 + 1 = m_2 + n_2 + 1 = n. Given the pixels h_{j_1,j_2} of the PSF, it is possible to associate with them the so-called generating function f : R² → C as follows:

$$
f(x_1,x_2)=\sum_{j_1=-m_1}^{n_1}\ \sum_{j_2=-m_2}^{n_2} h_{j_1,j_2}\, e^{\hat\imath (j_1 x_1 + j_2 x_2)} = \sum_{j_1,j_2=-n+1}^{n-1} h_{j_1,j_2}\, e^{\hat\imath (j_1 x_1 + j_2 x_2)}, \qquad (5)
$$
where ı̂² = −1 and with the assumption that h_{j_1,j_2} = 0 if the corresponding pixel is not detected, i.e. if the element h_{j_1,j_2} does not belong to H [10,18]. Note that the h_{j_1,j_2} are the Fourier coefficients of f ∈ Π_{n−1}, where Π_k = span{e^{ı̂(j_1x_1+j_2x_2)}, j_1, j_2 = −k, ..., k}, so that the generating function f contains the same information as H. The choice of the range of the indexes in the second summation in (5), although some elements of the sum are zeros, is useful to obtain a simple form for the BTTB matrix in (6).

The simplest boundary conditions are the Zero BCs, which assume that the signal is null outside the FOV. This is a good choice when we deal with images having a black background, occurring for instance in astronomical applications. In the other cases, such BCs usually give rise to heavy ringing phenomena near the boundaries. The resulting blurring matrix is a BTTB matrix. We will use the symbol T to denote this class of matrices. In detail, we have

$$
T_n(f)=\begin{pmatrix}
T_0 & T_{-1} & \cdots & T_{-n+1}\\
T_1 & \ddots & \ddots & \vdots\\
\vdots & \ddots & \ddots & T_{-1}\\
T_{n-1} & \cdots & T_1 & T_0
\end{pmatrix}_{n^2\times n^2}
\quad\text{with}\quad
T_k=\begin{pmatrix}
h_{k,0} & h_{k,-1} & \cdots & h_{k,-n+1}\\
h_{k,1} & \ddots & \ddots & \vdots\\
\vdots & \ddots & \ddots & h_{k,-1}\\
h_{k,n-1} & \cdots & h_{k,1} & h_{k,0}
\end{pmatrix}_{n\times n}, \qquad (6)
$$
for k = −n+1, ..., n−1. We highlight that no fast trigonometric transforms are available for diagonalizing BTTB matrices. This represents an important drawback of using Zero BCs with filtering methods like classical Tikhonov. On the contrary, for other BCs such transforms are available. This is the case for Periodic BCs, for which the FFT serves this purpose [21]. These BCs assume that the signal repeats itself periodically in both directions. So, considering for simplicity the 1D case, they impose that

$$x_{1-j} = x_{n+1-j} \quad\text{and}\quad x_{n+j} = x_j, \qquad j = 1,\dots,p,$$
where p is the number of pixels outside the FOV that are taken into account. For multidimensional problems it is enough to apply the same assumption in every direction. The corresponding 2D blurring matrix A is a BCCB matrix. We will use the symbol C to denote this class of matrices. Clearly, it happens quite rarely that the image is periodic outside the FOV, so we often have ringing effects in the restoration. The search for more accurate models has led to the definition of quite sophisticated BCs. A step in this direction is represented by Reflective BCs, which assume that the signal outside the FOV is a reflection of the signal inside the FOV. Formally, in the 1D case, these BCs impose that

$$x_{1-j} = x_j \quad\text{and}\quad x_{n+j} = x_{n+1-j}, \qquad j = 1,\dots,p.$$
The corresponding blurring matrix A is Block Toeplitz-plus-Hankel with Toeplitz-plus-Hankel blocks, and it can be diagonalized by the Discrete Cosine Transform (DCT) when the PSF is symmetric [25]. We will use the symbol R to denote this class of matrices. Anti-Reflective BCs assume that the signal outside the FOV is an antireflection of the signal inside the FOV. Formally, in the 1D case, these BCs impose that

$$x_{1-j} = 2x_1 - x_{j+1} \quad\text{and}\quad x_{n+j} = 2x_n - x_{n-j}, \qquad j = 1,\dots,p.$$
The corresponding blurring matrix A is Block Toeplitz-plus-Hankel with Toeplitz-plus-Hankel blocks plus a low-rank correction, and it can be diagonalized by the Anti-Reflective Transform (ART) when the PSF is symmetric [29]. We will use the symbol AR to denote this class of matrices. For both Reflective and Anti-Reflective BCs a fast transform is available, but only if the PSF is quadrantally symmetric, i.e. symmetric in both the horizontal and vertical directions. Other BCs, like those proposed in [10,16,31], could be used, but we do not go into further details for uniformity of presentation, even if our approach can be straightforwardly applied to them too. In the 2D case, the same assumption on the image outside the FOV is applied first in one direction and then in the other. For a detailed discussion of BCs and the associated blurring matrices we refer to [13,21]. In summary, for the presented BCs we have

Zero BCs: A = T_n(f),
Periodic BCs: A = C_n(f) = T_n(f) + B_n^C(f),
Reflective BCs: A = R_n(f) = T_n(f) + B_n^R(f),
Anti-Reflective BCs: A = AR_n(f) = T_n(f) + B_n^{AR}(f).

We notice that in all four cases the blurring matrix A ∈ R^{n²×n²} has a BTTB (Toeplitz) structure T_n(f), given by the shift-invariance of the continuous operator, plus a correction B_n^X(f), X = C, R, AR, depending on the BCs. Even if, as said, for some BCs fast trigonometric transforms are lacking, for all of them the matrix-vector multiplication can always be computed efficiently by exploiting the structure A = T_n(f) + B_n(f). Indeed, the multiplication can be carried out by means of FFTs (accessing only the PSF) on an appropriately padded image of larger size. In conclusion, we employ the unified notation A = M_n(f), where M can be any of the classes of n² × n² matrices just introduced (i.e. T, C, R, AR). This notation highlights the two crucial ingredients that form A: the blurring phenomenon associated with the PSF, described by f, and the imposed BCs, represented by M.
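The four boundary models differ only in how they extend the signal outside the FOV. The following NumPy sketch (our own illustration, not code from the paper) builds the corresponding 1D extensions for a pad width p; the in-line comments repeat the defining relations above.

```python
import numpy as np

def pad_1d(x, p, bc):
    """Extend a 1D signal by p samples on each side according to the
    boundary-condition model (our own illustration; indices are 0-based)."""
    x = np.asarray(x, dtype=float)
    if bc == "zero":              # x_{1-j} = 0             and x_{n+j} = 0
        left, right = np.zeros(p), np.zeros(p)
    elif bc == "periodic":        # x_{1-j} = x_{n+1-j}     and x_{n+j} = x_j
        left, right = x[-p:], x[:p]
    elif bc == "reflective":      # x_{1-j} = x_j           and x_{n+j} = x_{n+1-j}
        left, right = x[:p][::-1], x[-p:][::-1]
    elif bc == "antireflective":  # x_{1-j} = 2x_1 - x_{j+1} and x_{n+j} = 2x_n - x_{n-j}
        left = (2.0 * x[0] - x[1:p + 1])[::-1]
        right = 2.0 * x[-1] - x[-2:-p - 2:-1]
    else:
        raise ValueError("unknown boundary condition: " + bc)
    return np.concatenate([left, x, right])
```

For instance, with x = (1, 2, 3, 4, 5) and p = 2, Anti-Reflective BCs produce (−1, 0, 1, 2, 3, 4, 5, 6, 7), which continues the signal linearly across both edges, preserving value and slope.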
3 Structure Preserving Reblurring Preconditioning

As said in the Introduction, iterative regularization methods usually show a low convergence rate. A way to remedy this drawback is to apply a (regularized) preconditioner D ∈ R^{n²×n²} and then solve the linear system (3) equivalent to (2). With the aim of stabilizing iterative methods and at the same time speeding up the convergence rate, in [8] we introduced a single preconditioning matrix Z ∈ R^{n²×n²} and replaced the preconditioned normal equations with the reblurring equation (4). This leads to reformulating iterative methods such as the Landweber method in the new preconditioning context, that is, to replacing the preconditioned iterative scheme

$$x_{k+1} = x_k + \tau D A^H (b - A x_k)$$

with

$$x_{k+1} = x_k + \tau Z (b - A x_k), \qquad (7)$$
where τ is a positive relaxation parameter. In the following we fix τ = 1, applying an implicit rescaling of the preconditioned system matrix Z A. Although Z A is in general not symmetric, the convergence of the modified Landweber method (7) can still be assured [8]. A way to build Z is to apply a coarsening technique to the PSF of the problem (see [8]). Another more common way is to use filtering strategies. In more detail, in the case of Periodic BCs, Z is built as the BCCB matrix whose eigenvalues v_j are obtained from the eigenvalues λ_j of A by some filtered inversion [14]. In particular, two very popular filters are the Hanke–Nagy–Plemmons (HNP) filter [19], defined as

$$
v_j=\begin{cases}\bar{\lambda}_j/|\lambda_j|^2, & \text{if } |\lambda_j|\ge\zeta,\\ \lambda_j, & \text{if } |\lambda_j|<\zeta,\end{cases} \qquad j=1,\dots,n^2, \qquad (8)
$$

and the Tikhonov filter, defined as

$$
v_j=\frac{\bar{\lambda}_j}{|\lambda_j|^2+\alpha}, \qquad j=1,\dots,n^2, \qquad (9)
$$

where α and ζ are positive regularization parameters. Clearly, in the case of Periodic BCs, since we are in the circulant algebra, the classical preconditioning approach (based on D) and the new one (based on Z) coincide. However, for other BCs the Z variant shows better performance and higher stability (with respect to the choice of the regularization parameters, e.g. α for Tikhonov) than standard preconditioning (see [8]). Thus, for any BCs, Z is built as the BCCB matrix obtained by applying a filtered inversion to the same PSF (i.e. the same generating function f) that gives rise to the matrix A. In other words, if we call g such a filtered function, recalling the notation introduced in Sect. 2, we have that A = M_n(f) and Z = C_n(g). In summary, building on the work done in the circulant setting [8], we want to propose a general algorithm (working for any PSF and any BCs) for constructing a structure preserving preconditioner Z with the following strengths with respect to D:

a) D A^H loses the matrix structure, while Z preserves it;
b) the computational cost of the Z variant remains essentially the same as that of the standard iterative method to which the variant is applied, while the cost of a preconditioned method based on the normal equations is in general higher than the standard one;
c) for all filters, the Z matrix gives rise to higher stability than D A^H, i.e. iterative methods compute a good restoration even when the regularization parameters ζ and α are very small.
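In the circulant case the filters (8) and (9) act entry-wise on the eigenvalues, which makes them easy to sketch in NumPy (function names are ours; for real spectra the conjugates reduce to plain values):

```python
import numpy as np

def hnp_filter(lam, zeta):
    """HNP filtered inversion (8): invert the eigenvalues above the
    threshold zeta, leave the small (noise-dominated) ones untouched."""
    lam = np.asarray(lam, dtype=complex)
    return np.where(np.abs(lam) >= zeta, np.conj(lam) / np.abs(lam) ** 2, lam)

def tikhonov_filter(lam, alpha):
    """Tikhonov filtered inversion (9) with regularization parameter alpha > 0."""
    lam = np.asarray(lam, dtype=complex)
    return np.conj(lam) / (np.abs(lam) ** 2 + alpha)
```

Both maps approximate 1/λ_j where the eigenvalue is large, while damping (HNP) or smoothly shrinking (Tikhonov) the inversion where it is small.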
Regardless of the BCs adopted to build the blurring matrix A, in [8] Z is always a BCCB matrix. The main new idea of this paper is to build Z so that it inherits the same matrix structure as A.

Let F^{[1]} be the discrete Fourier matrix defined by F^{[1]}_{i,j} = (1/√n) e^{−ı̂ 2πij/n}, for i, j = 0, ..., n−1, with ı̂ = √−1. The two-dimensional Fourier matrix is defined by tensor product as F^{[2]} = F^{[1]} ⊗ F^{[1]}. The eigenvalues of C_n(f), the n² × n² BCCB matrix associated with the PSF H, can be computed by applying F^{[2]} to the first column of C_n(f), i.e., by means of the two-dimensional FFT of a proper permutation of H; see [21] for details. Moreover, the n² eigenvalues of C_n(f) can also be written as

$$c_{i,j} = f\!\left(\frac{2\pi i}{n}, \frac{2\pi j}{n}\right), \qquad i,j = 0,\dots,n-1, \qquad (10)$$
where f is the generating function (5); see [7]. We can now regularize these eigenvalues. Let us denote by v_{i,j} the values obtained by applying the Tikhonov filter (9) to the c_{i,j}, in place of the λ_j. We simply consider the Tikhonov filter (9) because it usually gives very good (and often the best) numerical results; however, any other filter could be applied as well. On the grounds of the theory of the eigenvalue decomposition of BCCB matrices, the values v_{i,j} can be considered as a sampling of the function

$$g(x_1,x_2)=\sum_{j_1,j_2=-\frac{n+1}{2}}^{\frac{n+1}{2}} \beta_{j_1,j_2}\, e^{\hat\imath(j_1x_1+j_2x_2)} \qquad (11)$$

at the grid points Ω_n = {(2πi/n, 2πj/n) | i, j = 0, ..., n−1} (see [10]). Namely,

$$g\!\left(\frac{2\pi i}{n},\frac{2\pi j}{n}\right) := v_{i,j} = \frac{\bar{c}_{i,j}}{|c_{i,j}|^2+\alpha} = \frac{\bar{f}\!\left(\frac{2\pi i}{n},\frac{2\pi j}{n}\right)}{\left|f\!\left(\frac{2\pi i}{n},\frac{2\pi j}{n}\right)\right|^2+\alpha}. \qquad (12)$$
Note that g is a regularized approximation of the inverse of f on Ω_n. The function g is univocally identified by the n² interpolation conditions (12), for i, j = 0, ..., n−1, and its coefficients β_{j_1,j_2} can be computed by means of a two-dimensional IFFT of the values g(2πi/n, 2πj/n). We will denote by H̃ the mask of the coefficients β_{j_1,j_2}, j_1, j_2 = −(n+1)/2, ..., (n+1)/2. It is worth observing that up to this point the described technique is just like the one proposed in [8]. The main difference is that in [8] the function g is used as the symbol of a BCCB matrix, while here we combine g with the BCs of the problem, defining an n² × n² matrix Z := M_n(g) that has the same structure as the original system matrix A = M_n(f) of our blurring model (1). Note that, since it works directly on the mask H̃, our approach can also be applied to BCs for which the matrix Z is not explicitly constructed, like the Synthetic BCs of [16]. The following algorithm summarizes how to build our structure preserving preconditioner.
Algorithm 1 Structure preserving preconditioner

Input: H, BCs
1. get {c_{i,j}}_{i,j=0}^{n−1} of (10) by computing an FFT of H;
2. get {v_{i,j}}_{i,j=0}^{n−1} of (12) by applying the Tikhonov filter (9) to the c_{i,j}, i.e. v_{i,j} = \bar{c}_{i,j}/(|c_{i,j}|² + α);
3. get the mask H̃ of the coefficients β_{j_1,j_2} of g of (11) by computing an IFFT of {v_{i,j}}_{i,j=0}^{n−1};
4. generate the matrix Z := M_n(g) from the coefficient mask H̃ and the BCs.
Output: Z
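Steps 1–3 of Algorithm 1 amount to one FFT, an entry-wise filter and one IFFT. A minimal NumPy sketch follows (our own illustration with hypothetical names; step 4 is omitted because assembling Z = M_n(g) depends on the chosen BCs):

```python
import numpy as np

def structure_preserving_mask(psf, center, n, alpha):
    """Steps 1-3 of Algorithm 1 (a sketch; step 4, assembling Z = M_n(g)
    for the chosen BCs, is structure dependent and omitted).

    psf    : 2D PSF array H, with the centre h_{0,0} at index `center`
    n      : side length of the image
    alpha  : Tikhonov regularization parameter
    """
    # Step 1: embed H in an n x n array and circularly shift it so that
    # h_{0,0} sits at position (0,0); one FFT then gives the eigenvalues
    # c_{i,j} of C_n(f), cf. (10).
    big = np.zeros((n, n))
    big[:psf.shape[0], :psf.shape[1]] = psf
    big = np.roll(big, (-center[0], -center[1]), axis=(0, 1))
    c = np.fft.fft2(big)
    # Step 2: Tikhonov-filtered inversion (9) of the eigenvalues, cf. (12).
    v = np.conj(c) / (np.abs(c) ** 2 + alpha)
    # Step 3: one IFFT recovers the coefficient mask of g, cf. (11).
    return np.fft.ifft2(v)
```

For a quadrantally symmetric PSF the eigenvalues are real, the filtered values are real and the resulting mask is real, consistently with the DCT reformulation mentioned below.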
Throughout, we refer to the circulant preconditioning technique proposed in [8] as Z_circ, while our structure preserving preconditioner is denoted by Z_struct. Note that when the PSF is quadrantally symmetric, i.e. h_{i,j} = h_{±i,±j}, the generating function f is a cosine polynomial thanks to the Euler formula. Therefore it is worthwhile to look for the g in (12) as a cosine polynomial, which implies that Algorithm 1 can be reformulated by replacing the FFT with the DCT.

Furthermore, it is interesting to note that, by appropriately changing the filter at step 2, Algorithm 1 can be used for creating a structured preconditioner D (instead of Z). For instance, considering the linear system (3), D has the role of preconditioning A^H A, so a filter that can be used for creating D is the following: v_{i,j} = 1/(|c_{i,j}|² + α). However, when applied to the Landweber method, such a preconditioning strategy is worse than the Z variant. If we consider the Conjugate Gradient for Least Squares (CGLS), its preconditioned version (PCGLS) is associated with the linear system

$$D^{\frac12} A^H A D^{\frac12} y = D^{\frac12} A^H b, \qquad y = D^{-\frac12} x,$$

where D is a positive definite matrix that has the role of preconditioning A^H A. Therefore a filter that can be used for creating D^{1/2} is v_{i,j} = 1/(|c_{i,j}|² + α)^{1/2}. It is known that classical circulant preconditioning applied to CGLS is able to speed up the method, but usually it negatively affects the restoration accuracy. As shown by the numerical evidence reported in this work, structured preconditioners make it possible to overcome this issue.
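To fix ideas, in the periodic (BCCB) case both A = C_n(f) and Z = C_n(g) act diagonally in Fourier space, so the whole reblurred Landweber iteration (7) with τ = 1 fits in a few lines of NumPy. The 1D test problem below is our own illustration, not an experiment from the paper:

```python
import numpy as np

# 1D periodic test problem: A = C_n(f) and Z = C_n(g) are both diagonalized
# by the FFT, so each Landweber step (7) costs two FFTs.
n = 64
x_true = np.zeros(n)
x_true[20:40] = 1.0                                  # piecewise-constant "image"
psf = np.exp(-0.5 * (np.arange(-5, 6) / 2.0) ** 2)   # truncated Gaussian PSF
psf /= psf.sum()
sym = np.zeros(n)
sym[:11] = psf
sym = np.roll(sym, -5)                               # centre h_0 at index 0
lam = np.fft.fft(sym)                                # eigenvalues of C_n(f)
b = np.real(np.fft.ifft(lam * np.fft.fft(x_true)))   # blurred (noiseless) data

alpha = 1e-3
v = np.conj(lam) / (np.abs(lam) ** 2 + alpha)        # Tikhonov filter (9): eigenvalues of Z

x = np.zeros(n)
for _ in range(50):                                  # reblurred Landweber (7), tau = 1
    residual = b - np.real(np.fft.ifft(lam * np.fft.fft(x)))
    x = x + np.real(np.fft.ifft(v * np.fft.fft(residual)))
```

Each Fourier mode contracts towards the solution with factor α/(|λ|² + α), so the well-conditioned (signal-space) modes are recovered in a few iterations while the noise-dominated ones are barely touched.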
3.1 Comparison with the Hanke–Nagy Preconditioner

For Zero BCs and a symmetric PSF we can establish a strict link between the proposed structure preserving preconditioner Z_struct and the Toeplitz preconditioning proposed by Hanke and Nagy in [18] for real symmetric Toeplitz systems. For simplicity, in analyzing similarities and differences between these two techniques, we consider the one-dimensional case. Nevertheless, the multidimensional extension is straightforward, as explained in more detail at the end of the section. Our aim in this subsection is to study, in a sense that will be made precise, how close the two preconditioners are. To make the symmetry of the PSF H apparent, we fix the central pixel at the centre of H and, for the sake of simplicity, we assume n even. Therefore the 1D PSF we consider now is H = [h_{n/2−1}, ..., h_0, ..., h_{n/2−1}, 0] and the associated generating function is

$$f(x) = h_0 + 2\sum_{j=1}^{\frac{n}{2}-1} h_j \cos(jx), \qquad (13)$$
obtaining the 1D n × n blurring matrix A = T_n(f) in the Toeplitz case, which is related to Zero BCs (differently from the previous section, we stress that here T_n denotes an n × n Toeplitz matrix). We briefly recall the computation of the preconditioner proposed in [18]. First of all, the matrix T_n(f) is embedded in the symmetric circulant 2n × 2n matrix

$$C_{2n}(f) = \begin{pmatrix} T_n(f) & R\\ R & T_n(f)\end{pmatrix}, \qquad (14)$$

whose first column is given by [h_0, ..., h_{n/2−1}, 0, ..., 0, h_{n/2−1}, ..., h_1]^T. Then the eigenvalues of C_{2n}, computed via FFT, are inverted by means of the HNP regularization filter (8) in order to obtain a regularized inverse of C_{2n}. Finally, the preconditioner is selected as the first n × n principal submatrix of such a 2n × 2n regularized inverse of C_{2n}. The following proposition summarizes our result.

Proposition 1 Assume a family {T_n(f)}_n of n × n deblurring matrices with Zero BCs, where f denotes the generating function (13) of a real 1D fully-symmetric PSF. For any matrix T_n(f), let Z_{struct,n} denote the associated structure preserving preconditioner of Algorithm 1 and P_{HN,n} the associated inverse Toeplitz preconditioner by Hanke and Nagy [18], both regularized by the same Tikhonov filter (9). Then the two preconditioners Z_{struct,n} and P_{HN,n} are asymptotically equivalent, i.e. the following result holds:

$$\|Z_{\mathrm{struct},n} - P_{\mathrm{HN},n}\| \longrightarrow 0 \quad\text{as } n\to\infty,$$

where ‖·‖ denotes the spectral norm of matrices. In particular, ‖Z_{struct,n} − P_{HN,n}‖ = O(log(n) e^{−cn}), with c > 0.

Proof As a first step we explicitly compute the Z_{struct,n} preconditioner of T_n(f). Algorithm 1 considers the symmetric circulant matrix C_n(f) whose first column is [h_0, ..., h_{n/2−1}, 0, h_{n/2−1}, ..., h_1]^T. Hence step 1 of the algorithm computes the eigenvalues of C_n(f), which are

$$c_k = f\!\left(\frac{2k\pi}{n}\right) = h_0 + 2\sum_{j=1}^{\frac{n}{2}-1} h_j \cos\!\left(\frac{2jk\pi}{n}\right), \qquad k = 0,\dots,n-1.$$
At step 2 we adopt the Tikhonov filter (9), so we compute the regularized inverses v_k of the eigenvalues as

$$v_k = \frac{c_k}{c_k^2+\alpha}, \qquad k = 0,\dots,n-1. \qquad (15)$$

We can write v_k = g(2kπ/n), k = 0, ..., n−1, where g ∈ Π_{n−1} = span{cos(jx), j = 0, ..., n−1} is the trigonometric interpolating polynomial on the pairs (2kπ/n, v_k), k = 0, ..., n−1, that is

$$g(x) = \beta_0 + 2\sum_{j=1}^{n-1}\beta_j\cos(jx). \qquad (16)$$

Actually, it can easily be proved that g, although it depends on n interpolation conditions, has degree n/2. Indeed, by rewriting the interpolation conditions as

$$\left(\frac{2k\pi}{n}, v_k\right),\ k=0,\dots,\frac{n}{2}, \qquad\text{and}\qquad \left(\frac{(n/2+k)\pi}{n/2}, v_{n/2+k}\right),\ k=1,\dots,\frac{n}{2}-1,$$
and observing that v_k = v_{n−k}, for k = 1, ..., n/2−1, from (15) and the symmetry of g in (16) with respect to the interval [0, 2π], we have that g ∈ Π_{n/2}. For this reason, in the following we rename g as g_{n/2}. Steps 3 and 4 build the preconditioner as the matrix that preserves the structure of A and whose symbol is g_{n/2}; hence, for Zero BCs, the preconditioner finally is

$$Z_{\mathrm{struct},n} = T_n(g_{n/2}). \qquad (17)$$
Now we consider the P_{HN,n} preconditioner of T_n(f). The eigenvalues of the 2n × 2n circulant matrix C_{2n}(f) defined by (14) are

$$\mu_k = f\!\left(\frac{2k\pi}{2n}\right) = f\!\left(\frac{k\pi}{n}\right), \qquad k = 0,\dots,2n-1.$$

Applying again the Tikhonov filter (9) (unlike the HNP filter considered in [18]), we obtain the regularized inverses u_k of the eigenvalues as follows:

$$u_k = \frac{\mu_k}{\mu_k^2+\alpha}, \qquad k = 0,\dots,2n-1.$$
By means of an analogous reasoning to that performed before, it can be shown that there exists n that interpolates the 2n pairs kπ , u k , k = 0, . . . , 2n − 1. a unique polynomial h n ∈ n In fact, due to their symmetry, among the previous interpolation conditions only the ones relating to k = 0, . . . , n are distinct. The preconditioner PHN,n is then selected as the first n × n principal submatrix of the 2n × 2n circulant matrix C2n (h n ). Denoted by S the 2n × n matrix whose j th column is the j th element of the canonical basis of R2n , the preconditioner can be expressed as PHN,n = Tn (h n ) = S H C2n (h n )S . (18) We can now relate the two preconditioners Z struct,n = Tn (g n2 ) and PHN,n = Tn (h n ). Let us observe that defining f (x) , f 2 (x) + α n we have that g n2 interpolates ψ on n2 = 2kπ n , k = 0, . . . , 2 and h n interpolates the kπ n , then we can write , k = 0, . . . , n . Since n ⊂ n and h n ∈ same ψ on n = ψ(x) =
n
2
n is such that pn where pn ∈
2kπ n
h n = g n2 + pn = 0, for k = 0, . . . , n2 . On this ground we rewrite pn = E n2 − E n
where $E_{n/2} = \psi - g_{n/2}$ and $E_n = \psi - h_n$ are the classical remainders in the interpolation of $\psi$ on the nodes $\tfrac{2k\pi}{n}$ and $\tfrac{k\pi}{n}$, respectively. By virtue of the linearity of $T_n$, we have
$$\|T_n(g_{n/2}) - T_n(h_n)\| = \|T_n(g_{n/2}) - T_n(g_{n/2}) - T_n(p_n)\| = \|T_n(E_{n/2} - E_n)\| \le \|E_{n/2}\|_\infty + \|E_n\|_\infty, \quad (19)$$
from well-known properties of Toeplitz matrices, cf. [17], where $\|\cdot\|$ denotes the spectral norm of matrices and $\|\cdot\|_\infty$ is the classical supremum norm of functions in $[0, 2\pi]$. By construction, $E_n$ is the remainder function of the interpolation of $\psi$ on the $n+1$ Chebyshev–Lobatto nodes defined as $\cos\left(\tfrac{k\pi}{n}\right)$ for $k = 0, \ldots, n$. Its Lebesgue constant is known to grow as $k_1 \log(n)$, where $k_1$ is a constant. Thus, $\|E_n\|_\infty$ can be bounded in the following way
$$\|E_n\|_\infty \le k_1 \log(n)\, \|a_n\|_\infty,$$
where $\|a_n\|_\infty$ is the error of the best approximation of $\psi$ in the space of polynomials of degree at most n. Applying a similar reasoning to $E_{n/2}$, we can bound it as $\|E_{n/2}\|_\infty \le k_2 \log\left(\tfrac{n}{2}\right) \|a_{n/2}\|_\infty$. Because of the $C^\infty$ regularity of $\psi$, $\|a_r\|_\infty$ converges exponentially to zero as r tends to $+\infty$ by Jackson's Theorem. In conclusion, from (19) it follows that
$$\|T_n(g_{n/2}) - T_n(h_n)\| = O\left(\log(n)\, e^{-cn}\right), \qquad c > 0,$$
i.e. the two preconditioners $Z_{struct,n} = T_n(g_{n/2})$ and $P_{HN,n} = T_n(h_n)$ are asymptotically equivalent. It is interesting to notice that the equivalence result of Proposition 1 is confirmed by several numerical tests, where we obtained essentially the same deblurring accuracy and the same convergence speed with the two preconditioners. On the other hand, from a computational point of view, $Z_{struct,n}$ of (17) requires two FFTs of size n instead of the two FFTs of size 2n needed to obtain $P_{HN,n}$ of (18). The proof of Proposition 1 can be extended directly to two-dimensional problems, observing that the involved functions are even bivariate trigonometric polynomials and the 2D interpolation grid is simply defined by tensor product.
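The asymptotic equivalence can also be observed numerically. The sketch below (toy symbol f and filter parameter α are illustrative choices, not from the paper) builds the two trigonometric interpolants of ψ on the coarse and fine grids via FFT and measures their sup-norm gap, which shrinks quickly as n grows:

```python
import numpy as np

def trig_interpolant(f, alpha, m, x):
    """Values at the points x of the trigonometric polynomial interpolating
    psi = f/(f^2 + alpha) at the m equispaced nodes 2*k*pi/m."""
    nodes = 2 * np.pi * np.arange(m) / m
    psi = f(nodes) / (f(nodes)**2 + alpha)
    coeff = np.fft.ifft(psi)                    # Fourier coefficients
    freqs = np.fft.fftfreq(m, d=1.0/m)          # integer frequencies
    vals = np.zeros_like(x, dtype=complex)
    for c_j, k_j in zip(coeff, freqs):
        vals += c_j * np.exp(1j * k_j * x)
    return vals.real

f = lambda x: (1 + np.cos(x))**2 / 4            # toy smooth even symbol
alpha = 0.1
x = np.linspace(0, 2 * np.pi, 801)
gaps = []
for n in (8, 32):
    g = trig_interpolant(f, alpha, n, x)        # plays the role of g_{n/2}
    h = trig_interpolant(f, alpha, 2 * n, x)    # plays the role of h_n
    gaps.append(np.max(np.abs(g - h)))
# the gap decreases rapidly with n, reflecting the O(log(n) e^{-cn}) bound
```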
3.2 Relationships Between Structured and Circulant Preconditioners

Now we present a brief theoretical analysis highlighting the importance of employing structured preconditioners instead of circulant ones. For simplicity and clarity of notation, we consider again the one-dimensional case; the multidimensional case can be studied similarly. Thus, in this section we take into account the following 1D PSF of size n−1, with n even,
$$H = [h_{-n/2+1}, \ldots, h_{-1}, h_0, h_1, \ldots, h_{n/2-1}],$$
so that the associated $n \times n$ Toeplitz matrix is
$$T_H = \begin{pmatrix}
h_0 & h_{-1} & \cdots & h_{-\frac{n}{2}+1} & 0 & \cdots & \cdots & 0 \\
h_1 & h_0 & h_{-1} & \ddots & h_{-\frac{n}{2}+1} & \ddots & & \vdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots & \vdots \\
h_{\frac{n}{2}-1} & \ddots & h_1 & h_0 & h_{-1} & \ddots & \ddots & 0 \\
0 & h_{\frac{n}{2}-1} & \ddots & h_1 & h_0 & \ddots & \ddots & h_{-\frac{n}{2}+1} \\
\vdots & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots & \vdots \\
\vdots & & \ddots & \ddots & \ddots & \ddots & h_0 & h_{-1} \\
0 & \cdots & \cdots & 0 & h_{\frac{n}{2}-1} & \cdots & h_1 & h_0
\end{pmatrix} \in \mathbb{R}^{n \times n}.$$
As already said, if $A \in \mathbb{R}^{n \times n}$ is a blurring matrix built from a PSF H and BCs X, it can be written in this form
$$A = T_H + B_H^X, \quad (20)$$
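The decomposition (20) is easy to reproduce for a 1D PSF. In the hedged Python sketch below (kernel and size are toy choices), T_H is assembled with scipy.linalg.toeplitz, and for Periodic BCs the boundary part B^C_H is recovered as the difference between the circulant matrix and T_H:

```python
import numpy as np
from scipy.linalg import toeplitz, circulant

def toeplitz_from_psf(h, n):
    """T_H from a centred 1D PSF h = [h_{-m}, ..., h_0, ..., h_m]."""
    m = len(h) // 2
    col = np.zeros(n); col[:m + 1] = h[m:]        # h_0, h_1, ..., h_m
    row = np.zeros(n); row[:m + 1] = h[m::-1]     # h_0, h_{-1}, ..., h_{-m}
    return toeplitz(col, row)

def circulant_from_psf(h, n):
    """The matrix built from the same PSF with Periodic BCs."""
    m = len(h) // 2
    first_col = np.zeros(n)
    first_col[:m + 1] = h[m:]                     # h_0, ..., h_m
    first_col[-m:] = h[:m]                        # wrap-around h_{-m}, ..., h_{-1}
    return circulant(first_col)

n = 8
h = np.array([0.1, 0.2, 0.4, 0.2, 0.1])          # toy PSF, m = 2
T = toeplitz_from_psf(h, n)
C = circulant_from_psf(h, n)
B_periodic = C - T                                # boundary part B^C_H in (20)
```

The matrix B_periodic is non-zero only in the top-right and bottom-left corners, as illustrated for B^C_H in the proof of Proposition 2 below.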
where T denotes the Toeplitz part, while B denotes the part relative to the BCs. Now we introduce some definitions. In the following, $\|\cdot\|_F$ is the Frobenius norm.

Definition 1 Given a PSF $\widetilde{H}$, $\widetilde{H}$ is called an approximation of H if
$$\|T_{\widetilde{H}} - T_H\|_F < \|T_H\|_F$$
and if, for any BCs X such that $\|B_H^X\|_F > 0$,
$$\|B_{\widetilde{H}}^X - B_H^X\|_F < \|B_H^X\|_F.$$

Remark 1 We highlight that the requirement of Definition 1 is weak; in fact it means only that $\widetilde{H}$ approximates H better than the zero PSF does. This is important because it entails that the theoretical results reported in the following are valid not only for good approximations of H (as one might expect), but also for rough approximations of H. Therefore the weakness of Definition 1 gives rise to very general theoretical results.

Definition 2 Given a matrix A, built from a PSF H and BCs X, and a matrix B, built from a PSF K and BCs Y, the Toeplitz Boundary distance between A and B is defined as
$$d_{TB}(A, B) = \sqrt{\|T_H - T_K\|_F^2 + \|B_H^X - B_K^Y\|_F^2}.$$
It is easy to verify that $d_{TB}$ is a distance in the space of matrices of the form (20), since it inherits the required properties from the inner Frobenius norm. The only difference between the conventional Frobenius distance $\|A - B\|_F$ and the TB distance $d_{TB}(A, B)$ is that the former measures the global closeness between the two matrices, while the latter considers separately the Toeplitz part and the part relative to the BCs.

Proposition 2 In general, $d_{TB}(A, B) \ge \|A - B\|_F$. In particular, if A has Periodic or Zero BCs and if B has Periodic or Zero BCs, then $d_{TB}(A, B) = \|A - B\|_F$.

Proof The first statement follows from the triangle inequality
$$\|A - B\|_F^2 = \|T_H - T_K + B_H^X - B_K^Y\|_F^2 \le \|T_H - T_K\|_F^2 + \|B_H^X - B_K^Y\|_F^2 = [d_{TB}(A, B)]^2. \quad (21)$$
The second statement holds because for Periodic and Zero BCs the non-zero elements of the two matrices $T_H$ and $B_H^X$ ($T_K$ and $B_K^Y$ for B) do not overlap; indeed
$$B_H^T = 0, \qquad B_H^C = \begin{pmatrix}
0 & \cdots & 0 & h_{\frac{n}{2}-1} & \cdots & h_1 \\
\vdots & & & \ddots & \ddots & \vdots \\
0 & & & & \ddots & h_{\frac{n}{2}-1} \\
h_{-\frac{n}{2}+1} & \ddots & & & & 0 \\
\vdots & \ddots & \ddots & & & \vdots \\
h_{-1} & \cdots & h_{-\frac{n}{2}+1} & 0 & \cdots & 0
\end{pmatrix} \in \mathbb{R}^{n \times n}.$$
Therefore, since the Frobenius norm is computed componentwise, the inequality in (21) is simply an equality. □

Proposition 3 Let A be the matrix built from a PSF H and BCs X and let $\widetilde{H}$ be an approximation of H. Denoting by C the circulant matrix built from $\widetilde{H}$ and C (i.e. Periodic BCs), by T the Toeplitz matrix built from $\widetilde{H}$ and T (i.e. Zero BCs) and by S the structured matrix built from $\widetilde{H}$ and X, we have that

i) if X = C, then S = C and $d_{TB}(A, S) = d_{TB}(A, C) < d_{TB}(A, T)$;
ii) if X = T, then S = T and $d_{TB}(A, S) = d_{TB}(A, T) < d_{TB}(A, C)$;
iii) if X = R, AR, then $d_{TB}(A, S) < d_{TB}(A, T) < d_{TB}(A, C)$.

Proof In the first case, we have
$$[d_{TB}(A, S)]^2 = [d_{TB}(A, C)]^2 = \|T_{\widetilde{H}} - T_H\|_F^2 + \|B_{\widetilde{H}}^C - B_H^C\|_F^2 < \|T_{\widetilde{H}} - T_H\|_F^2 + \|B_H^C\|_F^2 = [d_{TB}(A, T)]^2,$$
since by hypothesis $\widetilde{H}$ is an approximation of H. Similarly, in the second case, we have
$$[d_{TB}(A, S)]^2 = [d_{TB}(A, T)]^2 = \|T_{\widetilde{H}} - T_H\|_F^2 < \|T_{\widetilde{H}} - T_H\|_F^2 + \|B_{\widetilde{H}}^C\|_F^2 = [d_{TB}(A, C)]^2.$$
In the third case, we get
$$\begin{aligned}
[d_{TB}(A, C)]^2 &= \|T_{\widetilde{H}} - T_H\|_F^2 + \|B_H^{R,AR} - B_{\widetilde{H}}^C\|_F^2 \\
&= \|T_{\widetilde{H}} - T_H\|_F^2 + \|B_H^{R,AR}\|_F^2 + \|B_{\widetilde{H}}^C\|_F^2 \\
&> \|T_{\widetilde{H}} - T_H\|_F^2 + \|B_H^{R,AR}\|_F^2 = [d_{TB}(A, T)]^2 \\
&> \|T_{\widetilde{H}} - T_H\|_F^2 + \|B_H^{R,AR} - B_{\widetilde{H}}^{R,AR}\|_F^2 = [d_{TB}(A, S)]^2.
\end{aligned}$$
The second equality holds since at the indices (i, j) in which the matrix $B_{\widetilde{H}}^C$ is non-zero (located in the bottom-left and top-right corners), the matrix $B_H^{R,AR}$ is zero, while at the indices (i, j) in which this latter matrix is non-zero (located in the top-left and bottom-right corners), the former matrix is zero. Therefore, since the Frobenius norm is computed componentwise, we have that $\|B_H^{R,AR} - B_{\widetilde{H}}^C\|_F^2 = \|B_H^{R,AR}\|_F^2 + \|B_{\widetilde{H}}^C\|_F^2$. Moreover, by removing $\|B_{\widetilde{H}}^C\|_F^2$, we get the first inequality. The second inequality holds since, by hypothesis, $\widetilde{H}$ is an approximation of H. □

Proposition 3 is a quite general result. In order to apply it in the context of the (structure preserving) reblurring preconditioning, one aspect that needs to be clarified is that, even if both $Z_{circ}$ and $Z_{struct}$ can be written in the form (20), that is
$$Z_{circ} = T_{\widetilde{H}} + B_{\widetilde{H}}^C, \qquad Z_{struct} = T_{\widetilde{H}} + B_{\widetilde{H}}^X, \quad (22)$$
now H̃ is not properly a PSF, but a coefficient mask. In spite of this, Definitions 1–2 are still valid, once such a modification has been taken into account. Now our aim is to study preconditioning and to analyse our structure preserving proposal, motivating why we think it can be considered a suitable choice. Thus, we take into account a structured blurring matrix A of the form (20) and we choose a filtered inversion process FIP [14,15], which involves both the choice of the inversion filter and the setting of the regularization parameter. In general, a FIP is a family of regularized approximations of the inverse function, that is, in the simplest real case, a family of piecewise continuous bounded functions that approximate the unbounded inverse function 1/λ for λ > 0. The "level" of approximation is tuned by a regularization parameter, as in the theory of Tikhonov
regularization, which allows us to choose a certain filtered inverse among all the functions of the family, according to the filtering level we need. The eigenvalues of the n × n structured preconditioner P are given by the application of a FIP to the eigenvalues of the n × n structured blurring matrix A. If A is a positive semidefinite Hermitian matrix, simple FIPs come from the approximation of the real function 1/λ by a family of neighbouring functions, which are piecewise continuous over the closure [0, ‖A‖] of the spectrum of A (cf. Definition 1 of [15]). In the general case we are interested in, we generalize the same idea to complex numbers, as done in (8) and (9) for the HNP and Tikhonov two-dimensional filtering, by considering the conjugate values. Thus, we seek a matrix P such that

i) the eigenvalues λ̂_j of P are obtained by applying a FIP to the eigenvalues λ_j of A, for j = 1, ..., n;
ii) the eigenvalues of the (preconditioned) matrix PA are λ̂_j λ_j, for j = 1, ..., n.

We call such a matrix P a FIP-based preconditioner of A. It is important to remark that the eigenvalues of the preconditioned matrix PA are real, since the FIP computes approximations of the conjugate of the inverse of the complex eigenvalues of A. For instance, in the case of HNP filtering (8), $\hat{\lambda}_j \lambda_j = \bar{\lambda}_j \lambda_j / |\lambda_j|^2 = 1$, for $|\lambda_j| \ge \zeta$, and $\hat{\lambda}_j \lambda_j = \bar{\lambda}_j \lambda_j = |\lambda_j|^2$, for $|\lambda_j| < \zeta$; in the case of Tikhonov filtering (9), $\hat{\lambda}_j \lambda_j = |\lambda_j|^2 / (|\lambda_j|^2 + \alpha)$. On these grounds, we introduce the next theoretical result.

Proposition 4 For any matrix A and any filtered inversion process FIP, there exists a FIP-based preconditioner P of A.

Proof Let $A = Q^{-1} U Q$ be a decomposition of A based on a unitary matrix Q, where U is an upper-triangular matrix. The existence of such an expression is guaranteed by the Schur decomposition, which on the other hand does not guarantee its uniqueness.
We can also write $U = \Lambda + N$, where N is a strictly upper-triangular matrix and $\Lambda$ is a diagonal matrix composed of the eigenvalues of A. We define $P = Q^{-1}(\hat{\Lambda} + N)Q$, where $\hat{\Lambda}$ is the diagonal matrix with elements $\hat{\lambda}_j$ obtained by applying the FIP to each eigenvalue $\lambda_j$ of $\Lambda$. Hence the $\hat{\lambda}_j$ are the eigenvalues of P and
$$PA = Q^{-1}(\hat{\Lambda}\Lambda + \widetilde{N})Q, \qquad \widetilde{N} = N\Lambda + \hat{\Lambda}N + N^2,$$
where $\widetilde{N}$ is a strictly upper-triangular matrix and $\hat{\Lambda}\Lambda$ is a diagonal matrix, so the eigenvalues of the preconditioned matrix PA are $\hat{\lambda}_j \lambda_j$. □

Even if the last proposition shows how to construct P, usually this procedure is inapplicable in practice or computationally expensive. Moreover, in general P does not have a familiar matrix structure, i.e. it cannot be written as $P = T_{\hat{H}} + B_{\hat{H}}^{\hat{X}}$, for a certain coefficient mask Ĥ and certain BCs X̂. However, it is worth noting that, in some special cases, we can quickly compute the eigenvalues of A by applying a trigonometric transform to H, then apply a FIP to the λ_j and finally obtain Ĥ by the inverse trigonometric transform. For instance, in the case of Periodic BCs, we have $A = F^{-1}\Lambda F$, where F is the Fourier matrix, and $P = F^{-1}\hat{\Lambda}F = T_{\hat{H}} + B_{\hat{H}}^C$. In case of a quadrantally symmetric PSF, analogous results hold for Reflective or Anti-Reflective BCs, and we have $P = T_{\hat{H}} + B_{\hat{H}}^{R,AR}$. Based on these observations, we want to generalize this approach to all the other cases, defining a structured preconditioner linked with the chosen filtered inversion process. Therefore we seek a matrix $\bar{Z} = T_{\bar{H}} + B_{\bar{H}}^X$ such that, for any FIP-based preconditioner P of A and every matrix Z of the form $T_K + B_K^X$,
$$\|P_0 - \bar{Z}\|_F \le \|P - Z\|_F, \quad (23)$$
where $P_0$ is a certain FIP-based preconditioner of A. We call such an n × n matrix $\bar{Z}$ a FIP-based structured preconditioner of A. Again, H̄ is difficult to compute, so we have to search for an approximation of it. This is exactly the purpose of Algorithm 1, in which H̃ is computed by moving to the circulant algebra and making use of the FFT. In conclusion, by using Proposition 3 (with H̄ playing the role of H, H̃ that of the approximation, $\bar{Z}$ that of A, $Z_{circ}$ that of C and $Z_{struct}$ that of S), we get the next corollary, which states (under the assumption that H̃ is an approximation of H̄) that our structure preserving preconditioner is more accurate than the circulant one in the sense of the Toeplitz Boundary distance of Definition 2.

Corollary 1 For Zero, Reflective and Anti-Reflective BCs, given a filtered inversion process FIP, let $\bar{Z}$ be a FIP-based structured preconditioner (built from H̄), let H̃ be an approximation of H̄, and let $Z_{struct}$ and $Z_{circ}$ be respectively the structured and the circulant preconditioner (both built from H̃). Then
$$d_{TB}(\bar{Z}, Z_{struct}) < d_{TB}(\bar{Z}, Z_{circ}).$$

Concerning the spectral clustering provided by our structured preconditioner: if A is invertible, it is optimal in the noise-free case choosing α = 0, since it reduces to the proposal in [28]. In the noisy case, a positive filtering parameter α > 0 prevents matching the zeros of the symbol f, and hence the preconditioner cannot be optimal. Anyway, α specifies the "level" of the approximation of the inverse and hence the convergence speed of the method.
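The existence argument of Proposition 4 is constructive and can be sketched numerically via the complex Schur form. The Python illustration below (with the Tikhonov filter (9) and a made-up non-normal test matrix; as remarked in the text, this is an existence check, not a practical preconditioner) verifies that the eigenvalues of PA are the real values λ̂_j λ_j:

```python
import numpy as np
from scipy.linalg import schur

def fip_preconditioner(A, alpha):
    """FIP-based preconditioner via the complex Schur form A = Zq U Zq^H,
    using the Tikhonov filter lambda_hat = conj(lambda)/(|lambda|^2 + alpha)."""
    U, Zq = schur(A, output='complex')          # Zq unitary, U upper-triangular
    lam = np.diag(U)                            # eigenvalues of A
    lam_hat = lam.conj() / (np.abs(lam)**2 + alpha)
    N = np.triu(U, k=1)                         # strictly upper-triangular part
    P = Zq @ (np.diag(lam_hat) + N) @ Zq.conj().T
    return P, lam, lam_hat

rng = np.random.default_rng(0)
# non-normal toy matrix with distinct eigenvalues 1, ..., 5
A = np.triu(rng.standard_normal((5, 5)), k=1) + np.diag(np.arange(1.0, 6.0))
P, lam, lam_hat = fip_preconditioner(A, alpha=1e-1)
# the eigenvalues of P @ A are |lam|^2 / (|lam|^2 + alpha), all real
```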
4 Non-Stationary Preconditioning

When we deal with a stationary regularization method, we always have to face the non-trivial task of determining a good choice for the filtering parameter α. In [11] the authors proposed a non-stationary version of the Z reblurring preconditioner where the parameter α is dynamically estimated at every iteration instead of being fixed a priori. The iteration is the following
$$x_{k+1} = x_k + Z_{circ}^k r_k, \qquad Z_{circ}^k = C^*(CC^* + \alpha_k I)^{-1}, \qquad r_k = b - Ax_k, \quad (24)$$
where $C = C_n(f)$ is the BCCB matrix associated with H. Note that if $\alpha_k = \alpha$, then the iteration (24) is exactly (7) with $Z = C_n(g)$, where g is defined in (11) imposing (12) as in [8]. $Z_{circ}^k$ is simply the BCCB matrix obtained by Algorithm 1 with $\alpha_k$ in place of α and with Periodic BCs. In [11], denoting by ‖·‖ the Euclidean norm, the iteration-dependent regularization parameter $\alpha_k$ is obtained by solving the following non-linear equation
$$\|r_k - C Z_{circ}^k r_k\| = q_k \|r_k\|, \qquad 0 < q_k < 1, \quad (25)$$
with a few steps of the Newton iteration. Here the parameter $q_k$ depends on the noise level and is related to a value $0 < \rho_{circ} < 1/2$, which should be as small as possible and satisfies the assumption
$$\|(C - A)z\| \le \rho_{circ} \|Az\|, \qquad \forall z \in \mathbb{R}^n. \quad (26)$$
This parameter $\rho_{circ}$, which must be set in the algorithm, is associated with the relative distance between A and C. If A is well approximated by its BCCB counterpart C, then $\rho_{circ}$ can be chosen as a very small value. Indeed, if C is very close to A, then C − A is almost the null matrix. Of course a larger $\rho_{circ}$ also satisfies the assumption (26), but, since $\rho_{circ}$ influences the stopping criterion (30), an unnecessarily large $\rho_{circ}$ could stop the iterations too early, when the reconstruction is not yet accurate enough (see [11] for more details). From a numerical point of view, the parameter $\rho_{circ}$ is usually chosen among the values $\{10^{-1}, 10^{-2}, 10^{-3}, 10^{-4}\}$. A too small $\rho_{circ}$ can be easily recognized by looking at the sequence of the regularization parameters. The iterations are stopped under a special choice of the discrepancy principle that will be discussed later. From a theoretical point of view, in [11] it is proved that, under the assumptions (25) and (26), the iteration (24) converges monotonically and is a regularization method.

In this section we extend the non-stationary iteration (24) to take into account the BCs of the problem, following our structure preserving strategy. More precisely, we consider the following iteration
$$x_{k+1} = x_k + Z_{struct}^k r_k, \qquad r_k = b - Ax_k, \quad (27)$$
where $Z_{struct}^k$ is the structure preserving matrix built by means of Algorithm 1 with a special choice of the regularization parameter $\alpha_k$. In practice, we would estimate $\alpha_k$ by solving
$$\|r_k - A Z_{struct}^k r_k\| = q_k \|r_k\|, \quad (28)$$
which is not computationally practicable. Therefore, we estimate the regularization parameter $\alpha_k$ using an easily computable approximation of (28), namely equation (25). A similar strategy is used in [6] for different regularization methods. Note that the two iterations (24) and (27) produce different sequences of $\alpha_k$, even if both use equation (25) to estimate $\alpha_k$, because the sequences of the residuals $\{r_k\}_k$ in the two iterative schemes are different. Furthermore, we observe that for such a modified version of (24) the condition (26) does not make sense, and hence we are allowed to introduce a new parameter $\rho_{struct}$, whose value need not match that of $\rho_{circ}$. Note that, since $Z_{struct}^k$ provides a better approximation of $A^*(AA^* + \alpha_k I)^{-1}$ with respect to $Z_{circ}^k$, it is expected that $\rho_{struct} < \rho_{circ}$. On the other hand, we cannot prove convergence results as in [11], even if the numerical results in Sect. 5 show that our structured non-stationary preconditioning is robust and very effective. As shown in [11], another suitable choice of the parameter $\alpha_k$ for iteration (24) is given by the geometric sequence
$$\alpha_k = \alpha q^k, \qquad k = 0, 1, \ldots, \quad (29)$$
where α > 0 and 0 < q < 1. In the next section we confirm the effectiveness of the iteration (27) with both sequences of $\{\alpha_k\}$ obtained by (25) or (29).
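A minimal sketch of the non-stationary scheme with the geometric sequence (29) is given below for a 1D periodic (circulant) blur, where $Z^k$ is applied via FFTs; this is the circulant variant (24), and the structured method of this paper would replace $Z_{circ}^k$ with the preconditioner of Algorithm 1. Signal, kernel and parameters are toy choices for illustration:

```python
import numpy as np

def nonstationary_deblur(kernel, b, alpha0=0.5, q=0.7, iters=20):
    """Iteration (24) with the geometric sequence (29) for a 1D circulant
    blur: Z_k = C^*(CC^* + a_k I)^{-1}, applied in the Fourier domain."""
    c = np.fft.fft(kernel)                 # eigenvalues of C
    x = b.copy()
    for k in range(iters):
        a_k = alpha0 * q**k                # geometric sequence (29)
        r = b - np.fft.ifft(c * np.fft.fft(x)).real      # r_k = b - A x_k
        z = c.conj() / (np.abs(c)**2 + a_k)              # eigenvalues of Z_k
        x = x + np.fft.ifft(z * np.fft.fft(r)).real      # x_{k+1}
    return x

# toy noise-free test problem: smooth signal, periodic blur kernel
n = 64
t = np.arange(n)
x_true = np.sin(2 * np.pi * t / n) + 0.5 * np.cos(4 * np.pi * t / n)
kernel = np.zeros(n)
kernel[[0, 1, n - 1, 2, n - 2]] = [0.4, 0.2, 0.2, 0.1, 0.1]
b = np.fft.ifft(np.fft.fft(kernel) * np.fft.fft(x_true)).real
x_rec = nonstationary_deblur(kernel, b)
```

In this noise-free setting the iterates converge quickly to the true signal; with noise, the stopping rule of the next section would be applied instead of a fixed iteration count.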
5 Numerical Results

Now we provide some numerical tests for the image deblurring problem of type (1). In each example we impose appropriate BCs and solve the linear system Ax = b using both the stationary and the non-stationary preconditioned iterations (7) and (27). In more detail, in the stationary case, for a few fixed values of the parameter α, we compare the performance of the preconditioner $Z_{struct}$ with $Z_{circ}$. In the non-stationary case our attention is focused on the comparison between the preconditioners $Z_{struct}^k$ and $Z_{circ}^k$. Regarding the sequence $\{\alpha_k\}$ of the regularization parameters, we investigate the behaviour of both the geometric sequence defined in (29) (labeled 'geometric' in the following) and the sequence computed by solving (25) (labeled 'DH', for Donatelli–Hanke, throughout). For the geometric sequence we fix α = 0.5 and q = 0.7 in (29), as suggested in [11]. Furthermore, in our tests we also consider CGLS and PCGLS, the latter employing a structured preconditioner created by means of Algorithm 1 with a proper filter, as described in Sect. 3. The choice of the α related to this filter, which in general is a problematic issue, is made manually.
When applied to ill-posed problems, like image deblurring, many iterative methods exhibit a semi-convergence behaviour with respect to the relative error computed on the true solution (the true image). The term semi-convergence means that the relative error decreases until a certain "optimal" iteration is reached and then begins to increase because of the presence of the noise η in Eq. (1). By stopping the iterations when the error is low, we obtain a regularized approximation of the solution. In this regard, we use the well-known discrepancy principle stopping rule, which stops the iterative methods at the first iteration m that satisfies the condition
$$\|Ax_m - b\| \le \gamma \|\eta\|, \quad (30)$$
where $x_m$ is the approximation provided by the method at the m-th iteration and γ > 1. We fix γ = 1.01, except for the DH technique, where $\gamma = (1 + 2\rho_{circ})/(1 - 2\rho_{circ})$ or $\gamma = (1 + 2\rho_{struct})/(1 - 2\rho_{struct})$, according to the choice of the preconditioner ($Z_{circ}^k$ or $Z_{struct}^k$) for the corresponding iterations (24) and (27). Note that the discrepancy principle usually stops the iterations just before the best restoration (associated with the highest PSNR). The initial guess $x_0$ is always taken as the observed image b. Assuming the true greyscale s × s image, with pixels scaled in [0, 1], is known, and letting x be its stacking-order vector of size $s^2$, we measure the quality of the k-th iterate $x_k$ by the Peak Signal-to-Noise Ratio (PSNR), defined as
$$\mathrm{PSNR} = 20 \log_{10} \frac{s}{\|x - x_k\|}. \quad (31)$$
We refer to the maximum PSNR and to the PSNR corresponding to the discrepancy principle iteration as PSNR_max and PSNR_discr, respectively. Moreover, we take into account the number of iterations, and we mainly focus on preconditioned Landweber methods. In this regard, we stress that the computational cost of classical Landweber and Z-Landweber is the same; the only additional cost is due to the construction of Z, which can be carried out by means of two FFTs (see Algorithm 1).
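The two quality measures can be coded directly. The following hedged Python helpers implement the PSNR of (31) and the discrepancy stopping index of (30), the latter on a precomputed sequence of residual norms:

```python
import numpy as np

def psnr(x_true, x_k, s):
    """PSNR of Eq. (31) for an s-by-s image with pixels in [0, 1],
    x_true and x_k given as stacked vectors of length s*s."""
    return 20 * np.log10(s / np.linalg.norm(x_true - x_k))

def discrepancy_index(residual_norms, eta_norm, gamma=1.01):
    """First iteration m with ||A x_m - b|| <= gamma * ||eta|| (Eq. (30));
    returns None if the discrepancy principle is never satisfied."""
    for m, r in enumerate(residual_norms):
        if r <= gamma * eta_norm:
            return m
    return None
```

When the principle is never satisfied, the tables below mark the corresponding entry with the symbol "–(–)".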
Moreover, Landweber is usually a very slow method, which takes a large number of steps to converge to its best reconstruction (see for instance [8]). On these grounds, in the following we present several numerical tests, which have been developed in MATLAB.
5.1 Example 1

We start with the deblurring problem reported in Fig. 1. The PSF is a motion blur and is highly non-symmetric; the noise level is 1%. As usually done, we take a larger image and then crop it (see the white box in Fig. 1a) to simulate actual blurring. In this example we impose Reflective BCs. In Table 1 we compare the $Z_{struct}$, $Z_{circ}$ and $Z_{DCT}$ preconditioners for α = 0.5, 0.1, 0.05, 0.01. Here $Z_{DCT}$ is a structured preconditioner like $Z_{struct}$, but, unlike $Z_{struct}$, it is built from a symmetrized version of the original PSF, which allows one to obtain the eigenvalues by DCT. As expected, since the PSF is highly non-symmetric, $Z_{DCT}$ shows very poor performance, so we do not consider it further in this example. We observe that both the PSNR_max and the PSNR_discr provided by the $Z_{struct}$ preconditioner are bigger than the PSNR_max obtained using the $Z_{circ}$ one. Furthermore, the discrepancy principle does not work for the $Z_{circ}$ preconditioner when α = 0.5, 0.1, 0.05 (shown as the − symbol in Table 1). Figures 2a, b show the PSNR versus the iteration number of the non-stationary preconditioning techniques (27) and (24), that is, with the $Z_{struct}^k$ and $Z_{circ}^k$ preconditioners, for both the geometric and DH sequences. In Fig. 2a the two (dashed) lines relative to the DH approach end as soon as the discrepancy principle (30) is satisfied. Indeed, for the DH approach, the parameter $\alpha_k$ cannot be computed when (30) holds. On the other hand, in the same figure,
Fig. 1 Example 1: a true image; b PSF; c blurred and noisy image

Table 1 Example 1: PSNR_max and PSNR_discr and corresponding iterations (in parentheses) for Z_struct, Z_circ and Z_DCT preconditioners

           α = 0.5                  α = 0.1                  α = 0.05                 α = 0.01
           PSNR_max    PSNR_discr   PSNR_max    PSNR_discr   PSNR_max    PSNR_discr   PSNR_max    PSNR_discr
Z_struct   25.69 (46)  25.62 (34)   25.78 (10)  25.75 (8)    25.82 (5)   25.82 (5)    25.81 (1)   25.69 (2)
Z_circ     25.27 (31)  –(–)         25.36 (7)   –(–)         25.45 (4)   –(–)         25.61 (1)   25.24 (2)
Z_DCT      23.18 (4)   –(–)         23.31 (2)   –(–)         23.48 (1)   –(–)         23.70 (1)   –(–)
Fig. 2 Example 1: a Comparison between PSNRs for Z^k_struct geometric (solid blue line), Z^k_circ geometric (solid red line), Z^k_struct DH (dashed blue line) and Z^k_circ DH (dashed red line), for ρ_struct = 10^−2 and ρ_circ = 10^−1; b Comparison between Z^k_struct DH (dashed blue line) and Z^k_circ DH (dashed red line), for ρ_struct = 10^−3 and ρ_circ = 10^−2. Key to symbols: (◦) optimal iteration, (□) discrepancy iteration
the two (solid) lines relative to the geometric sequence show the classical semi-convergence behaviour, with the indication of the optimal iteration (see ◦), corresponding to the PSNR_max, and of the iteration m in (30) where the discrepancy principle is satisfied (see □). In the non-stationary $Z_{circ}^k$ context, we fix $\rho_{circ} = 10^{-1}$. Such a choice of $\rho_{circ}$ is due to the fact that for smaller values of this parameter the $Z_{circ}^k$ method does not converge (see the behaviour of the PSNR for $Z_{circ}^k$ with the DH sequence in Fig. 2b, in which $\rho_{circ} = 10^{-2}$). Regarding the parameter $\rho_{struct}$, we use both $10^{-2}$ and $10^{-3}$ and observe that the $Z_{struct}^k$
Table 2 Example 1: PSNR_max, PSNR_discr and corresponding iterations (in parentheses) for Z^k_struct, Z^k_circ preconditioners for both geometric sequence and DH sequence and for CGLS and PCGLS. We fix ρ_struct = 10^−2 and ρ_circ = 10^−1

             Geometric                  DH                         CGLS         PCGLS
             Z^k_struct   Z^k_circ     Z^k_struct   Z^k_circ
PSNR_max     25.86 (9)    25.24 (8)    n/a          n/a           25.34 (9)    25.45 (2)
PSNR_discr   25.75 (8)    –(–)         25.70 (6)    24.58 (3)     25.07 (12)   25.10 (3)
preconditioner works well for both choices of $\rho_{struct}$. Note that the previous behaviour of the parameters $\rho_{circ}$ and $\rho_{struct}$ agrees with what was pointed out in Sect. 4. This highlights an appreciable stability of our algorithm with respect to the parameter $\rho_{struct}$. Although $Z_{struct}^k$ is slightly more accurate for $\rho_{struct} = 10^{-3}$ (PSNR_discr = 25.75) than for $\rho_{struct} = 10^{-2}$ (PSNR_discr = 25.70) (compare also Fig. 2a with Fig. 2b), we opt for the latter, since in this case the non-stationary structure preserving method is faster (6 iterations rather than 10).

The better performance of $Z_{struct}^k$ with respect to $Z_{circ}^k$ can be appreciated again in Table 2 (where 'n/a' means that the corresponding PSNR_max is not available). In that table we also report results relative to CGLS and PCGLS (with α = 0.05), the latter being able to obtain restorations of slightly better accuracy than CGLS in fewer steps. It is clear that not only the PSNR_max, but also the PSNR_discr of both $Z_{struct}^k$ algorithms (geometric and DH) are bigger than the PSNR_max provided by CGLS, PCGLS and $Z_{circ}^k$. We stress that the discrepancy principle does not work for the $Z_{circ}^k$ preconditioner with the geometric sequence. A comparison between the restorations in Fig. 3a, e and the ones in Fig. 3c, g highlights the effectiveness of the proposed non-stationary structure preserving preconditioners with respect to the corresponding circulant ones. In particular, looking at Fig. 3b, d, f and h, which report the top-right corner of each restoration, we can see that the reconstructions relative to the circulant approach have a higher error along the boundary.
5.2 Example 2

The second example is the deblurring problem reported in Fig. 4. The PSF is not far from being symmetric and the noise level is 0.2%. As usually done, we take a larger image and then crop it (see the white box in Fig. 4a) to simulate actual blurring. We choose to employ Anti-Reflective BCs. In Table 3, we fix α = 0.01, 0.005, 0.001, 0.0005 and show the PSNR_max, PSNR_discr and corresponding iterations provided by the scheme (7) with the $Z_{circ}$, $Z_{struct}$ and $Z_{ART}$ preconditioners. Here $Z_{ART}$ is the preconditioner built from a symmetrized version of the original PSF, making use of the Anti-Reflective transform (ART). We notice that in most cases (the only exceptions being $Z_{struct}$ for α = 0.01, 0.005) the discrepancy principle does not work, probably because of the very low level of noise that characterizes the problem. As in Example 1, the PSNRs relative to $Z_{struct}$ are bigger than all the other ones. However, unlike Example 1, since we are in the Anti-Reflective framework, the preconditioner $Z_{ART}$ (built from the symmetrized version of the original PSF) shows good performance, while $Z_{circ}$ gives rise to low-quality restorations (especially for small values of α). These facts can be visually appreciated by looking at the restored images reported in Fig. 5.
Fig. 3 Example 1: a Discrepancy reconstruction with Z^k_struct geometric and b zoom of the top-right corner; c Optimal reconstruction with Z^k_circ geometric and d zoom of the top-right corner; e Discrepancy reconstruction with Z^k_struct DH and f zoom of the top-right corner; g Discrepancy reconstruction with Z^k_circ DH and h zoom of the top-right corner
As shown in Table 4, for the two sequences of filtering parameters $\alpha_k$ (geometric and DH), $Z_{struct}^k$, $Z_{circ}^k$ and $Z_{ART}^k$ exhibit a similar behaviour in terms of speed; however, $Z_{circ}^k$ reaches a lower accuracy. We stress that, except for the DH strategy, in all the other cases the discrepancy principle fails. Even if CGLS is the slowest and the least accurate among the methods taken into account, its performance can be greatly improved by the use of a suitable
Fig. 4 Example 2: a true image; b PSF; c blurred and noisy image
Table 3 Example 2: PSNR_max and PSNR_discr and corresponding iterations (in parentheses) for Z_struct, Z_circ, Z_ART preconditioners

           α = 0.01                 α = 0.005                α = 0.001                α = 0.0005
           PSNR_max    PSNR_discr   PSNR_max    PSNR_discr   PSNR_max    PSNR_discr   PSNR_max    PSNR_discr
Z_struct   20.91 (61)  20.84 (87)   20.91 (31)  20.81 (47)   20.89 (6)   –(–)         20.87 (3)   –(–)
Z_circ     20.39 (40)  –(–)         19.92 (5)   –(–)         19.86 (1)   –(–)         19.94 (1)   –(–)
Z_ART      20.68 (58)  –(–)         20.69 (29)  –(–)         20.74 (6)   –(–)         20.77 (4)   –(–)
structured preconditioner. Indeed PCGLS (with α = 0.0004) is in this case competitive with $Z_{struct}^k$ and $Z_{ART}^k$. The efficacy of these two strategies appears to be very similar. In particular, the first method is the best for the geometric sequence, while the second is the best for the DH sequence. Regarding this sequence in relation to $Z_{ART}^k$, we underline that in this example it is necessary to choose $\rho_{ART}$ very carefully (in particular equal to $5 \times 10^{-2}$) in order to obtain an effective method. Indeed, moving away from that value, one soon gets a significant loss of accuracy. In general, in the case of "almost" symmetric PSFs, we may observe that $Z_{ART}^k$ ($Z_{DCT}^k$ for problems with Reflective BCs) exceeds the proposed $Z_{struct}^k$ in terms of accuracy. Surprisingly enough, this seems to be caused only in small part by the different transforms involved (i.e. FFT for $Z_{struct}^k$, ART and DCT for $Z_{ART}^k$ and $Z_{DCT}^k$, respectively), while the main role seems to be played by the symmetrization process, which has a positive effect for PSFs close to symmetry (while it has a negative effect for strongly non-symmetric PSFs, as observed in Example 1). Therefore, in order to exploit this possibility, one may consider adding to Algorithm 1 a symmetrization step that is switched on or off depending on the symmetry properties of the PSF at hand.
5.3 Total Variation Methods

For the sake of completeness, and in order to show the efficacy of our proposal with respect to well-known techniques, we recall an approach which holds an important place in image restoration, namely Total Variation (TV) [27]. In particular we focus on TV regularization models for deblurring, which have been extensively studied (see for instance [1,23,24]). For making tests and comparing numerical results, here we consider the Fast Alternating Minimization algorithm for Total Variation deblurring (FAMTV) [3]. Unlike most TV strategies,
Fig. 5 Example 2: a Optimal reconstruction with Z_struct; b Optimal reconstruction with Z_circ; c Optimal reconstruction with Z_ART

Table 4 Example 2: PSNR_max, PSNR_discr and corresponding iterations (in parentheses) for Z^k_struct, Z^k_circ, Z^k_ART preconditioners for both geometric sequence and DH sequence and for CGLS and PCGLS. We fix ρ_struct = 10^−2, ρ_circ = 10^−1 and ρ_ART = 5 × 10^−2

             Geometric                              DH                                     CGLS          PCGLS
             Z^k_struct  Z^k_circ    Z^k_ART       Z^k_struct  Z^k_circ    Z^k_ART
PSNR_max     20.91 (20)  20.36 (19)  20.73 (20)    n/a         n/a         n/a            19.98 (116)   20.71 (24)
PSNR_discr   –(–)        –(–)        –(–)          20.69 (16)  20.29 (12)  20.78 (10)     –(–)          –(–)
Fig. 6 a FAMTV (with Reflective BCs) optimal reconstruction relative to Example 1 (PSNR = 24.02); b FAMTV (with Anti-Reflective BCs) optimal reconstruction relative to Example 2 (PSNR = 20.52)
which are based on FFT, so that they must employ Periodic BCs, FAMTV allows one to deal with Reflective and Anti-Reflective BCs and also with non-symmetric PSFs. Therefore we think it is a good choice for a fair comparison with the methods proposed in this paper. In Fig. 6 we report the best reconstructions obtained by FAMTV for Example 1 and Example 2. In accordance with the previous tests, in the first case we employ Reflective BCs, while in the second one we employ Anti-Reflective BCs. We underline that, in order to get a good restoration, FAMTV needs its regularization parameter α to be properly set (see [3] for details). We choose the best α, i.e. the one that gives the highest PSNR, by trial and error, testing several values of α. Looking at the PSNR results, as reported in the captions of Fig. 6a, b, we notice that in Example 1 FAMTV gets a PSNR (24.02) bigger than $Z_{DCT}$ (23.70) but smaller than $Z_{circ}$ (25.61) and $Z_{struct}$ (25.82) (see Table 1), while in Example 2 it gets a PSNR (20.52) bigger than $Z_{circ}$ (20.39) but smaller than $Z_{ART}$ (20.77) and $Z_{struct}$ (20.91)
(see Table 3). Therefore, even though further tests and investigations would be needed, we can say that our proposal (based on Z struct) is competitive with state-of-the-art techniques.
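The trial-and-error selection of the regularization parameter described above can be sketched as follows. This is a toy Python sketch, not the FAMTV algorithm: as a stand-in restoration we use plain Tikhonov filtering in the Fourier domain (Periodic BCs), and the test image, PSF and all names are hypothetical.

```python
import numpy as np

def psnr(x, x_true):
    # Peak signal-to-noise ratio, with the peak taken as the maximum of
    # the true image.
    mse = np.mean((x - x_true) ** 2)
    return 10.0 * np.log10(x_true.max() ** 2 / mse)

def tikhonov_restore(b, otf, alpha):
    # Stand-in restoration: Tikhonov filtering in the Fourier domain
    # (Periodic BCs). FAMTV itself is more elaborate; this only serves to
    # illustrate the parameter search.
    return np.real(np.fft.ifft2(np.conj(otf) * np.fft.fft2(b)
                                / (np.abs(otf) ** 2 + alpha)))

# Hypothetical test problem: a bright square blurred by a 5x5 box PSF.
rng = np.random.default_rng(0)
x_true = np.zeros((64, 64))
x_true[24:40, 24:40] = 1.0
psf = np.zeros((64, 64))
psf[:5, :5] = 1.0 / 25.0
otf = np.fft.fft2(np.roll(psf, (-2, -2), axis=(0, 1)))  # center the PSF
b = np.real(np.fft.ifft2(otf * np.fft.fft2(x_true)))
b += 0.01 * rng.standard_normal(b.shape)                # additive noise

# Trial and error: restore for each alpha on a logarithmic grid and keep
# the value giving the highest PSNR against the true image.
alphas = [10.0 ** e for e in range(-6, 1)]
best = max(alphas, key=lambda a: psnr(tikhonov_restore(b, otf, a), x_true))
```

In practice the true image is unknown, so this oracle-style selection is only available in simulated tests such as the ones reported here.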
6 Conclusions

In this paper we have considered iterative methods for image restoration and we have proposed a preconditioning technique which preserves the structure of the blurring matrix, a structure determined by the BCs imposed in the problem. The presented preconditioning strategy represents a generalization and an improvement with respect to both the circulant [8,19] and the structured preconditioning available in the literature [9,18,25]. Moreover, the proposal has been further extended to provide a non-stationary preconditioning in the same spirit of a recent proposal for BCCB matrices [11]. In particular, the proposed new structured preconditioning gives rise to more accurate restorations and shows higher stability and robustness with respect to the BCCB approach, since the iterative methods compute good results even when the regularization parameters are very small.

We want to underline that in the present paper several ideas, such as the Z variant [8] and non-stationary preconditioning [11], are combined and analysed from both the theoretical and the application viewpoint. Moreover, the algorithm proposed for creating the structure preserving preconditioner is based on an idea which is powerful in its generality: in fact, it can be applied to any BCs (not only Reflective or Anti-Reflective ones), any filter (not only the Tikhonov one) and any trigonometric matrix algebra.

Concerning future research lines, we mention that, in order to preserve edges or to enforce sparsity in a certain basis (usually a wavelet basis) in the restored image, regularization terms that lead to non-linear problems are usually employed. Nevertheless, the resulting numerical methods usually require the solution of a regularized least-squares problem, e.g. in the linearized Bregman splitting [5,22].
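To make the remark on generality concrete, the BCCB instance of this construction (eigenvalues from one FFT of the PSF, a Tikhonov filter, two FFTs per application) can be sketched as follows. This is an illustrative Python sketch under Periodic BCs with hypothetical names, not the paper's structured algorithm.

```python
import numpy as np

def bccb_preconditioner(psf, alpha):
    # Eigenvalues of the BCCB blurring matrix: one FFT of the (wrapped) PSF.
    s = np.fft.fft2(psf)
    # Tikhonov filter conj(s) / (|s|^2 + alpha): the small eigenvalues are
    # damped rather than inverted, which keeps the preconditioner
    # regularizing instead of amplifying noise.
    filt = np.conj(s) / (np.abs(s) ** 2 + alpha)
    def apply(r):
        # Each application costs two FFTs, like the blurring matrix itself.
        return np.real(np.fft.ifft2(filt * np.fft.fft2(r)))
    return apply

# Hypothetical PSF with a nonvanishing spectrum, stored wrap-around style.
psf = np.zeros((16, 16))
psf[0, 0] = 0.6
psf[0, 1] = psf[1, 0] = psf[0, -1] = psf[-1, 0] = 0.1
x = np.random.default_rng(1).standard_normal((16, 16))
blurred = np.real(np.fft.ifft2(np.fft.fft2(psf) * np.fft.fft2(x)))
precond = bccb_preconditioner(psf, 1e-10)  # tiny alpha: almost an inverse
```

A larger α trades accuracy on the small eigenvalues for stability; the structured variants of the paper follow the same scheme with the transform of the chosen algebra (e.g. the DCT for Reflective BCs) in place of the FFT.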
Therefore, improvements in classical least-squares methods can be useful also for these more computationally challenging models, as shown in [4], where, among other strategies, the reblurring preconditioner (see [8]) and its non-stationary variants (see [11]) have been adapted to be included in the linearized Bregman splitting for the synthesis approach proposed in [5]. The structure preserving preconditioners proposed in this paper could be similarly considered to improve such numerical methods. This will be the subject of further studies.

Acknowledgements This work is partly supported by PRIN 2012 N. 2012MTE38N and GNCS-INdAM.
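For reference, the linearized Bregman iteration that such adaptations start from can be sketched as below. This is a toy Python sketch with hypothetical names: the operator A is a random orthogonal matrix rather than a blurring matrix, and the slot Z simply defaults to A^T (the reblurring idea replaces it with a regularized approximate inverse).

```python
import numpy as np

def shrink(x, mu):
    # Soft-thresholding, the proximal operator of the l1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - mu, 0.0)

def linearized_bregman(A, b, mu, delta=1.0, iters=50, Z=None):
    # Linearized Bregman iteration for min mu*||u||_1 (plus a small l2
    # term) subject to A u = b. Z is where a regularizing preconditioner
    # would act; Z = A.T recovers the plain iteration.
    if Z is None:
        Z = A.T
    v = np.zeros(A.shape[1])
    u = np.zeros(A.shape[1])
    for _ in range(iters):
        v = v + Z @ (b - A @ u)    # (preconditioned) residual update
        u = delta * shrink(v, mu)  # thresholding step
    return u

# Toy problem with a random orthogonal A and a sparse signal.
rng = np.random.default_rng(2)
A, _ = np.linalg.qr(rng.standard_normal((20, 20)))
u_true = np.zeros(20)
u_true[3], u_true[11] = 2.0, -1.5
u_rec = linearized_bregman(A, A @ u_true, mu=0.1)
```

In deblurring, A is severely ill-conditioned and each iteration amounts to a least-squares step, which is exactly where the preconditioners discussed in this paper can accelerate convergence.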
References

1. Almeida, M., Figueiredo, M.: Deconvolving images with unknown boundaries using the alternating direction method of multipliers. IEEE Trans. Image Process. 22(8), 3084–3096 (2013)
2. Aricò, A., Donatelli, M., Serra Capizzano, S.: Spectral analysis of the anti-reflective algebra. Linear Algebra Appl. 428, 657–675 (2008)
3. Bai, Z.-J., Cassani, D., Donatelli, M., Serra Capizzano, S.: A fast alternating minimization algorithm for total variation deblurring without boundary artifacts. J. Math. Anal. Appl. 415, 373–393 (2014)
4. Cai, Y., Bianchi, D., Donatelli, M., Huang, T.Z.: Regularization preconditioners for frame-based image deblurring with reduced boundary artifacts. SIAM J. Sci. Comput. 38(1), B164–B189 (2016)
5. Cai, J.F., Osher, S., Shen, Z.: Linearized Bregman iterations for frame-based image deblurring. SIAM J. Imag. Sci. 2(1), 226–252 (2009)
6. Chung, J., Kilmer, M., O'Leary, D.P.: A framework for regularization via operator approximation. SIAM J. Sci. Comput. 37(2), B332–B359 (2015)
7. Davis, P.J.: Circulant Matrices. Wiley, New York (1979)
8. Dell'Acqua, P., Donatelli, M., Estatico, C.: Preconditioners for image restoration by reblurring techniques. J. Comput. Appl. Math. 272, 313–333 (2013)
9. Dell'Acqua, P., Donatelli, M., Serra Capizzano, S., Sesana, D., Tablino Possio, C.: Optimal preconditioning for image deblurring with Anti-Reflective boundary conditions. Linear Algebra Appl. 502, 159–185 (2016)
10. Donatelli, M.: Fast transforms for high order boundary conditions in deconvolution problems. BIT 50(3), 559–576 (2010)
11. Donatelli, M., Hanke, M.: Fast nonstationary preconditioned iterative methods for ill-posed problems, with application to image deblurring. Inverse Probl. 29, 095008 (2013)
12. Donatelli, M., Serra Capizzano, S.: Anti-reflective boundary conditions and re-blurring. Inverse Probl. 21(1), 169–182 (2005)
13. Donatelli, M., Serra Capizzano, S.: Antireflective boundary conditions for deblurring problems. J. Electr. Comput. Eng. (2010), Article ID 241467, 18 pages (survey)
14. Estatico, C.: A classification scheme for regularizing preconditioners, with application to Toeplitz systems. Linear Algebra Appl. 397, 107–131 (2005)
15. Estatico, C.: Regularization processes for real functions and ill-posed Toeplitz problems. In: Kaashoek, M., van der Mee, C., Seatzu, S. (eds.) Operator Theory: Advances and Applications, vol. 160, pp. 161–178. Birkhäuser Verlag, Basel (2005)
16. Fan, Y., Nagy, J.: Synthetic boundary conditions for image deblurring. Linear Algebra Appl. 434(11), 2244–2268 (2011)
17. Gray, R.M.: Toeplitz and circulant matrices: a review. Found. Trends Commun. Inf. Theory 2(3), 155–239 (2006)
18. Hanke, M., Nagy, J.: Inverse Toeplitz preconditioners for ill-posed problems. Linear Algebra Appl. 284, 137–156 (1998)
19. Hanke, M., Nagy, J., Plemmons, R.: Preconditioned iterative regularization for ill-posed problems. In: Numerical Linear Algebra, Proceedings of the Conference in Numerical Linear Algebra and Scientific Computation, Kent, 13–14 March 1992, pp. 141–163. de Gruyter (1993)
20. Hansen, P.C.: Rank-Deficient and Discrete Ill-Posed Problems. SIAM, Philadelphia (1998)
21. Hansen, P.C., Nagy, J., O'Leary, D.P.: Deblurring Images: Matrices, Spectra, and Filtering. SIAM, Philadelphia (2005)
22. Huang, J., Donatelli, M., Chan, R.: Nonstationary iterated thresholding algorithms for image deblurring. Inverse Probl. Imaging 7(3), 717–736 (2013)
23. Ma, L., Yu, J., Zeng, T.: Sparse representation prior and total variation-based image deblurring under impulse noise. SIAM J. Imaging Sci. 6(4), 2258–2284 (2013)
24. Ma, L., Zeng, T.: Image deblurring via total variation based structured sparse model selection. J. Sci. Comput. 67(1), 1–19 (2016)
25. Ng, M., Chan, R., Tang, W.C.: A fast algorithm for deblurring models with Neumann boundary conditions. SIAM J. Sci. Comput. 21, 851–866 (1999)
26. Piana, M., Bertero, M.: Projected Landweber method and preconditioning. Inverse Probl. 13(2), 441–463 (1997)
27. Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D Nonlinear Phenom. 60(1), 259–268 (1992)
28. Serra Capizzano, S.: Preconditioning strategies for asymptotically ill-conditioned block Toeplitz systems. BIT 34(4), 579–594 (1994)
29. Serra Capizzano, S.: A note on anti-reflective boundary conditions and fast deblurring models. SIAM J. Sci. Comput. 25(3), 1307–1325 (2003)
30. Serra Capizzano, S., Tyrtyshnikov, E.: Any circulant-like preconditioner for multilevel matrices is not superlinear. SIAM J. Matrix Anal. Appl. 21(2), 431–439 (1999)
31. Zhou, X., Zhou, F., Bai, X., Xue, B.: A boundary condition based deconvolution framework for image deblurring. J. Comput. Appl. Math. 261, 14–29 (2014)