ISSN 0005-1179, Automation and Remote Control, 2010, Vol. 71, No. 2, pp. 339–351. © Pleiades Publishing, Ltd., 2010. Original Russian Text © M.A. Scherbakov, 2010, published in Avtomatika i Telemekhanika, 2010, No. 2, pp. 179–191.
ESTIMATION AND FILTERING
Designing Pareto Optimal Nonlinear Filters for Image Processing

M. A. Scherbakov

Penza State University, Penza, Russia
Received March 2, 2009
Abstract—We consider a method of designing multidimensional polynomial filters characterized by a truncated discrete functional Volterra series. This method allows one to design nonlinear filters that are Pareto optimal with respect to several criteria. We prove that finding the set of Pareto optimal alternatives is equivalent to minimizing a weighted target function. The designed nonlinear filter is characterized by the plane tangent to a convex optimal solving function, whose curvature is determined by how contradictory the given criteria are. We show examples of polynomial filter design for image processing and compare them with known filters of the same class. DOI: 10.1134/S000511791002013X
1. INTRODUCTION

The present paper studies polynomial filters [1–3] whose output signal can be represented as a truncated discrete Volterra series. The use of nonlinear filters of this class is held back by the lack of theoretically sound design techniques. In [4–7], the authors presented special-case techniques for quadratic and cubic Volterra filters based on several plausible assumptions. These results allowed for a deeper understanding of polynomial filters and for constructing operators that were simple yet efficient enough for real-life image processing tasks. Another well-known approach [8] is based on optimization methods that make it possible to design nonlinear filters minimizing a certain target function, which characterizes how close the designed filter is to the ideal. This approach, although it yields filter coefficients that are optimal for a certain criterion, leaves out their physical meaning and does not answer any questions about the inner structure and properties of the designed filter. Besides, the optimized filter is optimal only with respect to one selected criterion and may turn out to be inefficient if the problem setting changes even slightly. In practice, designing nonlinear filters for image processing usually forces one to consider a number of contradictory requirements, i.e., in effect to solve a multicriteria optimization problem. For instance, when enhancing image quality one should reduce the noise but keep the image's small details intact. The present paper considers a way to design multidimensional digital polynomial filters (discrete Volterra filters) that are optimal with respect to several criteria. Our method results in Pareto optimal filter coefficients and can be applied to filters of arbitrary order and dimension. We give the necessary theoretical proofs and experimental results that demonstrate the efficiency of our method.
2. VECTOR REPRESENTATION OF POLYNOMIAL FILTERS

A multidimensional polynomial filter is characterized by a truncated multidimensional discrete Volterra series [2]

y(n) = h_0 + \sum_{m=1}^{M} H_m[x(n)],   (1)
where n is a vector [n_1 n_2 ... n_r], x(n) and y(n) denote the r-dimensional input and output signals, respectively, h_0 is the constant term, and H_m[x(n)] is a homogeneous polynomial filter of order m of the form

y_m(n) = H_m[x(n)] = \sum_{n_1, ..., n_m \in \Omega_r} h_m(n_1, ..., n_m) \prod_{j=1}^{m} x(n - n_j).   (2)

Here the summation over the vector arguments n_j runs over the filter support, an r-dimensional lattice of the form

\Omega_r = \{(n_1, ..., n_r) : 0 \le n_j \le N_j - 1;\ j = 1, ..., r\}.   (3)
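As a sketch of how a truncated series of the form (1)–(2) is evaluated directly, consider the quadratic case M = 2 in two dimensions. The function name and the "valid"-boundary, window-indexing convention below are our illustrative assumptions, not part of the paper:

```python
import numpy as np

def volterra2_filter(x, h0, h1, h2):
    """Direct evaluation of a truncated (M = 2) 2-D Volterra series.

    x  : 2-D input image (float array)
    h0 : constant term
    h1 : linear kernel, shape (N1, N2)
    h2 : quadratic kernel, shape (N1, N2, N1, N2)
    Only positions where the full support fits are computed ("valid" mode).
    """
    N1, N2 = h1.shape
    K1, K2 = x.shape
    y = np.full((K1 - N1 + 1, K2 - N2 + 1), h0, dtype=float)
    for k1 in range(y.shape[0]):
        for k2 in range(y.shape[1]):
            w = x[k1:k1 + N1, k2:k2 + N2]                      # shifted window
            y[k1, k2] += np.sum(h1 * w)                        # first-order term H1
            y[k1, k2] += np.sum(h2 * np.multiply.outer(w, w))  # quadratic term H2
    return y
```

With h2 = 0 and h1 a unit impulse the filter reduces to the identity on the valid region, which is a quick sanity check of the indexing.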
Note that for m = 1 the kernel h_1(n) is the usual impulse response of a multidimensional linear filter, while the kernels h_m(n_1, ..., n_m), m = 2, ..., M, can be thought of as higher-order impulse responses that characterize the nonlinear properties of a polynomial filter. To represent expression (2) in a more compact vector form, we define a lexicographic ordering of the elements of lattice (3) into a one-dimensional sequence with index i. To do so, we use a mixed-radix notation in which the number i is represented as

i = n_r + \sum_{j=1}^{r-1} n_j N_{j+1} N_{j+2} \cdots N_r.   (4)
This expression sets up a bijective correspondence between the components of a vector n defined over the r-dimensional lattice (3) and the values of i lying in the interval from zero to N - 1, where N = N_1 × ... × N_r. Mapping, by (4), each vector argument n_j, j = 1, ..., m, into an index i_j, we represent the kernel h_m(n_1, ..., n_m) as a function h_m(i_1, ..., i_m) of scalar arguments, and the input signal x(n - n_j) as x_n(i_j). As a result, the r-dimensional nonlinear filter (2), whose impulse response has vector arguments, is transformed into an equivalent one-dimensional nonlinear filter with impulse response h_m(i_1, ..., i_m) of scalar arguments:

y_m(n) = H_m[x(n)] = \sum_{i_1=0}^{N-1} \cdots \sum_{i_m=0}^{N-1} h_m(i_1, ..., i_m) \prod_{j=1}^{m} x_n(i_j).   (5)

The domain of the impulse response h_m(i_1, ..., i_m) is now an m-dimensional lattice of the form

\Omega_m = \{(i_1, ..., i_m) : 0 \le i_j \le N - 1;\ j = 1, ..., m\}.   (6)
The elements of this multidimensional region may also be ordered into a one-dimensional sequence via the mapping

I_m = \sum_{j=1}^{m} i_j N^{m-j}.   (7)
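Both orderings are mixed-radix (Horner-style) positional encodings, so each can be computed in a single pass over the indices. The helper names below are ours; the functions implement (4) and (7) under the row-major convention of (4):

```python
def lex_index(n, dims):
    """Lexicographic index of eq. (4): maps n = (n_1, ..., n_r) on a lattice
    with sides dims = (N_1, ..., N_r) to a scalar i in {0, ..., N_1*...*N_r - 1}."""
    i = 0
    for nk, Nk in zip(n, dims):
        i = i * Nk + nk   # Horner step: i = n_1 N_2...N_r + ... + n_r
    return i

def kernel_index(idx, N):
    """Scalar position of eq. (7): I_m = sum_j i_j * N**(m - j)
    for the index tuple idx = (i_1, ..., i_m)."""
    I = 0
    for ij in idx:
        I = I * N + ij
    return I
```

For instance, on a 2 × 3 lattice the vector (1, 2) maps to i = 1·3 + 2 = 5, and for N = 6 the pair (i_1, i_2) = (1, 5) maps to I_2 = 1·6 + 5 = 11.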
Thus, by successive application of transformations (4) and (7), the kernel components h_m(n_1, ..., n_m) can be ordered into a vector h_m. The position I_m of a component of h_m is related to the multidimensional kernel arguments by

I_m = \sum_{j=1}^{m} N^{m-j} \sum_{i=1}^{r} n_{ji} S_i,   (8)

where S_i = N_{i+1} N_{i+2} × ... × N_r (i < r), S_r = 1, and n_{ji} are the components of the vector n_j. For example, the value h_2(0, 1, 2, 1) of the quadratic kernel of a two-dimensional filter defined on a 2 × 3 region is first mapped, via (4) applied to the spatial coordinates, into h_2(1, 5), and then, via (7) applied to the remaining two coordinates (which define a nonlinear interaction), into the value h_2(11). The same result is obtained by applying (8) directly.
To rewrite (5) in matrix form, we introduce the input signal vector

x_n^T = [x_n(0)\ x_n(1)\ ...\ x_n(N - 1)].

Then, by the properties of the Kronecker power, the homogeneous filter (5) admits the vector representation

y_m(n) = h_m^T x_n^{(m)},   (9)
where the vector h_m of filter coefficients contains the values of the nonlinear impulse response h_m(n_1, ..., n_m) ordered by (8). Without loss of generality, the kernels h_m(i_1, ..., i_m), m > 1, which multiply commutative products of input samples, can be taken to be symmetric functions of the arguments i_1, ..., i_m. Consequently, it suffices to keep in (5) only the factors corresponding to distinct combinations of the arguments i_1, ..., i_m. By (4), each vector n_j is uniquely mapped into an index i_j, which can be thought of as a mixed-radix number. This allows us to introduce an ordering and select only those combinations of vector arguments for which i_1 ≤ i_2 ≤ ... ≤ i_m. The number of factors in the equivalent representation (5) then equals the number of combinations with repetitions of m elements out of N, and the total number of coefficients of the nonlinear filter (1) is

L_M = \sum_{m=0}^{M} C_{N+m-1}^{m} = C_{N+M}^{M},

where C_n^m denotes the binomial coefficient "n choose m." Thus, a multidimensional polynomial filter defined by the truncated series (1) can be characterized by an L_M × 1 coefficient vector

h = [h_0 \vdots h_1^T \vdots h_2^T \vdots ... \vdots h_M^T]^T,
which is composed of the vectors h_m corresponding to nonlinear terms of the various orders. Here the components of h_m correspond to distinct combinations of the indices i_1, ..., i_m and equal P(i_1, ..., i_m) h_m(i_1, ..., i_m). The coefficient P(i_1, ..., i_m) is the number of distinct permutations of the indices (i_1, ..., i_m) and equals m!/(s_1! ... s_k!), where s_i is the number of elements in the ith group of equal indices; e.g., P(1, 1, 2, 2, 2) = 5!/(2! 3!) = 10. Similarly, we form the vector of products of input signal samples

X_n = [1 \vdots \bar{x}_n^T \vdots \bar{x}_n^{(2)T} \vdots ... \vdots \bar{x}_n^{(M)T}]^T,

where the elements of the Kronecker power x_n^{(m)} are the products x_n(i_1) × ... × x_n(i_m), and the bar denotes the operation of keeping only the combinations of indices with i_1 ≤ ... ≤ i_m. Thus, a multidimensional polynomial filter defined by the finite series (1) can be represented in the simple vector form

y(n) = h^T X_n,   (10)

which is linear in the vector h containing C_{N+M}^{M} coefficients.
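The reduced product vector and the permutation multiplicities P(i_1, ..., i_m) can be sketched as follows (function names are our illustrative assumptions; combinations with repetition enumerate exactly the tuples with i_1 ≤ ... ≤ i_m):

```python
import numpy as np
from itertools import combinations_with_replacement
from math import factorial
from collections import Counter

def product_vector(x, M):
    """Reduced product vector X_n of eq. (10) for a window x of length N:
    one entry per combination i_1 <= ... <= i_m, for m = 0, ..., M."""
    out = [1.0]                                   # m = 0 term
    for m in range(1, M + 1):
        for idx in combinations_with_replacement(range(len(x)), m):
            out.append(np.prod([x[i] for i in idx]))
    return np.array(out)

def multiplicity(idx):
    """P(i_1, ..., i_m) = m! / (s_1! ... s_k!): the number of distinct
    permutations of the index tuple (folds a symmetric kernel into h)."""
    p = factorial(len(idx))
    for s in Counter(idx).values():
        p //= factorial(s)
    return p
```

A filter output is then simply `h @ product_vector(window, M)`, where each component of `h` already carries its multiplicity P(i_1, ..., i_m). The vector length for N = 2, M = 2 is C_{4}^{2} = 6, matching the count in the text.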
3. SYMMETRY PROPERTIES OF A TWO-DIMENSIONAL NONLINEAR IMAGE PROCESSING FILTER

When designing two-dimensional nonlinear filters for image processing, one usually additionally requires that the constructed operator be isotropic with respect to changes in the orientation of the original image [5, 9]. In this practically important case, the number of filter coefficients is reduced substantially. The isotropy condition usually reduces to the requirement that the result be invariant under rotations by multiples of 90° and under mirroring with respect to the vertical axis. We denote these transformations by rot_α and ref, respectively, where α is the rotation angle. Note that mirroring with respect to the horizontal axis can be represented as the composition of rot_90 and ref.
Table 1. A quadratic isotropic filter with a 3 × 3 mask (the window samples x_0, ..., x_8 are numbered in the lexicographic order of (4))

no.  Order k  Coefficient vector h  Size of the group  Input vector x
 0      0     h_0                          1           1
 1      1     h_1(0)                       4           x_0 + x_2 + x_6 + x_8
 2      1     h_1(1)                       4           x_1 + x_3 + x_5 + x_7
 3      1     h_1(4)                       1           x_4
 4      2     h_2(0, 0)                    4           x_0 x_0 + x_2 x_2 + x_6 x_6 + x_8 x_8
 5      2     h_2(0, 1)                   16           2[x_0(x_1 + x_3) + x_2(x_1 + x_5) + x_6(x_3 + x_7) + x_8(x_5 + x_7)]
 6      2     h_2(0, 2)                    8           2(x_2 + x_6)(x_0 + x_8)
 7      2     h_2(0, 4)                    8           2x_4(x_0 + x_2 + x_6 + x_8)
 8      2     h_2(0, 5)                   16           2[x_0(x_5 + x_7) + x_2(x_3 + x_7) + x_6(x_1 + x_5) + x_8(x_1 + x_3)]
 9      2     h_2(0, 8)                    4           2(x_0 x_8 + x_2 x_6)
10      2     h_2(1, 1)                    4           x_1 x_1 + x_3 x_3 + x_5 x_5 + x_7 x_7
11      2     h_2(1, 3)                    8           2(x_1 + x_7)(x_3 + x_5)
12      2     h_2(1, 4)                    8           2x_4(x_1 + x_3 + x_5 + x_7)
13      2     h_2(1, 7)                    4           2(x_1 x_7 + x_3 x_5)
14      2     h_2(4, 4)                    1           x_4 x_4
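The groups of equal coefficients in Table 1 can be generated numerically as orbits of index tuples under the mask symmetries. The sketch below assumes the row-major indexing of (4) and uses the scalar-index transforms rot_90 and ref given in the text (the function name `orbit` is ours); it returns unordered tuples, so for m = 2 the group sizes in Table 1, which count ordered pairs, are twice the orbit size when the two indices differ:

```python
def orbit(idx, N=3):
    """Orbit of the index tuple idx under rotations by multiples of 90 degrees
    and reflection, acting elementwise on scalar indices of an N x N mask."""
    rot90 = lambda i: N * (1 + i % N) - i // N - 1
    ref   = lambda i: i + N - 1 - 2 * (i % N)
    group, seen = {tuple(sorted(idx))}, None
    # close the set under the two generators (they generate all 8 symmetries)
    while seen != group:
        seen = set(group)
        for t in seen:
            group.add(tuple(sorted(rot90(i) for i in t)))
            group.add(tuple(sorted(ref(i) for i in t)))
    return group
```

For example, `orbit((0,))` returns the four corners {0, 2, 6, 8}, matching the group size 4 of h_1(0), while the center `orbit((4,))` is a singleton.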
Suppose, for instance, that the filtering is done in a two-dimensional region Ω_2 defined by (3) for N_1 = N_2 = N. Then, under the various transformations of the region Ω_2, the index i defining the lexicographic ordering of the vector [n_1 n_2] is transformed into i' as follows:

rot_90 : i' = N(1 + i mod N) - ⌊i/N⌋ - 1;   rot_180 : i' = N^2 - i - 1;
rot_270 : i' = N(N - 1 - i mod N) + ⌊i/N⌋;   ref : i' = i + N - 1 - 2(i mod N),

where ⌊a⌋ denotes the greatest integer not exceeding a. As a result, the coefficients h_m(i_1, ..., i_m) of an isotropic nonlinear operator split into groups of equal coefficients. To form such a group, it suffices to take an arbitrary coefficient, say h_2(0, 1), and generate the other coefficients of its group with the transformations above. For N = 3, for instance, h_2(0, 1) generates the following equal coefficients: h_2(2, 5), h_2(7, 8), h_2(3, 6), h_2(1, 2), h_2(0, 3), h_2(5, 8), h_2(6, 7). Table 1 describes a quadratic filter with a 3 × 3 mask, which will be used in the subsequent demonstrations of our method. As Table 1 shows, this operator is characterized by only 15 coefficients, as compared with 91 coefficients in the general case.

4. SOLVING THE OPTIMAL POLYNOMIAL FILTERING PROBLEM

In the general case, the mathematical formulation of an optimal filtering problem is to find the coefficient vector h_opt that minimizes some quality functional:

f(h_opt) = \min_{h \in G} f(h),   (11)

h_opt ∈ G = \{h ∈ R^n : g(h) = 0\},   (12)

where f(·) is the target function mapping G to R, and g(·) is the vector constraint function. The properties of the designed filter are determined by the choice of the target function in (11). The most widely used target function is the mean squared error, which characterizes how much the output signal y(n) of the designed filter differs from the ideal output y*(n). This choice is also justified by the fact that the output of the nonlinear filter (10) is linear in its coefficients, which allows the use of efficient algorithms for minimizing quadratic functions.
In this case, function (11) takes the form

f(h_opt) = \min_{h \in G} \frac{1}{|\Pi_r|} \sum_{n \in \Pi_r} \left( h^T X_n - y^*(n) \right)^2,   (13)

where Π_r is the domain of the output realization of the r-dimensional signal,

\Pi_r = \{n = (n_1, ..., n_r) : 0 \le n_i \le K_i - 1;\ i = 1, ..., r\},

and |Π_r| is the number of its elements. One should distinguish Π_r from the filter support Ω_r of the form (3). The constraint function in (12) is determined by the specific features of the designed filter. In particular, for quadratic filtering the requirement of keeping a constant brightness level in a uniform region of the image leads to the linear constraints

h_0 = 0,   \sum_{i_1} h_1(i_1) = 1,   \sum_{i_1} \sum_{i_2} h_2(i_1, i_2) = 0,   (14)
which can be represented in matrix form as

A h = b.   (15)

For the quadratic filter with a 3 × 3 mask described in Table 1, A and b take the values

A = [ 0  4  4  1  0   0  0  0   0  0  0  0  0  0  0
      0  0  0  0  4  16  8  8  16  4  4  8  8  4  1 ],   b^T = [1  0],   (16)

where the columns of A follow the ordering of the coefficients in Table 1, with the group sizes as weights (the condition h_0 = 0 is imposed directly).
The target function can also be expressed in matrix form:

f(h) = h^T R h - 2 h^T r + d,   (17)

where R is the autocorrelation matrix of the input sample products,

R = \frac{1}{|\Pi_r|} \sum_{n \in \Pi_r} X_n X_n^T,

r is the vector of cross-correlations between the desired output and the products of input samples,

r = \frac{1}{|\Pi_r|} \sum_{n \in \Pi_r} y^*(n) X_n,

and d is the mean squared value of the desired output signal of the filter,

d = \frac{1}{|\Pi_r|} \sum_{n \in \Pi_r} y^{*2}(n).
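Minimizing the quadratic form (17) under the linear constraints (15) can be carried out numerically with the null-space method. A minimal numpy sketch (the function name is ours; the null-space basis Z is taken from the SVD of A):

```python
import numpy as np

def nullspace_solve(R, r, A, b):
    """Constrained minimizer of f(h) = h'Rh - 2h'r + d subject to Ah = b,
    via the null-space method with Z a basis of ker A (AZ = 0)."""
    # particular solution of Ah = b as the starting point h[0]
    h0 = np.linalg.lstsq(A, b, rcond=None)[0]
    # columns of Z span the null space of A, obtained from its SVD
    _, s, vt = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-12))
    Z = vt[rank:].T
    # restricted normal equations: (Z'RZ) v = Z'(r - R h0)
    v = np.linalg.solve(Z.T @ R @ Z, Z.T @ (r - R @ h0))
    return h0 + Z @ v
```

For instance, minimizing ||h||² (R = I, r = 0) subject to h_1 + h_2 = 1 yields h = [0.5, 0.5], and any returned vector satisfies Ah = b exactly up to rounding.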
Thus, to design a polynomial filter that is optimal with respect to a single criterion, we have to minimize the quadratic function (17) on the linear subspace defined by constraint (15). The solution of this problem is well known [10] and is reachable from any initial point h[0] in one step:

h_opt = h[0] + Z (Z^T R Z)^{-1} Z^T (r - R h[0]),   (18)

where Z is a matrix whose columns form a basis of the null space of the rows of the constraint matrix A, i.e., satisfy the condition AZ = 0.

5. MULTICRITERIA SYNTHESIS OF OPTIMAL POLYNOMIAL FILTERS

In many applications, the desired digital filter should satisfy several requirements at once, and these requirements are often contradictory. For example, a classical problem of image processing is
to increase the sharpness of an image. A simple linear high-frequency filter often leads to an undesirable amplification of the noise components of the image. Thus, in this case the problem is to design a nonlinear filter that sharpens an image without a substantial increase in noise.

Consider the problem of designing polynomial filters by several criteria defined by target functions f_j(h), j = 1, 2, ..., J, of the form (17), which have to be optimized jointly under the constraints (15). To do so, we use an approach to multicriteria optimization based on the theoretical results of [11–13]. We call f_j(h) the jth criterion for an alternative h. We consider an alternative h better than an alternative d with respect to a criterion f_j if f_j(h) < f_j(d), and equivalent to it with respect to f_j if f_j(h) = f_j(d).

Definition 1. An alternative h is better than an alternative d with respect to a system of criteria {f_1, f_2, ..., f_J} if f_j(h) ≤ f_j(d) for all j and f_j(h) < f_j(d) for at least one j.

It is usually impossible to find an alternative h that is optimal with respect to every criterion f_j. However, one can look for the so-called Pareto optimal alternatives [11–13].

Definition 2. An alternative h is called Pareto optimal for a system of criteria f_j, j = 1, 2, ..., J, if there is no alternative d that is better in the sense of Definition 1. In other words, for each d ≠ h either there exists an index j such that f_j(d) > f_j(h), or f_j(d) = f_j(h) for all j.

By P(f_1, f_2, ..., f_J) we denote the set of Pareto optimal alternatives for the system of criteria {f_1, f_2, ..., f_J}. Note that the initial problem of finding the set P(f_1, f_2, ..., f_J) under the linear constraints (15) can be reduced to an equivalent unconstrained problem.
Indeed, if the matrix A consists of p linearly independent rows, then the minimization problem in the n-dimensional space can always be restated as a problem on the (n - p)-dimensional subspace that is the null space of A. Therefore, the theorems stated below, which form the theoretical foundation of Pareto optimal filtering, are formulated for the unconstrained optimization problem.

To find the set P(f_1, f_2, ..., f_J) of all Pareto optimal alternatives for the system of criteria {f_1, f_2, ..., f_J}, we can use the well-known technique of scalarizing a vector criterion. To do so, we introduce a vector λ = [λ_1 λ_2 ... λ_J] of nonnegative weights such that λ_1 + λ_2 + ... + λ_J = 1. These vectors form a compact simplex Λ ⊂ R^J. We also define the function

F_λ(h) = \sum_{j=1}^{J} λ_j f_j(h),   (19)
which we call the weighted criterion. Consider the set S = {h_0 : F_λ(h_0) = \min_h F_λ(h), h ∈ R^n, λ ∈ Λ} of minima of the weighted criterion F_λ(h).

Theorem 1. Suppose that the individual criteria f_i : R^n → R (i = 1, 2, ..., J) are differentiable functions and that for every λ ∈ Λ the weighted criterion F_λ has a global minimum attained at the unique point where grad F_λ = 0. Then S = P(f_1, f_2, ..., f_J).

The proof of Theorem 1 is given in the Appendix. Note that for quadratic functions of the form (17) the conditions of Theorem 1 hold, because the correlation matrix R is positive definite.

Theorem 2. Let P({F_λ : λ ∈ Λ}) be the set of Pareto optimal alternatives for the family of weighted criteria F_λ, λ ∈ Λ. Then P({F_λ : λ ∈ Λ}) = P(f_1, f_2, ..., f_J).
The proof of Theorem 2 is given in the Appendix. Similar results in a somewhat different formulation were obtained in [12, 13]. Thus, the problems of finding the best alternatives for the system of individual criteria f_j, j = 1, 2, ..., J, and for the family of weighted criteria F_λ, λ ∈ Λ, are equivalent.

Fix some b ∈ Λ and let h*_b be the minimum point of the weighted criterion for this b. Then, by Theorem 1, the point h*_b is a Pareto optimal alternative for the system of criteria {f_1, f_2, ..., f_J}. By Theorem 2, this alternative is also optimal for the family of weighted criteria F_λ, λ ∈ Λ. On this alternative the weighted criteria take the values given by the linear relation

p_b(λ) = p_b(λ_1, λ_2, ..., λ_J) = F_λ(h*_b) = \sum_{j=1}^{J} λ_j f_j(h*_b),   (20)

which characterizes the behavior of the nonlinear filter with the optimal coefficient vector h*_b on the compact simplex Λ. We call this function the filter plane for h*_b. Consider also the function of a vector argument

w(λ) = w(λ_1, λ_2, ..., λ_J) = F_λ(h*_λ),   (21)

which shows how the minimum of the weighted criterion (19) changes with the parameters λ_1, λ_2, ..., λ_J. We call this function the optimal solving function. By the well-known facts of convex analysis concerning the minimum of a family of linear functions, the optimal solving function w(λ) is convex upward (concave), and the filter plane p_b(λ) is tangent to it at the point b, i.e., w(λ) ≤ p_b(λ) everywhere on the compact simplex Λ and w(b) = p_b(b). Note that the degree of curvature of the optimal solving function w(λ) can be interpreted as a measure of how contradictory the criteria f_j, j = 1, 2, ..., J, used for designing the filter are.

When designing an optimal filter by several criteria, one should aim for the filter plane to approximate the optimal solving function as well as possible. Obviously, the curved surface of the optimal solving function cannot be approximated well by a linear function for all λ ∈ Λ. However, one can balance the errors by choosing a suitable λ (tangent point), for example by solving the minimax problem

\min_{β ∈ Λ} \max_{λ ∈ Λ} \{p_β(λ) - w(λ)\}.   (22)
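For two criteria the construction above amounts to sweeping λ, solving one quadratic problem per value, and recording w(λ). A sketch for the unconstrained case, with the constant terms d_j of (17) omitted and the function name ours:

```python
import numpy as np

def pareto_tradeoff(R1, r1, R2, r2, lambdas):
    """Two-criteria sweep: for each lam, minimize the weighted criterion
    F_lam(h) = (1 - lam) f1(h) + lam f2(h), f_j(h) = h'R_j h - 2 h'r_j.
    Returns the Pareto optimal minimizers h*_lam and w(lam) = F_lam(h*_lam)."""
    H, w = [], []
    for lam in lambdas:
        R = (1 - lam) * R1 + lam * R2
        r = (1 - lam) * r1 + lam * r2
        h = np.linalg.solve(R, r)          # grad F_lam = 2(Rh - r) = 0
        H.append(h)
        w.append(h @ R @ h - 2 * h @ r)    # value of the weighted criterion
    return np.array(H), np.array(w)
```

Each filter plane is then p_b(λ) = (1 − λ) f_1(h*_b) + λ f_2(h*_b), and a tangent point balancing the approximation error in the sense of (22) can be picked by a grid search over the recorded λ values.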
In practice, one often needs to strengthen or weaken certain quality criteria. The presented approach allows the filter behavior to be adapted easily by choosing the corresponding weights λ_1, λ_2, ..., λ_J while preserving its optimality properties.

6. EXPERIMENTAL RESULTS

As already noted, the proposed filter design method has no theoretical restrictions. In our experiments, however, we studied the quadratic isotropic filter with a 3 × 3 mask (see Table 1); this choice lets us compare our results with the known results of [4, 7]. To present the results in an illustrative graphical form, we restrict ourselves to designing a filter by two criteria.

Consider, for example, the problem of reducing an image's noise while preserving its sharpness, i.e., without losing the high-frequency components. This can be treated as a two-criteria problem of designing a nonlinear filter. The first criterion is to provide substantial noise reduction in uniform regions of the image, while the second is to preserve or even enhance the sharpness of the image's details in saturated regions. To assess the filter's behavior by the first criterion, we used a test image of 64 × 64 pixels containing two-dimensional Gaussian noise with variance σ_1² = 900 and mean equal to the average of
the brightness range 0–255. This image was used as the input signal of the filter, and the average brightness level as the desired output. To estimate the filter quality by the second criterion, one can use test images with many small details of different size and orientation, comparable with the filter mask size. We used several images of this kind; Fig. 1a shows one of them. This image served as the desired output, and the input image was obtained from it by adding noise with variance σ_2² = 400 (Fig. 1b).

By (19), in the case of two criteria the weighted criterion takes the form

F_λ(h) = (1 - λ) f_1(h) + λ f_2(h),   (23)

so the optimal solving function depends on the single argument λ. This function was minimized for different values λ ∈ [0, 1]. We used various initial points, but, as expected, the optimal coefficient vector h*_λ did not depend on the initial approximation. The results are shown in Fig. 2. As the graphs clearly show, the optimal solving function is convex upward (Fig. 2a). Figure 2b shows the differences f_j(h*_λ) - f_j^o, j = 1, 2, versus λ, where f_j^o is the minimum of the jth target function; these dependencies are monotone. The filter plane (a straight line in this case) corresponding to the intersection point λ = 0.58 of these curves is shown in Fig. 2a and is optimal in the sense of (22). Figure 2c shows the ratio f_1(h)/f_2(h), which can be used to choose the value of λ corresponding to a given relation between the criteria; in particular, the criteria are balanced at the point λ = 0.9. Figures 1c–1f show the images obtained after filtering with different values of λ. The results of a similar experiment for another input image are shown in Figs. 3 and 4. Here the intersection point corresponds to λ = 0.47, and the criteria are balanced at λ = 0.62. The resulting images clearly show that as λ increases, the sharpening effect of the filter increases and, as one would expect, the smoothing effect decreases. Nevertheless, the solution found for each λ is Pareto optimal, i.e., an arbitrarily small deviation from it would lead to a loss in at least one quality criterion.

To compare our approach with known ones, the following experiment used the test image "Elena" of size 256 × 256 pixels shown in Fig. 5a. This image was distorted by adding Gaussian noise with variance σ_2² = 100 and slightly reducing its dynamic range (Fig. 5b). A fragment of the
Fig. 1. Filtering the first test image: (a) original image; (b) noisy image; (c)–(f) filtering results for different values of λ.
Fig. 2. Results of designing a quadratic filter for the first test image: (a) the optimal solving function w(λ) and the filter plane p(λ); (b) the differences f_j(h*_λ) - f_j^o, j = 1, 2; (c) the ratio f_1(h)/f_2(h), all as functions of λ.

Fig. 3. Filtering the second test image: (a) original image; (b) noisy image; (c)–(f) filtering results for different values of λ.
resulting image of size 64 × 64 pixels, rich in small details (the top left corner of Fig. 5b), was used to assess the filter properties related to preserving image contours. To assess the filter's smoothing properties, we used an image consisting of pure noise with large variance σ_1² = 900. Figure 6 shows the optimal solving function computed for this pair of criteria, together with the filter plane designed for λ = 0.65 (Table 2) and the planes of several known filters from [5, 8]. As Fig. 6 clearly indicates, the
Fig. 4. Results of designing a quadratic filter for the second test image: (a) the optimal solving function w(λ) and the filter plane p(λ); (b) the differences f_j(h*_λ) - f_j^o, j = 1, 2; (c) the ratio f_1(h)/f_2(h).
Fig. 5. Filtering the "Elena" image: (a) original image; (b) image used for tuning the filter; (c) the test image after adding noise and reducing the dynamic range; (d) the result of nonlinear filtering of image (c) with the coefficients from Table 2 for λ = 0.65.
Fig. 6. Nonlinear filter planes: (1) computed filter with coefficients from Table 2; (2) smoothing filter of type 0 from [5]; (3) smoothing filter of type 1 from [5]; (4) smoothing contour-preserving filter from [8].
designed optimal filter plane lies below the planes of the known filters and better approximates the optimal solving function, touching it at the point λ = 0.65.

In the last experiment, the computed filter with the coefficients from Table 2 was tested on the problem of restoring noisy images with low sharpness. The test image was created by adding to the original "Elena" image of Fig. 5a Gaussian noise with variance σ_2² = 400 and compressing its dynamic range to 16 levels with respect to the average brightness. The resulting distorted image (Fig. 5c) was processed by the designed filter with the coefficients from Table 2. As Fig. 5d shows, the filtering significantly reduced the noise while simultaneously enhancing the sharpness and contrast of the image.

Table 2. Coefficients of the designed quadratic filter

Order k   no.   Coefficient   Value
   1       1    h_1(0)        -0.509257501
           2    h_1(1)         0.196929080
           3    h_1(4)         2.249313682
   2       4    h_2(0, 0)      0.001722221
           5    h_2(0, 1)     -0.000962348
           6    h_2(0, 2)      0.000265639
           7    h_2(0, 4)     -0.000765580
           8    h_2(0, 5)      0.000385657
           9    h_2(0, 8)      0.001515148
          10    h_2(1, 1)      0.002107859
          11    h_2(1, 3)     -0.000026768
          12    h_2(1, 4)     -0.001426891
          13    h_2(1, 7)      0.000319735
          14    h_2(4, 4)      0.002196003
7. CONCLUSION

We have proposed an approach to designing polynomial filters of arbitrary order and dimension characterized by a truncated discrete functional Volterra series. The output signal of this class of nonlinear filters depends linearly on the parameters. Using the symmetry properties and the isotropy of the Volterra kernels, we could significantly reduce the number of filter coefficients, and their lexicographic ordering allowed us to reduce a multidimensional filter to a one-dimensional one and represent it in vector form. The problem of designing polynomial filters is formulated as a multicriteria optimization problem, which in essence amounts to minimizing several quadratic functions under linear constraints. Its solution is obtained by optimizing a weighted criterion, which produces a set of Pareto optimal alternatives. The minimum of the weighted criterion, viewed as a function of the weights, is the optimal solving function, and a Pareto optimal filter is characterized by a plane tangent to it. In this interpretation, the problem of designing a nonlinear filter reduces to finding the filter plane that approximates the optimal solving function as closely as possible. As the criteria become more contradictory, the curvature of the optimal solving function increases, and it becomes harder to approximate it by a plane. Our image processing experiments have confirmed the theoretical conclusions and have shown that the designed filters compare favorably with known filters of the same class. Further research in this direction could address adaptive polynomial filters based on a piecewise linear approximation of the optimal solving function that depends on the local statistical properties of an image.

APPENDIX

Proof of Theorem 1. Denote g_j(h) = grad f_j(h), j = 1, 2, ..., J, and define the set-valued function R(h) = {\sum_j λ_j g_j(h) : λ ∈ Λ}. Theorem 1 immediately follows from the following three lemmas.

Lemma 1.
S ⊂ P(f_1, f_2, ..., f_J).

Proof. Let h ∈ S and d ∈ R^n. Let us show that h ∈ P(f_1, f_2, ..., f_J). If f_j(d) = f_j(h) for all j, we are in the trivial case, and h ∈ P(f_1, f_2, ..., f_J). Suppose now that d differs from h in one or more criteria. By the definition of the set S, we can find λ such that F_λ(h) < F_λ(d), and consequently f_j(d) > f_j(h) for some j. This completes the proof of Lemma 1.

We define the set C = {h ∈ R^n : 0 ∈ R(h)}, i.e., the set of vectors h such that R(h) contains the point 0 = [0 0 ... 0] ∈ R^n. The following lemmas hold.

Lemma 2. C ⊂ S.
Proof. Let h ∈ C. Then there exist nonnegative numbers λ_1, λ_2, ..., λ_J such that \sum_j λ_j = 1 and \sum_j λ_j g_j(h) = grad F_λ(h) = 0. Therefore, by the assumptions of Theorem 1 we conclude that h ∈ S.
Lemma 3. P(f_1, f_2, ..., f_J) ⊂ C.

Proof. Suppose that h ∉ C; we show that then h ∉ P(f_1, f_2, ..., f_J). By assumption, 0 ∉ R(h). The two sets {0} ⊂ R^n and R(h) ⊂ R^n are convex, compact, and disjoint. Therefore, by the separation (Hahn–Banach) theorem [14] there exists a linear functional separating them: there exist a vector γ ∈ R^n and a positive number θ such that (γ, r) > θ > 0 for all r ∈ R(h). Since g_j(h) ∈ R(h) for all j ∈ {1, 2, ..., J}, we have (γ, g_j(h)) > θ > 0, and one can choose a small positive number ε such that f_j(h - εγ) < f_j(h) for all j. This means that the vector h is not a Pareto optimal alternative, because the point h - εγ is better than h with respect to every criterion f_j, j = 1, 2, ..., J.
This completes the proofs of all three lemmas and, consequently, of Theorem 1.

Proof of Theorem 2. Let h ∈ P(f_1, f_2, ..., f_J). Then, by Definition 2, for each d ∈ R^n either (a) f_j(d) > f_j(h) for some j or (b) f_j(d) = f_j(h) for all j. Similarly, if h ∈ P({F_λ : λ ∈ Λ}), then either (c) F_λ(d) > F_λ(h) for some λ or (d) F_λ(d) = F_λ(h) for all λ. Therefore, it suffices to prove that (a) or (b) holds if and only if (c) or (d) holds. Obviously, (b) and (d) are equivalent. If (a) holds, then one can always choose λ with λ_j = 1 and λ_i = 0 for i ≠ j, for which F_λ(d) > F_λ(h), i.e., (c) holds as well. Conversely, if (c) holds, one can always choose j such that f_j(d) > f_j(h), and (a) holds too. Thus, the two sets coincide, and Theorem 2 is proven.

REFERENCES

1. Pitas, I. and Venetsanopoulos, A.N., Nonlinear Digital Filters, Boston: Kluwer, 1990.
2. Mathews, V.J. and Sicuranza, G.L., Polynomial Signal Processing, New York: Wiley, 2000.
3. Mitra, S.K. and Sicuranza, G.L., Nonlinear Image Processing, San Diego: Academic, 2001.
4. Sicuranza, G.L., Quadratic Filters for Signal Processing, Proc. IEEE, 1992, vol. 80, pp. 1263–1285.
5. Ramponi, G.F., Bi-impulse Response Design of Isotropic Quadratic Filters, Proc. IEEE, 1990, no. 4, pp. 665–677.
6. Ramponi, G.F. and Fontanot, P., Enhancing Document Images with a Quadratic Filter, Signal Process., 1993, vol. 33, no. 1, pp. 23–34.
7. Ramponi, G.F. and Sicuranza, G.L., Image Sharpening Using a Polynomial Operator, Proc. 11th Eur. Conf. Circuit Theory Design, ECCTD'93, Davos, Switzerland, 1993, pp. 1431–1436.
8. Ramponi, G.F., Sicuranza, G.L., and Ukovich, W., A Computational Method for the Design of 2-D Nonlinear Volterra Filters, IEEE Trans. Circuits Syst., 1988, vol. 35, no. 9, pp. 1095–1102.
9. Shcherbakov, M.A. and Sorokin, S.V., Designing Isotropic Polynomial Filters for Image Processing, in Problemy avtomatizatsii i upravleniya v tekhnicheskikh sistemakh (Automation and Control Problems in Technical Systems), Penza: IITs PGU, 2008, pp. 347–355.
10. Gill, P.E., Murray, W., and Wright, M.H., Practical Optimization, London: Academic, 1981. Translated under the title Prakticheskaya optimizatsiya, Moscow: Mir, 1985.
11. Keeney, R. and Raiffa, H., Decisions with Multiple Objectives: Preferences and Value Tradeoffs, New York: Wiley, 1976. Translated under the title Prinyatie reshenii pri mnogikh kriteriyakh: predpochteniya i zameshcheniya, Moscow: Radio i Svyaz', 1981.
12. Podinovskii, V.V. and Nogin, V.D., Pareto-optimal'nye resheniya mnogokriterial'nykh zadach (Pareto Optimal Solutions of Multicriteria Problems), Moscow: Nauka, 1982.
13. Dubov, Yu.A., Travkin, S.I., and Yakimets, V.N., Mnogokriterial'nye modeli formirovaniya i vybora variantov sistem (Multicriteria Models for Designing and Choosing System Options), Moscow: Nauka, 1986.
14. Balakrishnan, A.V., Applied Functional Analysis, New York: Springer-Verlag, 1976. Translated under the title Prikladnoi funktsional'nyi analiz, Moscow: Nauka, 1980.
This paper was recommended for publication by A.I. Kibzun, a member of the Editorial Board.