Acta Informatica 34, 347–366 (1997)
© Springer-Verlag 1997
Implementing Daubechies wavelet transform with weighted finite automata

Karel Culik II (1), Simant Dube (2)

(1) Department of Computer Science, University of South Carolina, Columbia, SC 29208, USA
(2) Iterated Systems, Inc., 3525 Piedmont Road, Seven Piedmont Center, Suite 600, Atlanta, GA 30305-1530, USA
Received: 15 February 1995 / 11 April 1996
Abstract. We show that the compactly supported wavelet functions W2, W4, W6, ... discovered by Daubechies [6] can be computed by weighted finite automata (WFA) introduced by Culik and Karhumäki [2]. Furthermore, in the 1-D case, a fixed WFA with 2^n + n(N − 2) states can implement any linear combination of dilations and translations of a basic wavelet WN at resolution 2^n. The coefficients of the wavelet transform specify the initial weights in the corresponding states of the WFA. An algorithm to simplify this WFA is presented and can be employed to compress data. It works especially well for smooth and fractal-like data.

1 Introduction

Weighted finite automata (WFA) have been introduced in [2] and studied as devices computing real functions. A WFA over the alphabet {0, 1} defines a real function of one variable on the interval [0, 1]. A WFA over the alphabet {0, 1, 2, 3} defines a real function of two variables on the unit square [0, 1] × [0, 1], which can be interpreted as a grey-scale image. The recursive inference algorithm for WFA [5, 4] is the basis of the simplest and at the same time the best performing "fractal" method for image-data compression. Wavelets are new families of basis functions which yield a "multi-resolution" representation of a function. Dilations and translations of a basic wavelet function W form an orthonormal basis for a wavelet transform in the 1-dimensional case (their Cartesian products are used for higher dimensions) [9]. One of the major breakthroughs in wavelet research has been the discovery of compactly supported wavelets by Daubechies [6], which can be viewed as generalizations of the well-known Haar basis. In this paper, we show that the Daubechies compactly supported wavelet functions
A preliminary version of part of the results was presented at the 10th STACS in Wuerzburg, 1993. Research was supported by the National Science Foundation under Grant No. CCR-9417384. This research was done when the second author was a Canada International Fellow at the University of Ontario, Canada on leave from the University of New England, Australia. It was supported by the Natural Sciences and Engineering Research Council of Canada and by the Australian Research Council under Grant No. A49330441.
W2 (Haar wavelet), W4, W6, ... can be easily implemented by WFA because of their recursive nature. Even more importantly, for the Haar wavelet we implement a set of 2^n wavelet functions forming a basis for the discrete wavelet transform by a fixed WFA with 2^n states (one state per wavelet), so that any wavelet transform, i.e. a function expressed in the basis with the coefficients a_1, ..., a_{2^n}, is specified by the fixed WFA just by using the coefficients a_1, ..., a_{2^n} as the initial weight distribution in the corresponding states. Such a fixed automaton can also be constructed for higher basic wavelets; however, it will need 2^n + n(N − 2) states for WN. These are the first non-contrived examples of WFA whose diagrams are strongly connected directed graphs. In [2] most results have been shown for level WFA, which are WFA with a tree-like structure. It has been shown there that many functions, e.g. all polynomials, can be implemented by level WFA. Therefore we can naturally combine the wavelet transform based compression technique [7] with the WFA compression techniques described in [3, 4, 5]. First, we compute the wavelet transform coefficients, possibly quantize them, use them as the initial weight distribution in the fixed WFA and then simplify this WFA. This method combines the advantages of the wavelet transform technique with the WFA compression technique, increasing the compression especially for repetitive, fractal (self-similar) or smooth functions (images). The decoding is done by a WFA-decoder that is the same as in [3], i.e. it does not know that wavelets have been used in the encoding process. We have implemented this compression method for both 1-dimensional and 2-dimensional functions (images). We will show examples of results for both; however, we explain the technique only for the 1-dimensional case, as the generalization to two dimensions is straightforward.
2 Weighted finite automata

Weighted finite automata (WFA), a generalization of the Probabilistic Finite Generators considered in [1], are a tool to define "multi-resolution" functions on the d-dimensional space [0, 1]^d. Consider the case d = 1. The empty word ε denotes the unit interval [0,1]. The unit interval [0, 1] can be divided into two halves [0,1/2] and [1/2,1], denoted by symbols 0 and 1, respectively. These two halves can be further divided into four subintervals [0,1/4], [1/4,1/2], [1/2,3/4] and [3/4,1], which in turn are denoted by the strings 00, 01, 10 and 11, respectively. In general, if a subinterval is denoted by a word w then its left and right halves are denoted by the words w0 and w1, respectively. Thus at resolution 2^k the unit interval is divided into 2^k subintervals, each of length 2^−k, with each subinterval denoted by a binary word of length k. See Fig. 1 for resolution 2^k = 8.
Fig. 1. Denoting subintervals by words over binary alphabet
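This addressing scheme is straightforward to compute. As a small illustration (Python; the function name is ours, not from the paper), the subinterval denoted by a binary word w is obtained by repeated halving:

```python
def subinterval(w):
    """Return the subinterval (a, b) of [0, 1] addressed by the binary word w.

    The empty word denotes [0, 1]; appending 0 selects the left half,
    appending 1 the right half, so a word of length k addresses an
    interval of length 2**-k."""
    a, length = 0.0, 1.0
    for bit in w:
        length /= 2
        if bit == "1":
            a += length
    return (a, a + length)

print(subinterval(""))     # (0.0, 1.0)
print(subinterval("01"))   # (0.25, 0.5)
print(subinterval("110"))  # (0.75, 0.875)
```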
Informally, for the 1-D case and using binary notation, a WFA is an automaton over the alphabet Σ = {0, 1}, where a real weight is associated with each of its transitions,
and two real weights (initial and final) are associated with each of its states. A WFA computes a function f : [0, 1] → IR as follows. Consider a string w. For each path in the automaton labeled with w, multiply the initial weight of the first state in the path, the weights of the transitions on the path and the final weight of the last state. Over all paths labeled with w, add the products so computed. This sum is the value of the function computed by the WFA for the subinterval denoted by w. This can be generalized to infinite strings, which are now interpreted as points in the unit interval [0,1], e.g. the infinite string σ = 01010101... represents the point 1/3, which is also the limit of the subintervals corresponding to the longer and longer prefixes of σ. Some points can be denoted by more than one infinite string, e.g. the point 1/2 has two string representations, 0111... and 1000...; we then choose the second to be the standard representation. To compute the value of the function at a point x, longer and longer prefixes w1, w2, ... of its standard infinite string representation σ are considered, and the function is evaluated for each. The limiting value is f(x). Formally, let Σ be a finite alphabet and Σ* the set of words over Σ. The empty word is denoted by ε. We will also consider (one-way) infinite words, called ω-words, over Σ. Formally, an infinite word ω over Σ is a sequence a1 a2 ... with ai ∈ Σ. The prefix a1 a2 ... an of length n of ω is denoted by pref_n(ω). The set of all ω-words over Σ is denoted by Σ^ω. As is well known, each ω-word can be interpreted as a real number in the interval [0, 1]. Here we will use solely the binary alphabet Σ = {0, 1} and the binary representation of numbers. The only real numbers in [0, 1] which do not have a unique representation are the numbers represented by w10^ω and w01^ω for some w ∈ Σ*.
We refer to the first one as the standard representation of the considered number, and may write simply w1 instead of w10^ω. Clearly, the standard representation gives a one-to-one mapping Λ : Σ^ω → [0, 1]. We now formally introduce our tool to define real functions. A weighted finite automaton (WFA) is a 5-tuple A = (Q, Σ, W, I, F) where
(i) Q is a finite set of states,
(ii) Σ is a finite alphabet (here Σ = {0, 1}),
(iii) W : Q × Σ × Q → IR is a weight function,
(iv) I : Q → IR is an initial (weight) distribution,
(v) F : Q → IR is a final (weight) distribution.
Let |Q| = t. Then the weight function W clearly specifies a t × t transition matrix Wa for each a ∈ Σ. That is, if we number the states 1, 2, ..., t then Wa(i, j) is the weight of the transition from state i to state j which is labeled by the symbol a. We will also view I and F as real vectors of size t; I is viewed as a row vector and F as a column vector. Though at an informal level we defined the function computed by a WFA as a sum of products over paths in the automaton, an equivalent formulation is in terms of matrix multiplication. The WFA A defines a multiresolution function F_A : Σ* → IR by

F_A(a1 a2 ... ak) = I Wa1 Wa2 ... Wak F

for each k ≥ 0 and a1 a2 ... ak ∈ Σ*. Now, we extend the function F_A to a partial function on ω-words F_A : Σ^ω → IR by

F_A(ω) = lim_{n→∞} F_A(pref_n(ω))
if the limit exists.
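The matrix formulation translates directly into code. The following sketch (Python; the function name and the example automaton are ours, not from the paper) computes F_A(w) = I · Wa1 · ... · Wak · F by repeated vector-matrix multiplication:

```python
def wfa_value(I, Wmats, F, word):
    """Compute F_A(word) = I * W_{a_1} * ... * W_{a_k} * F.

    I: initial distribution (row vector, as a list of floats)
    Wmats: dict mapping each symbol to its t x t transition matrix
    F: final distribution (column vector, as a list of floats)"""
    v = list(I)
    for a in word:
        W = Wmats[a]
        v = [sum(v[i] * W[i][j] for i in range(len(v))) for j in range(len(v))]
    return sum(vj * fj for vj, fj in zip(v, F))

# A two-state example of ours: state 0 computes f(x) = x,
# state 1 computes the constant function 1.
Wmats = {"0": [[0.5, 0.0], [0.0, 1.0]],
         "1": [[0.5, 0.5], [0.0, 1.0]]}
I, F = [1.0, 0.0], [0.5, 1.0]

print(wfa_value(I, Wmats, F, ""))    # 0.5   -- average of x over [0, 1]
print(wfa_value(I, Wmats, F, "1"))   # 0.75  -- average of x over [1/2, 1]
print(wfa_value(I, Wmats, F, "00"))  # 0.125 -- average of x over [0, 1/4]
```

For a word w the value is the average of f over the subinterval denoted by w, which is exactly the multiresolution reading of F_A.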
Fig. 2. An average preserving WFA Ap
Finally, we define the real function f_A : [0, 1] → IR specified by the WFA A as f_A(x) = F_A(w), where w is the standard binary representation of x ∈ [0, 1]. Note that the function F_A can be viewed as the multiresolution representation of the function (image) f_A. The values F_A(w) for all w ∈ Σ^k give an equidistant table of 2^k values of the real function f_A. In the case of 2-dimensional images it specifies the pixel values for the resolution 2^k × 2^k. We say that a function f : {0, 1}* → IR is average preserving if f(w) = (1/2)(f(w0) + f(w1)) for each w ∈ Σ*. A WFA A = (Q, {0, 1}, W, I, F) is average preserving if for each q ∈ Q

Σ_{a∈{0,1}, p∈Q} W(q, a, p) F(p) = 2 F(q).    (1)
It is easy to show the following: if A is an average preserving WFA, then F_A is an average preserving function. Note that the concept of an average preserving multiresolution function is needed to ensure "compatibility" of functions at different resolutions, i.e. the function at resolution 2^k can be obtained from the one at resolution 2^{k+1} by averaging pairs of adjacent pixels (subintervals in the 1-D case).

Example 1. For 0 < p < 1 an average preserving WFA Ap is shown in Fig. 2. The graphs of the function f_{Ap} for p = 1/4, 3/4 and 1/2 are shown in Fig. 3. The first two are continuous fractal-like functions; f_{A1/2} is a linear function.

It is shown in [2] that all polynomials of degree n are computed by an average preserving WFA. Automata with a few states and a simple structure compute an enormous variety of smooth, continuous or everywhere discontinuous functions. It was shown in [2] that the only perfectly smooth functions, that is, the functions that have all derivatives everywhere in (0, 1), definable by WFA are the polynomials. That contrasts with the fact, which we will demonstrate here, that all smooth functions can be closely approximated by WFA, and that such approximations can be computed efficiently. By the underlying automaton of a WFA A we mean the nondeterministic automaton whose transitions consist of those triples (p, a, q) for which W(p, a, q) ≠ 0 and whose initial and final states are those which get a nonzero value under I and F, respectively. If I, F ∈ {0, 1}^t, i.e. all the nonzero values are 1, we talk about the initial and final states of a WFA as well, and such a WFA is fully specified by the usual diagram of a nondeterministic automaton with weights labeling the edges (besides inputs). As is common, the initial states will be shown by incoming "half-edges" and the final states by double circles.
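Condition (1) says precisely that the final distribution F is an eigenvector of W0 + W1 with eigenvalue 2, which is easy to check mechanically. A sketch (Python; the checking function and the two-state example computing f(x) = x are ours, not the automaton Ap of Fig. 2):

```python
def is_average_preserving(Wmats, F, tol=1e-12):
    """Check condition (1): for every state q,
    sum over a and p of W(q, a, p) * F(p) equals 2 * F(q)."""
    t = len(F)
    return all(
        abs(sum(W[q][p] * F[p] for W in Wmats.values() for p in range(t))
            - 2 * F[q]) <= tol
        for q in range(t))

# Two-state WFA computing f(x) = x (state 0), with state 1 computing 1.
Wmats = {"0": [[0.5, 0.0], [0.0, 1.0]],
         "1": [[0.5, 0.5], [0.0, 1.0]]}
F = [0.5, 1.0]  # averages of x and of the constant 1 on [0, 1]
print(is_average_preserving(Wmats, F))  # True
```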
Fig. 3. The graphs of the functions generated by A_{1/4}, A_{3/4}, and A_{1/2}
3 Wavelets

The L2-norm of a function f : IR → IR is defined as

||f||_{L2} = ( ∫_IR f(x)^2 dx )^{1/2}.

L2(IR) is the space of square integrable functions, i.e. f ∈ L2(IR) iff ||f||_{L2} < ∞. In this paper, we consider functions in L2([0, 1]), i.e. only those functions which are supported on the interval [0, 1]. The term wavelets denotes a family of functions of the form

W_{a,b}(x) = |a|^{−1/2} W((x − b)/a),   a, b ∈ IR, a ≠ 0    (2)

obtained from a single function W by the operations of dilation and translation such that they form an orthonormal basis for L2(IR). Wavelets can be described in the framework of multiresolution analysis [8]. One of the simplest examples of a basis of wavelets is the well-known Haar basis. Unlike the Fourier transform, wavelets provide localization in both space and frequency, which makes them suitable for a wide variety of applications such as signal and image processing and the solution of integral operators and partial differential equations. In this paper, we are interested in the compactly supported wavelets W2 (Haar wavelet), W4, W6, ... of Daubechies [6]. The Daubechies wavelet basis is obtained in three steps:

1. The scaling function Φ_N. It is defined as

Φ_N(x) = Σ_{k=0}^{N−1} c_k Φ_N(2x − k)    (3)
where the ci's are coefficients chosen so that they satisfy various conditions. It is convenient to normalize Φ_N by choosing ∫_IR Φ_N(x) dx = 1. It is easy to check that Φ_N is supported on [0, N − 1].

2. The wavelet function W_N. Once we have Φ_N, it is used to define the wavelet function W_N as

W_N(x) = Σ_{k=0}^{N−1} (−1)^k c_{N−1−k} Φ_N(2x − k).    (4)

One can again check that W_N is supported on [0, N − 1]. Sometimes the wavelet is defined slightly differently as

W_N(x) = Σ_k (−1)^k c_{1−k} Φ_N(2x − k),

giving a wavelet which is a translate of the one defined in (4). This different choice would affect only the dilation and translation indices of the wavelet transform coefficients in this paper. We have chosen the first one as it gives indices in accordance with the convention for the discrete wavelet transform.

3. The wavelet basis. Using two indices, k for translations and j for dilations, we obtain all basis functions from the basic wavelet function W_N by

W_{j,k}(x) = 2^{j/2} W_N(2^j x − k),   j, k ∈ Z    (5)

which is an orthonormal basis of L2(IR). According to (2) they are wavelets with the indices a, b taking the discrete values a = 2^{−j} and b = 2^{−j} k for j, k ∈ Z.

The Daubechies wavelet coefficients c0, c1, ..., c_{N−1} satisfy the following conditions:

1. Consistency condition. Σ_k c_k = 2.
2. Orthogonality condition. Σ_m c_{2k+m} c_{2l+m} = 2δ_{kl}, k, l ∈ Z.
3. Smoothness condition. Σ_k (−1)^k c_k k^j = 0, j = 0, 1, ..., N/2 − 1.

The above conditions give for N = 2: c0 = c1 = 1. Φ2 is the box function and W2 is the classic Haar wavelet

W2(x) = 1 for 0 ≤ x < 1/2,   W2(x) = −1 for 1/2 ≤ x ≤ 1.

For N = 4 we get (see [6, 9])

c0 = (1 + √3)/4, c1 = (3 + √3)/4, c2 = (3 − √3)/4, c3 = (1 − √3)/4.

We obtain the scaling and the wavelet functions as the respective solutions to

Φ4(x) = c0 Φ4(2x) + c1 Φ4(2x − 1) + c2 Φ4(2x − 2) + c3 Φ4(2x − 3)
W4(x) = c3 Φ4(2x) − c2 Φ4(2x − 1) + c1 Φ4(2x − 2) − c0 Φ4(2x − 3)

The graph of W4 is shown in Fig. 4.
Fig. 4. Graph of the wavelet function W4
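The three conditions above can be verified numerically for the N = 4 coefficients; a quick check (Python; code ours):

```python
from math import sqrt

# Daubechies N = 4 coefficients as given in the text
r3 = sqrt(3)
c = [(1 + r3) / 4, (3 + r3) / 4, (3 - r3) / 4, (1 - r3) / 4]

# 1. Consistency: the coefficients sum to 2
assert abs(sum(c) - 2) < 1e-12

# 2. Orthogonality: sum over m of c_{2k+m} c_{2l+m} = 2 delta_{kl}.
#    For N = 4 the unshifted case (k = l) gives 2, and the only
#    non-trivial shifted case (k = 0, l = 1) gives c0 c2 + c1 c3 = 0.
assert abs(c[0]**2 + c[1]**2 + c[2]**2 + c[3]**2 - 2) < 1e-12
assert abs(c[0] * c[2] + c[1] * c[3]) < 1e-12

# 3. Smoothness: sum of (-1)^k c_k k^j = 0 for j = 0, 1 (here N/2 - 1 = 1)
for j in range(2):
    assert abs(sum((-1)**k * c[k] * k**j for k in range(4))) < 1e-12

print("N = 4 coefficients pass all three conditions")
```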
4 WFA computing wavelets

The Daubechies wavelets are fractal functions because of their recursive definition. In this section we show that these wavelets can be computed by WFA.

4.1 Haar wavelet

It is trivial to compute the box function Φ2 and the Haar wavelet W2 by WFA. The automaton A2 and the Haar wavelet it computes are shown in Fig. 5. A WFA consisting of only the right state (with its two self-loops) of A2 in Fig. 5 computes the box function Φ2.

Fig. 5. WFA A2 and Haar wavelet W2 generated by A2
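Following the description of Fig. 5 — the left state feeds the box-function state with weight 1 on input 0 and weight −1 on input 1, and the right state carries self-loops of weight 1 and final weight 1 — the automaton A2 can be written down and checked directly (Python sketch; variable names ours):

```python
# WFA A2 of Fig. 5: state 0 computes the Haar wavelet W2,
# state 1 computes the box function Phi2 (self-loops of weight 1).
W0 = [[0, 1], [0, 1]]    # symbol 0: W2 -> Phi2 with weight 1, Phi2 loop 1
W1 = [[0, -1], [0, 1]]   # symbol 1: W2 -> Phi2 with weight -1, Phi2 loop 1
I = [1, 0]               # start in the W2 state
F = [0, 1]               # final weights = averages of W2 (0) and Phi2 (1)

def value(word):
    v = list(I)
    for a in word:
        M = W0 if a == "0" else W1
        v = [sum(v[i] * M[i][j] for i in range(2)) for j in range(2)]
    return v[0] * F[0] + v[1] * F[1]

# At resolution 2^3 the automaton reproduces W2: +1 on [0,1/2), -1 on [1/2,1]
samples = [value(format(i, "03b")) for i in range(8)]
print(samples)  # [1, 1, 1, 1, -1, -1, -1, -1]
```

The empty word returns 0, the average of W2, consistent with the average preserving property (1).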
4.2 Daubechies wavelets

Before showing that the Daubechies wavelets W_N, N = 4, 6, 8, ..., can be computed by WFA, we first look in general at how recursive functions can be computed by WFA. Suppose g : [0, 1] → IR is a linear combination of N mutually recursive functions f1, f2, ..., fN : [0, 1] → IR,

g(x) = Σ_{i=1}^{N} I_i f_i(x)

where I_i ∈ IR. Let the average value of f_i on [0,1] (that is, ∫_0^1 f_i(x) dx) be F_i. The functions f1, f2, ..., fN are mutually recursive in the following sense: there exist two N × N real matrices W0 and W1 such that for each i = 1, 2, ..., N,

f_i(x) = Σ_{j=1}^{N} W0(i, j) f_j(2x) for 0 ≤ x < 1/2,
f_i(x) = Σ_{j=1}^{N} W1(i, j) f_j(2x − 1) for 1/2 ≤ x ≤ 1.

The following lemma states how such a function g can be computed by a WFA.
Fig. 6. Recursive definition of the scaling function Φ4
Lemma 1 Let g, f1, f2, ..., fN be functions as defined above. Then g is computed by the WFA A = ({1, 2, ..., N}, {0, 1}, {W0, W1}, [I1, I2, ..., IN], [F1, F2, ..., FN]).

Proof. The proof follows from the definition of how WFA compute functions. ⊔⊓

Note that by assigning zero initial distribution to all states except one, say state k, which is assigned initial distribution 1, the same WFA computes the function f_k. Therefore we can associate the state k with f_k; we say that the state k computes f_k. For example, in Fig. 5, the left state of the WFA A2 computes W2 and the right state computes Φ2. We can now show how the Daubechies wavelets can be computed using WFA. We do it in two steps. First we show that the scaling function Φ_N can be computed by a WFA using the recursive definition (3). Then we extend this WFA to compute the wavelet function W_N using (4). We show the construction only for Φ4 and W4, as there is a straightforward generalization for N = 6, 8, 10, ....
Fig. 7. A WFA computing Φ01, Φ12 and Φ23
Fig. 8. A WFA computing Φ4(4x − 1)
Fig. 9. Recursive definition of the wavelet W4
4.2.1 The scaling function Φ4. The scaling function Φ4 as defined by (3) is supported on the interval [0,3]. By definition, WFA compute functions supported inside the unit interval [0,1]. Thus, although Φ4 as such cannot be computed by a WFA, its restrictions to the intervals [0,1], [1,2] and [2,3], and all of its dilations and translations which are supported inside the unit interval, can be computed by WFA. Consider the restrictions of Φ4 to [0,1], [1,2] and [2,3]. Let their translations to [0,1] (since, again, WFA compute functions on [0,1] only) be denoted by Φ01, Φ12 and Φ23, respectively. We can rewrite (3) for N = 4 as

Φ01(x) = c0 Φ01(2x) for 0 ≤ x < 1/2,
Φ01(x) = c1 Φ01(2x − 1) + c0 Φ12(2x − 1) for 1/2 ≤ x ≤ 1,

Φ12(x) = c2 Φ01(2x) + c1 Φ12(2x) + c0 Φ23(2x) for 0 ≤ x < 1/2,
Φ12(x) = c3 Φ01(2x − 1) + c2 Φ12(2x − 1) + c1 Φ23(2x − 1) for 1/2 ≤ x ≤ 1,

Φ23(x) = c3 Φ12(2x) + c2 Φ23(2x) for 0 ≤ x < 1/2,
Φ23(x) = c3 Φ23(2x − 1) for 1/2 ≤ x ≤ 1.

See Fig. 6, which shows the recursive nature of Φ4. To completely specify a WFA computing the above functions we also need their average values. These are obtained by integrating them over the unit interval [0,1]. Note that for any function f : [0, 1] → IR,

∫_0^1 f(x) dx = 2 ∫_0^{1/2} f(2x) dx = 2 ∫_{1/2}^1 f(2x − 1) dx.
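Integrating each recursion over [0,1] with this identity turns the three equations into a linear system for the average values F, G and H of Φ01, Φ12 and Φ23. Since the first and third equations involve only (F, G) and (G, H) respectively, F and H can be expressed as multiples of G, and the normalization F + G + H = 1 then fixes G. A numerical check of the values quoted in the text (Python; code ours):

```python
from math import sqrt

r3 = sqrt(3)
c0, c1, c2, c3 = (1 + r3) / 4, (3 + r3) / 4, (3 - r3) / 4, (1 - r3) / 4

# From 2F = (c0 + c1)F + c0*G and 2H = c3*G + (c2 + c3)H,
# express F and H as multiples of G, then use F + G + H = 1.
f_ratio = c0 / (2 - c0 - c1)          # F / G
h_ratio = c3 / (2 - c2 - c3)          # H / G
G = 1 / (1 + f_ratio + h_ratio)
F, H = f_ratio * G, h_ratio * G

# The middle equation 2G = (c2 + c3)F + (c1 + c2)G + (c0 + c1)H is then
# satisfied automatically (the system has rank 2).
assert abs(2 * G - ((c2 + c3) * F + (c1 + c2) * G + (c0 + c1) * H)) < 1e-12

print(round(F, 4), round(G, 4), round(H, 4))  # 0.8497 0.1667 -0.0163
```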
Let F, G and H be the definite integrals (average values) of Φ01, Φ12 and Φ23, respectively, over [0,1]. By integrating both sides of the mutually recursive equations above, we get

2F = (c0 + c1)F + c0 G
2G = (c2 + c3)F + (c1 + c2)G + (c0 + c1)H
2H = c3 G + (c2 + c3)H

From the properties of the Daubechies coefficients one can verify that the above equations are of rank 2, so adding the normalizing equation

∫_0^3 Φ4(x) dx = F + G + H = 1

enables one to solve for F, G and H, which turn out to be G = 0.1666..., F ≈ 0.849 and H ≈ −0.016. See Fig. 7, which shows a WFA computing the three restrictions of Φ4 (the left, middle and right states compute Φ01, Φ12 and Φ23, respectively). The correctness of the construction follows from Lemma 1. Note that the WFA is average preserving, since (1) holds; [F G H] is an eigenvector of the matrix W0 + W1 (W0 and W1 are the 3 × 3 transition matrices of the WFA) corresponding to the eigenvalue 2. Any dilation and translation of Φ4 which is supported inside [0,1] can be computed by a WFA. For example, see Fig. 8, which shows a WFA computing Φ4(4x − 1). The top state is the initial state. It is easy to verify, using Lemma 1, that the WFA computes the function f which is defined as
Fig. 10. A WFA computing W4(4x)

f(x) = g(2x) for 0 ≤ x < 1/2, f(x) = h(2x − 1) for 1/2 ≤ x ≤ 1,
g(x) = 0 for 0 ≤ x < 1/2, g(x) = Φ01(2x − 1) for 1/2 ≤ x ≤ 1,
h(x) = Φ12(2x) for 0 ≤ x < 1/2, h(x) = Φ23(2x − 1) for 1/2 ≤ x ≤ 1,

and therefore f is precisely Φ4(4x − 1).

4.2.2 The wavelet function W4. The Daubechies wavelet W4 as defined by (4) is supported on the interval [0,3]. Let us denote the restrictions of W4 to the intervals [0,1], [1,2] and [2,3], and their translations to the unit interval [0,1], by W01, W12 and W23, respectively. Now clearly,

W01(x) = c3 Φ01(2x) for 0 ≤ x < 1/2,
W01(x) = −c2 Φ01(2x − 1) + c3 Φ12(2x − 1) for 1/2 ≤ x ≤ 1,

W12(x) = c1 Φ01(2x) − c2 Φ12(2x) + c3 Φ23(2x) for 0 ≤ x < 1/2,
W12(x) = −c0 Φ01(2x − 1) + c1 Φ12(2x − 1) − c2 Φ23(2x − 1) for 1/2 ≤ x ≤ 1,

W23(x) = −c0 Φ12(2x) + c1 Φ23(2x) for 0 ≤ x < 1/2,
W23(x) = −c0 Φ23(2x − 1) for 1/2 ≤ x ≤ 1.

See Fig. 9, which shows the recursive definition of W4. The WFA shown in Fig. 10 computes W4(4x). The top state is the initial state. Let the final distribution of the
three states computing W01, W12 and W23 be F′, G′ and H′, respectively. These can be easily computed from the final distribution of the states at the bottom (i.e. from F, G and H in Sect. 4.2.1) by integrating the above equations:

2F′ = (c3 − c2)F + c3 G
2G′ = (c1 − c0)F + (c1 − c2)G + (c3 − c2)H
2H′ = −c0 G + (c1 − c0)H

From F′, G′ and H′ one can compute the final distribution of the remaining states. The WFA is average preserving, since (1) holds.

5 Implementing wavelet transform with WFA

The wavelet transform of a function f in L2(IR) is determined by its projection on the wavelet basis. In the discrete case, f : [0, 1] → IR is sampled at regular intervals. Assume that it is specified at resolution 2^d, that is, it is sampled at 2^d points. Let its sample values be f_0, f_1, ..., f_{2^d−1}. One assumes that

f(x) = Σ_{i=0}^{2^d−1} f_i Φ_N(2^d x − i)    (6)

and then projects it on the wavelets with dilation factor 2^j for all j < d. The discrete wavelet transform is given by

f(x) = Σ_i b_i Φ_N(x − i) + Σ_{j=0}^{d−1} Σ_k a_{j,k} W_{j,k}(x)    (7)

where the b_i and a_{j,k} are the transform coefficients

b_i = ∫_0^1 f(x) Φ_N(x − i) dx
a_{j,k} = ∫_0^1 f(x) W_{j,k}(x) dx

which can be computed in practice by applying wavelet "filters" to the data points f_0, ..., f_{2^d−1} in a recursive fashion, see [9, 8].

5.1 Haar wavelet

Here (7) is the same as

f(x) = b Φ2(x) + Σ_{j=0}^{d−1} Σ_{k=0}^{2^j−1} a_{j,k} W_{j,k}(x).
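The filter cascade computing the Haar coefficients b and a_{j,k} of (7) is only a few lines of code. A sketch (Python; code ours, with the normalization chosen to match the orthonormal basis W_{j,k}(x) = 2^{j/2} W2(2^j x − k); the authors' actual implementation is not shown in the paper):

```python
def haar_analysis(samples):
    """Discrete Haar wavelet transform of 2^d samples.
    Returns (b, a) where b is the coefficient of Phi2 and a[j][k] the
    coefficient of W_{j,k}(x) = 2^{j/2} W2(2^j x - k) in (7)."""
    s = list(samples)
    d = len(s).bit_length() - 1
    a = [None] * d
    for j in range(d - 1, -1, -1):
        # detail = half-difference, scaled by 2^{-j/2}; average goes up a level
        a[j] = [(s[2*k] - s[2*k+1]) / 2 / 2**(j/2) for k in range(2**j)]
        s = [(s[2*k] + s[2*k+1]) / 2 for k in range(2**j)]
    return s[0], a

def haar_synthesis(b, a):
    """Inverse transform: rebuild the 2^d sample values."""
    s = [b]
    for j, row in enumerate(a):
        s = [s[k] + sign * 2**(j/2) * row[k]
             for k in range(2**j) for sign in (1, -1)]
    return s

data = [4.0, 2.0, 5.0, 7.0, 1.0, 1.0, 3.0, 9.0]
b, a = haar_analysis(data)
assert all(abs(x - y) < 1e-9 for x, y in zip(haar_synthesis(b, a), data))
print(b)  # 4.0 -- the average of the data, i.e. the Phi2 coefficient
```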
Fig. 11. A WFA implementing Haar wavelet transform
Note that W_{j,k} can be obtained recursively from W2 (which is the same as W_{0,0}) as

W_{j,k}(x) = √2 W_{j−1,k}(2x) if k ∈ {0, 1, ..., 2^{j−1} − 1},
W_{j,k}(x) = √2 W_{j−1,k−2^{j−1}}(2x − 1) if k ∈ {2^{j−1}, 2^{j−1} + 1, ..., 2^j − 1}.

See Fig. 11, which shows how a tree-like WFA can compute W_{j,k}. This WFA can then be used to implement the wavelet transform of any function f just by using the corresponding wavelet transform coefficients a_{j,k} as the initial distribution, see Fig. 11. The WFA is very economical; it has one state for every wavelet in the basis. In the case of the continuous wavelet transform, one can see how a WFA with infinitely many states would implement the wavelet transform (with infinitely many nonzero wavelet transform coefficients) of a function f.

5.2 Daubechies wavelets

In the case of the Haar basis the situation was simpler, as at dilation 2^j there are exactly 2^j dilations and translations of W2 supported inside the unit interval [0,1], namely W_{j,k} for k ∈ {0, 1, 2, ..., 2^j − 1}, and they are all nonoverlapping. But in the case of W4, at dilation 2^j there are 2^j + 2 dilations and translations, namely W_{j,k} for k ∈ {−2, −1, 0, 1, 2, ..., 2^j − 1}, which are partially or completely supported inside the unit interval, and moreover they overlap. See Fig. 12 for an illustration of the case 2^j = 4. Similarly, at dilation 2^j, 2^j + 2 dilations and translations of Φ_N intersect the unit interval. Therefore, for N = 4, (7) is the same as
Fig. 12. Dilations and translations of W4 at dilation factor 4
Fig. 13. WFA D computing a wavelet transform using W4 wavelet
f(x) = Σ_{i=−2}^{0} b_i Φ4(x − i) + Σ_{j=0}^{d−1} Σ_{k=−2}^{2^j−1} a_{j,k} W_{j,k}(x)    (8)
Remark. Since the translates of Φ_N are orthogonal to each other, it follows from (6) that in (8) b_{−2} = a_{j,−2} = 0. We have still included these coefficients because they are non-zero if "wrap-around" is used, i.e. if the function f is assumed to be periodically extended at its left and right boundaries. In practice, one often uses wrap-around at the boundaries while computing the discrete wavelet transform coefficients a_{j,k}, for which the following holds:

a_{j,k} = a_{j,2^j+k} for k ∈ {−2, −1}
and b_{−2} = b_{−1} = b_0.

Fig. 14. The simplified WFA computing W4 transform
In Fig. 13 we show an implementation of such a wavelet transform with wrap-around. The WFA D computes the discrete wavelet transform of f specified at resolution 2^3. The WFA consists of two parts: the WFA A computing the three restrictions of W4 shown in Fig. 10, and three tree-like WFAs TL, TM and TR rooted at the states of A which compute W01, W12 and W23, respectively. The states of the three trees compute l_{j,k}, m_{j,k} and r_{j,k}, which are dilations and translations of W01, W12 and W23:

l_{j,k}(x) = 2^{j/2} W01(2^j x − k)
m_{j,k}(x) = 2^{j/2} W12(2^j x − k)
r_{j,k}(x) = 2^{j/2} W23(2^j x − k)
Fig. 15. Illustration of the algorithm. a Original WFA with 16 states; b optimized WFA with 3 states
for all k ∈ {0, 1, ..., 2^j − 1}; see Fig. 12. Note that the left, middle and right one-thirds of W_{j,k} are precisely l_{j,k}, m_{j,k+1} and r_{j,k+2}, respectively (see Fig. 12). Since a_{j,k} is the coefficient corresponding to the wavelet W_{j,k}, it is the initial distribution of the states computing l_{j,k}, m_{j,k+1} and r_{j,k+2} in Fig. 13. Furthermore, since f is supported on [0,1] and Φ4 on [0,3], it follows from (8) that b_{−2}, b_{−1} and b_0 are the initial distribution of the states computing Φ23, Φ12 and Φ01, respectively. With wrap-around, since b_{−2} = b_{−1} = b_0, let b denote this common value. The final distribution of the state computing l_{j,k} is 2^{−j/2} F′, where F′ is the final distribution of the state computing W01 (see Sect. 4.2.2); in a similar fashion the final distributions of the states computing m_{j,k} and r_{j,k} are 2^{−j/2} G′ and 2^{−j/2} H′. It can be verified that the WFA is average preserving, since (1) holds.
The WFA D in Fig. 13 can be further simplified by noticing that certain subtrees of TL, TM and TR are identical. In Fig. 14 we show the simplified WFA at resolution 2^4. In general, at resolution 2^n it will have 2^n + 2n states. It is easy to modify the initial distribution for the case when wrap-around is not used. Though the preceding discussion is for the wavelets W2 and W4, it is easy to see that it can be generalized to any W_N. For general N, Φ_N and W_N are supported on the interval [0, N − 1]. Their restrictions to the N − 1 subintervals [0, 1], [1, 2], ..., [N − 2, N − 1] can be computed by WFA, and so can all their dilations and translations which are supported inside [0,1]. Furthermore, the wavelet transform of any function f using W_N, with or without wrap-around, can be implemented by WFA.

Theorem 1 Let f : [0, 1] → IR be expressed as a linear combination of Φ_N and of finitely many dilations and translations of W_N for any N ∈ {2, 4, 6, ...}. Then there exists an average preserving WFA A computing f. Moreover, if f is specified at resolution 2^n then the WFA has 2^n + n(N − 2) states.

Proof. Refer to the preceding discussion and Figs. 5 to 14, which prove the cases N = 2 and N = 4. It is easy to generalize the proof to any even N. The base function Φ_N can be computed by a WFA by restricting it to N − 1 intervals, each of which is represented by one state. Therefore in Fig. 7 the WFA will have N − 1 states. One chooses the final distribution of the states, as is done in Sect. 4.2.1, to be an eigenvector of W0 + W1 (the sum of the transition matrices of the WFA) corresponding to the eigenvalue 2. This ensures that (1) holds and the WFA is average preserving. Similarly the wavelet function W_N can be computed by a WFA by considering N − 1 intervals. Therefore, in Fig. 10, we will have N − 1 states for the N − 1 intervals of Φ_N, and another N − 1 states for the N − 1 intervals of W_N.
Figure 12 can be easily generalized to higher values of N, where now we have 2^j + (N − 2) dilations and translations of W_N at dilation 2^j, resulting finally in a generalized WFA as in Fig. 13, now with N − 1 trees instead of three, each rooted at one of the N − 1 states for the N − 1 intervals of W_N. The total number of states is (N − 1)2^n, where 2^n is the required resolution up to which f is specified. The wavelet transform coefficients are the initial distribution of the corresponding states. The number of states can be further reduced, as is done in Fig. 14, to 2^n + n(N − 2). ⊔⊓

The above proof can be generalized to the 2-D case, where the wavelets are Cartesian products of the 1-D wavelets [9]. As shown in [2], two 1-D WFA can be combined to run in parallel to compute a 2-D function on the unit square which is the Cartesian product of their functions.

6 Application to data and image compression

The basic data compression method based on the Wavelet Transform is simple. One chooses a particular wavelet W_N (normally W6), computes the wavelet coefficients of the given function f and retains only the coefficients of highest magnitude, according to the specified compression ratio [7]. In this section, we describe an altogether different technique to compress data, based on the implementation of the Wavelet Transform with WFA. We will describe the algorithm with reference to 1-D Haar wavelets (Fig. 11), though it can be generalized to higher wavelets in a straightforward manner.
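At the heart of the algorithm described below is a test for whether the vector of one subtree can be written as a linear combination of the subtree vectors collected so far. As a non-authoritative sketch, this test can be implemented with a least-squares solve in place of the paper's incremental Gram-Schmidt basis (all names here are ours, for illustration):

```python
import numpy as np

def express_in_span(T, basis, tol=1e-9):
    """Try to write vector T as a linear combination of the vectors in `basis`,
    each truncated to len(T).  Returns the coefficients k_j if an exact
    combination exists, otherwise None.
    """
    if not basis:
        return None
    t = np.asarray(T, dtype=float)
    A = np.array([b[:len(T)] for b in basis], dtype=float).T  # columns = basis vectors
    k, *_ = np.linalg.lstsq(A, t, rcond=None)
    if np.linalg.norm(A @ k - t) > tol:
        return None  # T is not in the span: it must be kept as a new basis vector
    return k
```

For instance, for T2 = [a, b, c] and T = (3/2)T2 the test returns the single coefficient 3/2, and since one new edge is cheaper than a subtree of size 3, the subtree would be deleted, as in Example 2 below.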
Consider Fig. 11. Denote the subtree with h levels rooted at node v by Subtree(v, h). With each subtree we associate a vector whose elements are the initial weights of the states (nodes) of the subtree, represented as an array in the standard manner. For example, denoting for convenience a state (node) by its initial weight, we have the following:

Subtree(a0,0, 1) = [a0,0]
Subtree(a0,0, 4) = [a0,0 a1,0 a1,1 a2,0 . . . a3,3 a3,7]
Subtree(a2,1, 2) = [a2,1 a3,1 a3,5]

We will view these subtrees as vectors and consider their linear combinations. For example, for subtrees T1 = [a, b, c], T2 = [d, e, f] and T3 = [ra + sd, rb + se, rc + sf] we write T3 = rT1 + sT2. In our description of the algorithm, the subtrees can have different sizes. If we say that Subtree(v, h) is expressed in terms of a bigger subtree Subtree(u, h′), where v and u are two states and h′ > h, then it means that we are actually expressing Subtree(v, h) in terms of Subtree(u, h) (which is Subtree(u, h′) considered up to only h levels, the remaining levels being ignored).

Now we are ready to describe the algorithm.

Input: A function f on the unit interval specified up to resolution p, i.e. M data values where M = 2^p for some p > 0.

Output: A WFA computing f.

Algorithm:

Step 1: Compute the wavelet coefficients of f and construct the WFA H shown in Fig. 11, with M states and p levels. Let N ← 1 (the index of the last state created so far in the optimized WFA) and i ← 1 (the index of the next state to be processed). Denote the state with initial weight a0,0 by v1. Let 𝒯 = {Subtree(v1, p)} and denote Subtree(v1, p) by T1.

Step 2: Process state vi: Let ar,s be the initial weight of vi. Let h = p − r (i.e. h is the height of vi in the tree).

Step 3: For each label a ∈ {0, 1}, consider the corresponding child u of vi (u has a transition labeled with a to vi) and do the following. If T = Subtree(u, h − 1) can be expressed as a linear combination of the subtrees in 𝒯 = {T1, T2, . . .
, TN} as T = k1T1 + k2T2 + . . . + kNTN, and the number of nonzero constants among the kj's is less than the size of T, i.e.

Σ_{kj ≠ 0} 1 < 2^(h−1) − 1,

then replace T by the above linear combination as follows:

1. Delete the entire subtree T.
2. Let kj be nonzero. Let Tj be rooted at state vj. Then add an edge (transition) from vj to vi with label a and weight √2 kj.

Else let 𝒯 ← 𝒯 ∪ {T}, N ← N + 1, and denote u by vN and T by TN.

Step 4: Let i ← i + 1. If i ≤ N then go to Step 2.

Step 5: Output the optimized WFA.

To carry out Step 3, one keeps an orthogonal basis for the linear space generated by the vectors (subtrees) in 𝒯, using the Gram-Schmidt orthogonalization method from linear algebra. Note that at every step each of these vectors consists of 2^(h−1) − 1 elements, where h is the height of the state vi currently being processed. Therefore, when the value of h is decremented, or when a new vector is added to 𝒯, this orthogonal basis has to be recomputed.

In Step 3, we first try to express T as a linear combination of subtrees in 𝒯, and if this can be done we then check whether it really results in some saving of space, by comparing the number of new edges with the size of T. Only if fewer edges need to be introduced do we delete T.

Fig. 16. Compression of a 1-D smooth function
Fig. 17. Compression of a smooth and fractal-like image

Example 2. Suppose the input to the algorithm is a set of 16 data values, which results in the WFA shown in Fig. 15(a) after executing Step 1. At this point 𝒯 = {T1}, where T1 = Subtree(A, 4). We first consider the left child B of state A and the subtree rooted at B. This subtree T2 = Subtree(B, 3) is linearly independent of T1, so we add it to 𝒯 and proceed to the right child E of A. Now, since Subtree(E, 3) = (3/2)T2, we delete this subtree and add an edge from B to A with label 1 and weight 3/√2. We
then process state B and find that Subtree(C, 2) can be expressed in terms of T2 as Subtree(C, 2) = (1/2)T2. Thus we add a self-loop at B with label 0 and weight 1/√2. A similar optimization is done for Subtree(D, 2). The resulting optimized WFA is shown in Fig. 15(b).

Example 3. In this example we show the performance of our data compression algorithm on some actual data. We used the WFA implementing the Haar wavelet (W2) transform. In Fig. 16 at the left, we show a smooth 1-D function plotted at 16K points. In the same figure at the right, we show the same function compressed with our algorithm at compression ratio 100. Note that in this case the regenerated function is almost identical to the original one.

In Fig. 17, we show the results of our compression method on a smooth and fractal-like 2-D image. The compression ratio is again 100. The original image is shown on the left. On the right side of the figure, we show the result of the algorithm described in this paper. The SNR value is 26.29 dB. Pure Haar-wavelet-based compression (in which low magnitude coefficients are discarded) gives very poor results on these examples. This example shows that, by exploiting the scale invariance of Haar wavelet coefficients for smooth and fractal-like images, one can improve Haar-wavelet-based compression.

The compression algorithm described in this paper does not work as well for real-world images as other methods. However, by making the algorithm “recursive”, as is done in [4] (which resulted in excellent compression results), it is expected that the algorithm would perform well. The experiments point out that for fractal-like and smooth images the wavelet transform coefficients exhibit scale-invariance, which should be exploited in compression.

References

1. K. Culik II, S. Dube: Rational and Affine Expressions for Image Description. Discrete Applied Mathematics 41, pp. 85-120 (1993)
2. K. Culik II, J. Karhumäki: Finite Automata Computing Real Functions. SIAM J. on Computing 23(4), pp. 789-814 (1994)
3. K. Culik II, J. Kari: Image Compression Using Weighted Finite Automata. Computers and Graphics 17(3), pp. 305-313 (1993)
4. K. Culik II, J. Kari: Inference Algorithms for WFA and Image Compression. In: Y. Fisher (ed.), Fractal Image Compression: Theory and Techniques. Springer-Verlag, pp. 243-258 (1994)
5. K. Culik II, J. Kari: Image-data Compression Using Edge-Optimizing Algorithm for WFA Inference. Information Processing and Management 30(6), pp. 829-838 (1994)
6. I. Daubechies: Orthonormal Bases of Compactly Supported Wavelets. Communications on Pure and Applied Mathematics 41, pp. 909-996 (1988)
7. R.A. DeVore, B. Jawerth, B.J. Lucier: Image Compression through Wavelet Transform Coding. IEEE Transactions on Information Theory 38, pp. 719-746 (1992)
8. S. Mallat: Multiresolution Approximation and Wavelets. Transactions of the American Mathematical Society 315, pp. 69-88 (1989)
9. G. Strang: Wavelets and Dilation Equations: A Brief Introduction. SIAM Review 31(4), pp. 614-627 (1989)