Acta Informatica 32, 491-507 (1995)
© Springer-Verlag 1995

Local Hausdorff dimension*

H. Jürgensen¹ and L. Staiger²

¹ Department of Computer Science, The University of Western Ontario, London, Ontario, Canada N6A 5B7
² Lehrstuhl für Informatik II, Rheinisch-Westfälische Technische Hochschule Aachen, Ahornstrasse 55, D-52056 Aachen, Germany

Received December 16, 1991 / September 23, 1994
Abstract. We define the notion of local size-measure in metric spaces and derive general properties of local size-measures. Special cases include the local Hausdorff dimension, the local entropy, and the local Kolmogorov complexity. For the case of finite-state and closed ω-languages we exhibit an algorithm for the approximate calculation of the local Hausdorff dimension using the fact that, in this case, the local Hausdorff dimension and the local entropy coincide.
1. Introduction

The Hausdorff dimension of a compact metric space can be thought of as measuring certain aspects of the 'complexity' of the space. It is a global, or rather, worst-case measure of complexity. As such it does not provide detailed information about the space as a whole - the space may be 'simple' nearly everywhere except in 'small areas'; the latter, however, would determine the Hausdorff dimension. In this paper, we introduce a local version of the Hausdorff dimension. Intuitively, for any point in the space, the local Hausdorff dimension measures the complexity of its neighbourhood. As one 'traverses' the space one can observe changes of the local Hausdorff dimension and, thus, differences in the complexity of various parts of the space.

Our work was motivated by the rôle of the Hausdorff dimension in fractal-based rendering techniques (see [Ba1] for instance) and a proposal to use the Hausdorff dimension in image analysis [St3]. In the context of images, the Hausdorff dimension corresponds to the complexity of structure or texture. In general, the Hausdorff dimension of an image provides little information about the image, however, since it only measures the most complex texture in the image. On the other hand, a localized

* This work was supported by the Natural Science and Engineering Research Council of Canada, Grant OGP0000243.
version of the Hausdorff dimension should reveal changes in the complexity of the structure or texture of the image. In particular, one would expect the local Hausdorff dimension to be discontinuous at object boundaries in the image and could, therefore, try to use the local Hausdorff dimension as one of the tools in image analysis. This idea motivated the work presented in this paper.

The proposal of [St3] assumes that the image is black-and-white and is given in the form of an infinite quadtree¹. This quadtree can be considered as an ω-language over the alphabet consisting of symbols for the four directions. If d is the Hausdorff dimension of this ω-language then 2d is the Hausdorff dimension of the image.

Using the Hausdorff dimension or its localized version in image analysis would be infeasible without an economical way for its computation. It is well-known that determining the Hausdorff dimension of a space is a very difficult task in general (see [Ga1] for some approximation algorithms). Finite-state closed ω-languages are a pleasant exception - indeed the only major class of ω-languages we know of for which this computation is easy. This is the consequence of two special properties of such ω-languages: for these, the Hausdorff dimension coincides with the entropy, and the entropy can be computed easily from the state transition graph of the ω-language. The assumption that the ω-language be finite-state and closed may be very strong in general; however, at least locally, images seem to be reasonably well approximated by such ω-languages. In fact, fractal-based rendering techniques (see [Ba1]) and some image compression techniques (see [Cu1]) result from similar considerations. These ideas motivated and guided our development of a theory of local Hausdorff dimension.
Starting from an arbitrary size-measure λ on a compact metric space - the Hausdorff dimension, the entropy, and the Kolmogorov complexity are special cases of such size-measures - we define the corresponding local size-measure l^(λ). We then derive general properties of such local size-measures. It turns out that the intuition arising from the image analysis problem is indeed correct. We prove, for instance, that every compact set contains a point at which the local size-measure is maximal (Corollary 3.6) and that, under certain rather weak conditions, the local size-measure at a point is approached continuously from at least one 'direction' (Theorem 3.15). Thus, in the context of image analysis, if we interpret the Hausdorff dimension as the complexity of texture and if the local Hausdorff dimension at a point x is d(x), then there is at least one path approaching x - a sequence of points x₁, x₂, ... with lim_{i→∞} x_i = x - such that d(x) is the limit of the local Hausdorff dimensions along that path. Therefore, if, in traversing the image, one finds small spots in which the local Hausdorff dimension varies 'significantly', one may just be leaving a compact subset at such a spot, that is, be crossing the edge of an object.

For finite-state closed ω-languages one again has a rather favourable situation. If two given size-measures coincide on every finite-state closed ω-language then also their localized versions coincide at every point (Theorem 3.18). Thus, in particular, the local Hausdorff dimension, the local entropy, the local subword complexity, and the local Kolmogorov complexity coincide for such ω-languages (Corollary 3.19). This implies that, in order to compute any of these local size-measures, one only needs to compute the local entropy. The computation of the local entropy is based on ideas similar to those used for the global entropy; several rather subtle changes

¹ The first assumption is a technical simplification and not essential. The second assumption is essential, but also valid in many applications; even when an image is given as a finite object, a finite quadtree say, one often implicitly assumes the existence of an algorithm for extending this quadtree to arbitrary resolutions resulting, in the limit, in an infinite quadtree.
from those are required, however; this is explained in Theorems 4.1 and 4.2 and their derivations.

In this paper we focus on the theory of local size-measures and an algorithm for their computation in the case of finite-state closed ω-languages. The example of image analysis serves as motivation and guide for the intuition. We are still investigating whether this particular application is feasible and useful.

This paper is structured as follows: In Sect. 2 we introduce the notation and the basic notions needed in the sequel; we also quote five results from the literature. In Sect. 3, we introduce local size-measures and derive their fundamental properties. In Sect. 4, these results are used to provide an algorithm for the computation of the local Hausdorff dimension in the case of finite-state closed ω-languages. Section 5 contains a few concluding remarks.
2. Basic notions and notation
In this section we review some basic notions in order to establish our notation and the required background.

ℕ denotes the set of non-negative integers. An alphabet is a finite non-empty set. In the sequel, any alphabet considered is assumed to have at least 2 elements. Let Σ be an alphabet. Then Σ* is the set of words over Σ, that is, the free monoid generated by Σ with concatenation as the operation. The empty word is denoted by Λ. For a word w, |w| denotes its length. A language is a subset of Σ*. For a language M, M⁺ denotes the language ⋃_{i≥1} Mⁱ where Mⁱ = M for i = 1 and Mⁱ = M·M^{i-1} for i > 1. Moreover, M* = M⁺ ∪ {Λ}.

Σ^ω denotes the set of ω-words, that is, infinite words over Σ. An ω-language is a subset of Σ^ω. Let Σ^∞ = Σ* ∪ Σ^ω. An ∞-word is an element of Σ^∞, and an ∞-language is a subset of Σ^∞. For any M ⊆ Σ^∞ and any n ∈ ℕ,

    pref_n M = {w | w ∈ Σⁿ, wΣ^∞ ∩ M ≠ ∅}

is the set of prefixes of length n of M,

    pref M = ⋃_{n≥0} pref_n M

is the set of all prefixes of M, and

    infix M = {w | w ∈ Σ*, Σ*wΣ^∞ ∩ M ≠ ∅}

is the set of all infixes (or subwords) of M.

The set Σ^ω can be considered as a metric space (Σ^ω, ρ) with metric ρ defined by

    ρ(ξ, η) = inf{r^{-|w|} | w ∈ Σ*, ξ, η ∈ wΣ^ω}

where r = |Σ|. This metric space is homeomorphic with the Cantor discontinuum, hence compact. The open balls are the sets of the form wΣ^ω. Such a ball has radius r^{-|w|}. In Σ^ω every open ball is also closed.

Let M be an ∞-language over Σ and w ∈ Σ*. The ∞-language

    w^{[-1]}M = {ξ | ξ ∈ Σ^∞, wξ ∈ M}
is called a state of M. M is said to be finite-state if it has only finitely many states.

The structure function s_M : ℕ → ℕ is defined by

    s_M(n) = |M ∩ Σⁿ| + |pref_n (M ∩ Σ^ω)|.

Then

    H_M = lim sup_{n→∞} (1/n) log s_M(n)

if M is infinite, and H_M = 0 if M is finite, is called the entropy of M, and

    H̲_M = lim inf_{n→∞} (1/n) log s_M(n)

if M is infinite, and H̲_M = 0 if M is finite, is called the lower entropy of M. Here and in the sequel, logarithms are taken with respect to base r = |Σ| unless indicated otherwise. Note that, for M ⊆ Σ^ω, H̲_M and H_M are equal to the lower and upper Minkowski dimensions of M in the metric space (Σ^ω, ρ).

For ξ ∈ Σ^ω one has

    s_{infix ξ}(n + m) ≤ s_{infix ξ}(n) · s_{infix ξ}(m)

and thus

    H̲_{infix ξ} = H_{infix ξ}.

Therefore, we refer to τ(ξ) = H_{infix ξ} as the subword complexity of ξ. The corresponding notion for ω-languages is defined as
    τ(M) = sup_{ξ∈M} τ(ξ)

where M ⊆ Σ^ω.

We also consider the program size complexity of ω-words and ω-languages (see [Ca1], [Cha1], [Li1], [St5]). Consider a universal algorithm U within some given model of computation. For ξ ∈ Σ^ω let K(ξ | n) be the length of the shortest program for U which will print the prefix of length n of ξ. Then

    κ(ξ) = lim sup_{n→∞} K(ξ | n) / n

is called the relative program complexity of ξ and

    κ̲(ξ) = lim inf_{n→∞} K(ξ | n) / n

is called the lower relative program complexity of ξ. For an ω-language M one defines

    κ(M) = sup_{ξ∈M} κ(ξ)  and  κ̲(M) = sup_{ξ∈M} κ̲(ξ).

Obviously, κ̲ ≤ κ. Note that the actual choice of the universal algorithm U does not matter in this context².

² It also does not matter in our context whether this program complexity is measured in the sense of Kolmogorov (see [Li1]) or Chaitin (see [Ch1], [Ca1]).
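Returning to the structure function and the entropy: as a numeric illustration (our example, not from the paper), consider the closed finite-state ω-language M ⊆ {0, 1}^ω of all ω-words containing no factor 11. Since M ⊆ Σ^ω, the first summand of s_M(n) vanishes, and s_M(n) = |pref_n M| satisfies a Fibonacci recurrence; the quotients (1/n) log₂ s_M(n) approach log₂ of the golden ratio.

```python
# Numeric sketch (ours): the entropy of the 'golden-mean' omega-language M,
# the set of 0/1-omega-words with no factor '11'.  Here s_M(n) = |pref_n M|
# satisfies the recurrence a(n) = a(n-1) + a(n-2).
from math import log2

def s_M(n: int) -> int:
    """Structure function s_M(n) = |pref_n M| for the golden-mean language."""
    a, b = 1, 2                     # s_M(0) = 1 (the empty word), s_M(1) = 2
    for _ in range(n):
        a, b = b, a + b
    return a

# (1/n) log_2 s_M(n) approaches the entropy H_M = log_2((1 + sqrt(5)) / 2):
estimates = [log2(s_M(n)) / n for n in (10, 100, 1000)]
assert abs(estimates[-1] - log2((1 + 5 ** 0.5) / 2)) < 0.01
```

The limit log₂((1 + √5)/2) ≈ 0.694 is H_M; Theorem 2.5 recovers the same value from the adjacency matrix of the two states of M.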
Let 𝔛 = (X, ρ) be any metric space, where X is the underlying set and ρ is the distance function. For any subset M of X, 𝒞M denotes the closure of M and diam M denotes the diameter of M, where diam M = sup{ρ(x, x') | x, x' ∈ M}. For a point x and a real number ε > 0, K_ε(x) is the open ball of radius ε around x, that is, the set of all points x' such that ρ(x, x') < ε.

For a subset M of X and an ε > 0 let 𝒱 = {V_i | i ∈ I} be a countable family of open balls, where I is some countable index set, such that diam V_i ≤ ε for all i ∈ I and M ⊆ ⋃_{i∈I} V_i. Such a family 𝒱 is called an ε-cover of M. It is minimal if no proper subfamily of it is an ε-cover of M. Let 𝔅(M, ε) be the class of ε-covers of M. Now for any ε-cover 𝒱 of M and any α ∈ [0, ∞), let

    L_α(M, 𝒱) = Σ_{i∈I} (diam V_i)^α.

The α-dimensional outer measure of M is defined as

    L_α(M) = lim_{ε→0} inf{L_α(M, 𝒱) | 𝒱 ∈ 𝔅(M, ε)}.

Now consider L_α(M) as a function of α. Then there is an α(M) ∈ [0, ∞] such that

    L_α(M) = ∞, if α < α(M), and L_α(M) = 0, if α > α(M).

This number α(M) is called the Hausdorff dimension dim M of M, that is, the Hausdorff dimension of M is given by

    dim M = sup{α | L_α(M) = ∞} = inf{α | L_α(M) = 0}.

Note that dim is monotone and subadditive ([Bi1], [Fa1]), that is,

    M ⊆ M' ⊆ X  implies  dim M ≤ dim M'

and

    dim ⋃_{i∈I} M_i = sup_{i∈I} dim M_i
for any countable family of sets M_i ⊆ X.

In the special case of X = Σ^ω for some alphabet Σ with |Σ| = r ≥ 2, the open balls of radius not exceeding ε are the sets of the form vΣ^ω with r^{-|v|} ≤ ε. Thus, we need to consider only ε-covers with ε = r^{-n} for some n ∈ ℕ, and every such ε-cover 𝒱 is of the form 𝒱 = {vΣ^ω | v ∈ V} for some set V ⊆ Σ*. Hence

    L_α(M, 𝒱) = Σ_{v∈V} r^{-α|v|}

and

    L_α(M) = lim_{n→∞} inf{L_α(M, 𝒱) | 𝒱 = {vΣ^ω | v ∈ V}, V ⊆ Σ*, |v| ≥ n for all v ∈ V}.

It turns out that dim M ∈ [0, 1] for every M ⊆ Σ^ω.
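The cover sums L_α(M, 𝒱) = Σ_{v∈V} r^{-α|v|} can be evaluated directly for simple ω-languages. The following numeric sketch (ours; the golden-mean language of 0/1-ω-words with no factor 11 is our example) uses the covers V_n = pref_n M and exhibits the threshold at α = dim M:

```python
# Sketch (ours): cover sums L_alpha(M, V_n) for the golden-mean omega-language
# M, with the covers V_n = pref_n M, so that
#     L_alpha(M, V_n) = |pref_n M| * 2^(-alpha * n).
# Below dim M the sums diverge, above it they tend to 0 -- the threshold
# behaviour that defines the Hausdorff dimension.
from math import log2

def pref_count(n: int) -> int:
    a, b = 1, 2                     # Fibonacci-type recurrence for |pref_n M|
    for _ in range(n):
        a, b = b, a + b
    return a

def cover_sum(alpha: float, n: int) -> float:
    return pref_count(n) * 2.0 ** (-alpha * n)

dim_M = log2((1 + 5 ** 0.5) / 2)    # equals H_M by Theorem 2.1, about 0.6942
assert cover_sum(dim_M - 0.1, 300) > 1e6     # alpha < dim M: sums blow up
assert cover_sum(dim_M + 0.1, 300) < 1e-6    # alpha > dim M: sums vanish
```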
We review some known properties of H, H̲, τ, κ, κ̲, and dim on Σ^ω, in particular properties concerning their mutual relation and their computation. For a more detailed account of these relationships see [St5]. It is well-known that dim M ≤ H̲_M ≤ H_M for every M ⊆ Σ^ω. For finite-state ω-languages this can be strengthened as follows.

Theorem 2.1. [St4] If M is finite-state and closed then dim M = H_M.

The program complexity of ω-languages is also related to the entropy and the Hausdorff dimension.

Theorem 2.2. [Ry1], [St5] For M ⊆ Σ^ω one has

    dim M ≤ κ̲(M) ≤ κ(M).

Moreover, if pref M or Σ* \ pref M is recursively enumerable then

    κ̲(M) ≤ H̲_M  and  κ(M) ≤ H_M.

If M ⊆ Σ^ω is finite-state then, in particular, pref M is recursively enumerable. This observation yields the first part of the following corollary.

Corollary 2.3. [St5] Let M ⊆ Σ^ω be finite-state and closed. Then

    dim M = κ̲(M) = κ(M) = H_M.

Moreover, there is an ω-word ξ ∈ M such that κ̲(ξ) = κ(ξ) = H_M.

There is also a relation between the subword complexity of an ω-word ξ and the entropy of finite-state ω-languages containing ξ.

Theorem 2.4. [Cr1], [St5] For every ξ ∈ Σ^ω one has

    τ(ξ) = inf{H_M | M ⊆ Σ^ω finite-state, ξ ∈ M}.

In particular, if M is finite-state and closed, then τ(M) = H_M and there is a ξ ∈ M such that τ(ξ) = H_M.

By Corollary 2.3 and Theorem 2.2 this implies

    κ̲(ξ) ≤ κ(ξ) ≤ τ(ξ)  and  dim M ≤ κ̲(M) ≤ κ(M) ≤ τ(M).

In order to calculate the entropy of a finite-state ω-language M one considers the adjacency matrix 𝔄_M of M. The rows and columns of 𝔄_M are indexed by the non-empty states of M. If M₁ and M₂ are non-empty states of M then the (M₁, M₂)-entry of 𝔄_M is equal to |{x | x ∈ Σ, M₂ = x^{[-1]}M₁}|.

Theorem 2.5. [Ch1], [Ku1], [Lin1], [St1] If M is a finite-state closed ω-language then H_M = log α where α is the maximum eigenvalue of the adjacency matrix 𝔄_M of M.

For additional background information we refer to [Bi1], [Bi2], and [Fa1] for probability, information, and fractal geometry, to [Lin1], [St2], [Th1], and [Tr1] for ω-languages, to [Ca1], [Cha1], and [Li1] for program complexity, and to [Wo1] for languages and automata.
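Theorem 2.5 reduces the entropy computation to a largest-eigenvalue problem. A minimal sketch (ours; the two-state golden-mean language is our example, and power iteration is merely one of several ways to obtain the eigenvalue):

```python
# Sketch (ours) of Theorem 2.5: H_M = log alpha, with alpha the maximum
# eigenvalue of the adjacency matrix of M.  Example (ours): the golden-mean
# language over {0,1} (no factor '11') has two non-empty states -- 'last
# symbol was 0' and 'last symbol was 1' -- and adjacency matrix [[1,1],[1,0]].
from math import log2

def max_eigenvalue(A, iterations=200):
    """Power iteration; sufficient for a non-negative irreducible matrix."""
    n = len(A)
    v, lam = [1.0] * n, 1.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam

A = [[1, 1],    # from 'last 0': reading 0 stays, reading 1 moves to 'last 1'
     [1, 0]]    # from 'last 1': only 0 may follow
H_M = log2(max_eigenvalue(A))            # logarithm to base r = |Sigma| = 2
assert abs(H_M - log2((1 + 5 ** 0.5) / 2)) < 1e-9
```

The maximum eigenvalue is the golden ratio, so H_M = log₂((1 + √5)/2), matching the structure-function estimate above.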
3. Local size-measures

In this section we introduce the notion of a local size-measure or local dimension on a compact metric space. We derive fundamental properties of such a local dimension which corroborate the intuition. Moreover, for the case of finite-state closed ω-languages we show that, as with global size-measures, the natural local size-measures coincide.

Let 𝔛 = (X, ρ) be any compact metric space and let λ be a real-valued function on the subsets of X which satisfies the following conditions:

(1) λ(M) ≥ 0 for M ≠ ∅;
(2) λ(X) < ∞;
(3) λ is monotone.

In the sequel such functions λ are referred to as size-measures. As typical candidates for λ we consider H, H̲, τ, κ, κ̲, and dim. From λ we derive the corresponding local size-measure - or dimension - as follows: For M ⊆ X, x ∈ X, and ε > 0, let

    l^(λ)_M(x, ε) = λ(K_ε(x) ∩ M).

Property 3.1. If M ⊆ M' and K_ε(x) ⊆ K_ε'(x') then l^(λ)_M(x, ε) ≤ l^(λ)_M'(x', ε').

Because of this property the following definition makes sense.

Definition 3.2. For M ⊆ X and x ∈ X,

    l^(λ)_M(x) = lim_{ε→0} l^(λ)_M(x, ε)

is the local (M, λ)-dimension (or local (M, λ)-size-measure) at x.

Note that the localization operator l^(λ) is monotone with respect to λ, that is, if λ ≤ λ' then l^(λ) ≤ l^(λ'). For X = Σ^ω with Σ an alphabet, this implies for instance that

    l^(dim) ≤ l^(H̲) ≤ l^(H)  and  l^(dim) ≤ l^(κ̲) ≤ l^(κ) ≤ l^(τ).

For any λ one has the equality

    l^(λ)_M(x) = λ(∅)  for all x ∉ 𝒞M.

Note also that Property 3.1 implies that l^(λ)_M(x) = inf_{ε>0} l^(λ)_M(x, ε).

We now analyse the behaviour of the local (M, λ)-dimension on converging sequences in X.

Lemma 3.3. For i ∈ ℕ let x_i ∈ X be such that the limits lim_{i→∞} x_i and lim_{i→∞} l^(λ)_M(x_i) exist. Then

    lim_{i→∞} l^(λ)_M(x_i) ≤ l^(λ)_M(lim_{i→∞} x_i).
Proof. Let a = lim_{i→∞} l^(λ)_M(x_i) and x = lim_{i→∞} x_i. Consider δ > 0. Then l^(λ)_M(x_i) > a − δ for almost all i, hence

    l^(λ)_M(x_i, ε) > a − δ

for all ε > 0 because of Property 3.1. For fixed δ > 0 and ε > 0 let i₀ be large enough so that l^(λ)_M(x_i) > a − δ and ρ(x, x_i) < ε/2 for all i ≥ i₀. Then K_{ε/2}(x_i) ⊆ K_ε(x) and thus l^(λ)_M(x, ε) ≥ a − δ. Therefore, l^(λ)_M(x) ≥ a. □

Corollary 3.4. Let x₁, x₂, ... be a converging sequence in X. Then

    lim sup_{i→∞} l^(λ)_M(x_i) ≤ l^(λ)_M(lim_{i→∞} x_i).
The property of l^(λ)_M established by Corollary 3.4 turns out to be characteristic for real-valued mappings of 𝔛 whose inverse images of sets of the form [a, ∞) are closed.

Theorem 3.5. Let 𝔛 = (X, ρ) be a metric space and let l : X → ℝ. The following statements are equivalent:

(1) If x₁, x₂, ... is a converging sequence in 𝔛 then lim sup_{i→∞} l(x_i) ≤ l(lim_{i→∞} x_i).
(2) The set l⁻¹([a, ∞)) is closed for every a ∈ ℝ.

The proof uses standard techniques and is left to the reader. The class of real-valued mappings characterized by Theorem 3.5 has the following property.

Corollary 3.6. Let 𝔛 = (X, ρ) be a metric space and let l : X → ℝ be a mapping which satisfies lim sup_{i→∞} l(x_i) ≤ l(lim_{i→∞} x_i) for every converging sequence in 𝔛. Then every compact subset F of 𝔛 contains a point y such that l(x) ≤ l(y) for all x ∈ F.

Proof. Let a = sup_{x∈F} l(x) and consider a sequence x₁, x₂, ... ∈ F such that l(x_i) ≥ a − 1/i. Since F is compact, this sequence contains a subsequence converging to some y ∈ F. Hence, l(y) ≤ a and, by the assumption, l(y) = a. □

Corollary 3.7. If l^(λ)_M is a local (M, λ)-dimension, then, for all a ∈ ℝ, the set [l^(λ)_M]⁻¹([a, ∞)) is closed.

Proof. This is an immediate consequence of Theorem 3.5 and Corollary 3.4. □
We illustrate the intuitive meaning of Corollary 3.7 using our example of image analysis. Assume that λ is the Hausdorff dimension. Then l^(λ)_M is the local Hausdorff dimension, which can be thought of as measuring the complexity of texture locally. Consider a threshold a ∈ ℝ. Then the set [l^(λ)_M]⁻¹([a, ∞)) is the set of those image points at which the texture complexity is no less than the threshold a. This set is closed. Hence, if an object in the image is characterizable by its surface texture in contrast to the rest of the image, one would expect to be able to find its boundary and the object itself by manipulating such closed sets.
Corollary 3.8. If l^(λ)_M is a local (M, λ)-dimension, then every compact set contains a point at which l^(λ)_M is maximal in the set.

Proof. This is an immediate consequence of Corollary 3.6 and Corollary 3.4. □
By definition, the local dimension never exceeds the global dimension. We now show that under certain conditions a set will contain a point at which the local (M, λ)-dimension has the same value as the global dimension λ(M). Intuitively, as the global dimension is a worst-case size-measure, these points determine the value of the global dimension.

Theorem 3.9. Let M ⊆ X be non-empty and such that 𝒞M is compact and

    λ(P ∪ Q) = max{λ(P), λ(Q)}

for any P, Q ⊆ 𝒞M. Then there is a point y ∈ 𝒞M with l^(λ)_M(y) = λ(M).

Proof. Since λ(M) ≥ l^(λ)_M(y) ≥ λ(∅), the assertion is true for every y ∈ 𝒞M if λ(M) = λ(∅). Hence, assume that λ(M) > λ(∅). We define a sequence M₁, M₂, ... of non-empty closed subsets of 𝒞M such that λ(M_i ∩ M) = λ(M), M_{i+1} ⊆ M_i, and lim_{i→∞} diam M_i = 0.

Let M₀ = 𝒞M. Assume that M_i is given. As 𝒞M is compact, also 𝒞(M_i ∩ M) is compact. Hence, 𝒞(M_i ∩ M) has a finite cover by balls of radius ε_i = 2^{-i}. From the assumption that λ(P ∪ Q) = max{λ(P), λ(Q)} for any P, Q ⊆ 𝒞M it follows that this cover contains a ball K_{ε_i}(x_i) with

    λ(M_i ∩ M ∩ K_{ε_i}(x_i)) = λ(M_i ∩ M).

By assumption, this is equal to λ(M). Now define

    M_{i+1} = 𝒞(𝒞(M_i ∩ M) ∩ K_{ε_i}(x_i)).

Then diam M_{i+1} ≤ 2ε_i and M_i ∩ M ∩ K_{ε_i}(x_i) ⊆ M_{i+1} ∩ M ⊆ M_i ∩ M. Hence λ(M_{i+1} ∩ M) = λ(M) > λ(∅) and thus M_{i+1} ≠ ∅.

The compactness of 𝒞M implies that ⋂ M_i ≠ ∅. From the fact that diam M_i → 0 as i → ∞ it follows that ⋂ M_i = {y} for some point y ∈ 𝒞M. Every open ball containing y contains infinitely many of the sets M_i. Hence l^(λ)_M(y) ≥ λ(M). □
For subsets F ⊆ M we can derive the following upper bound on λ(F) using the local (M, λ)-dimension l^(λ)_M.

Corollary 3.10. Let M ⊆ X be such that 𝒞M is compact and

    λ(P ∪ Q) = max{λ(P), λ(Q)}

for any P, Q ⊆ 𝒞M. For any F ⊆ M one has

    λ(F) ≤ max_{x∈𝒞F} l^(λ)_M(x).

Proof. Theorem 3.9 and Corollary 3.6 imply that

    λ(F) ≤ max_{x∈𝒞F} l^(λ)_F(x) ≤ max_{x∈𝒞F} l^(λ)_M(x)

because of F ⊆ M. □
As a special case of Corollary 3.10 one obtains that λ({x}) ≤ l^(λ)_M(x) for all x ∈ M.

The assumption in Theorem 3.9 and Corollary 3.10 is in general not true, as shown for the lower entropy in the following example.

Example 3.11. Consider the two recursive functions f, g : ℕ → ℕ with

    f(n) = 0, if n = 0, and f(n) = (2n−1)·(2n−1)!, if n > 0,

and g(n) = (2n)·(2n)!. Since Σ_{i=0}^{n} i·i! = (n+1)! − 1, one has

    Σ_{i=1}^{n} f(i) / (1 + Σ_{i=1}^{n} (f(i) + g(i))) ≤ ((2n)! − 1) / (2n+1)! ≤ 1/(2n+1)

and

    Σ_{i=1}^{n} g(i) / (1 + f(n+1) + Σ_{i=1}^{n} (f(i) + g(i))) ≤ (2n+1)! / (2n+2)! = 1/(2n+2).

Let X = Σ^ω where Σ is the alphabet {0, 1} and consider the ω-languages F, E ⊆ X given by

    F = 1 · Π_{i=1}^{∞} Σ^{f(i)} · 0^{g(i)}  and  E = 0 · Π_{i=1}^{∞} 0^{f(i)} · Σ^{g(i)}.

One computes that

    log s_F(1 + Σ_{i=1}^{n} (f(i) + g(i))) = Σ_{i=1}^{n} f(i)

and

    log s_E(1 + f(n+1) + Σ_{i=1}^{n} (f(i) + g(i))) = Σ_{i=1}^{n} g(i),

hence H̲_F = H̲_E = 0. On the other hand, because of the special structure of E and F, for any n ≥ 1 there is an m ≤ n such that

    s_{F∪E}(n) = s_F(n) + s_E(n) ≥ 2^{m−1} + 2^{n−m}

holds true. Therefore, s_{F∪E}(n) ≥ 2^{n/2−1},
hence

    H̲_{F∪E} ≥ 1/2,

that is, H̲_{F∪E} ≠ max{H̲_F, H̲_E}.

Now consider the set M = E ∪ F. Obviously, M is closed. For every ball wΣ^ω with |w| ≥ 1 one has

    M ∩ wΣ^ω = F ∩ wΣ^ω  or  M ∩ wΣ^ω = E ∩ wΣ^ω.

Consequently,

    sup_{x∈𝒞M} l^(H̲)_M(x) = max{sup_{x∈E} l^(H̲)_E(x), sup_{x∈F} l^(H̲)_F(x)} = max{H̲_E, H̲_F} = 0 < 1/2 ≤ H̲_M.

This shows that the assumption that λ(P ∪ Q) = max{λ(P), λ(Q)} in Theorem 3.9 and Corollary 3.10 is essential. Moreover, in Theorem 3.9 it is impossible in general to prove y ∈ M instead of y ∈ 𝒞M, as shown in the next example.

Example 3.12. Let X = Σ^ω where Σ is the alphabet {0, 1} and consider the ω-languages

    M_n = (Σⁿ \ {0ⁿ})^ω  and  M = ⋃_{n>0} 0^{n−1} · 1 · M_n.

Note that each M_n is finite-state and closed. One has

    H_{M_n} = dim M_n = (log(2ⁿ − 1)) / n < 1

and hence H_M = dim M = 1. One verifies that l^(λ)_M(0^ω) = 1, 𝒞M \ M = {0^ω}, and l^(λ)_M(x) < 1 for x ∈ M, if λ ∈ {dim, H}.

The assumption of 𝒞M being compact cannot be omitted in Theorem 3.9. This is shown in the following example.

Example 3.13. Consider M = X = [0, ∞) with the usual metric, and let
    λ(F) = 0, if F ⊆ [0, 1), and λ(F) = sup{1 − 1/x | x ∈ F}, otherwise,

for F ⊆ X. Then λ(X) = 1 and

    λ(⋃_{i∈ℕ} P_i) = sup_{i∈ℕ} λ(P_i)

for any P_i ⊆ X. On the other hand, for every x ∈ X one has

    l^(λ)_M(x) = 0, if x ∈ [0, 1), and l^(λ)_M(x) = 1 − 1/x, otherwise,

hence

    l^(λ)_M(x) ≠ λ(M) for every x ∈ X.
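A short sketch (ours) of Example 3.13, evaluating λ on the shrinking balls K_ε(x) = (x − ε, x + ε), makes the failure visible numerically: the local values stay strictly below λ(X) = 1 at every point, so no point attains the global value.

```python
# Sketch (ours) of Example 3.13 on X = [0, infinity): lambda evaluated on
# shrinking balls K_eps(x) = (x - eps, x + eps).  The local value tends to
# 1 - 1/x for x >= 1 and to 0 on [0, 1), so lambda(X) = 1 is never attained
# locally -- compactness of the closure really is needed in Theorem 3.9.

def lam(lo: float, hi: float) -> float:
    """lambda of the open interval (lo, hi): sup of 1 - 1/x, or 0 inside [0,1)."""
    if hi <= 1.0:
        return 0.0
    return 1.0 - 1.0 / hi           # the sup is approached at the right endpoint

def local_dim(x: float, eps: float = 1e-9) -> float:
    return lam(x - eps, x + eps)

assert abs(local_dim(2.0) - 0.5) < 1e-6       # l(2) = 1 - 1/2
assert local_dim(0.5) == 0.0                  # inside [0, 1)
assert all(local_dim(x) < 1.0 for x in (10.0, 1e6, 1e12))
```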
The preceding example can be strengthened to yield a local (M, λ)-dimension satisfying l^(λ)_{{x}}(x) = 0 for all x, as follows.

Example 3.14. Let M = X = [0, ∞) with the usual metric, and let

    λ(F) = 0, if F ⊆ [0, 1), and λ(F) = sup{1 − 1/x | x ∈ F^•}, otherwise,

for F ⊆ X, where F^• denotes the set of condensation points of F. As usual, a point x is said to be a condensation point of F if every open set S containing x satisfies |S ∩ F| > ℵ₀. Then λ(X) = 1 and, since x ∈ (⋃_{i∈ℕ} P_i)^• implies x ∈ P_i^• for some i ∈ ℕ, also the identity

    λ(⋃_{i∈ℕ} P_i) = sup_{i∈ℕ} λ(P_i)

holds for any P_i ⊆ X. On the other hand, for all x ∈ X one has

    l^(λ)_M(x) = 0, if x ∈ [0, 1), and l^(λ)_M(x) = 1 − 1/x, otherwise,

hence l^(λ)_M(x) ≠ λ(M). Moreover, {x}^• = ∅ and, therefore,
l^(λ)_{{x}}(x) = 0 for every x ∈ X.

Theorem 3.15. Assume that the following conditions are satisfied:

(a) 𝒞M is compact and has no isolated points;
(b) λ(⋃_{i∈I} M_i) = sup_{i∈I} λ(M_i) for every countable family {M_i | i ∈ I} of subsets M_i of 𝒞M;
(c) λ({x}) = λ({x'}) for all x, x' ∈ X.

Then for every x ∈ 𝒞M there is a sequence x₁, x₂, ... ∈ 𝒞M \ {x} with lim_{i→∞} x_i = x such that

    lim_{i→∞} l^(λ)_M(x_i) = l^(λ)_M(x).
Proof. The proof is by contradiction. Let a = l^(λ)_M(x) and assume that there are δ, ε > 0 such that |a − l^(λ)_M(x')| > δ whenever 0 < ρ(x, x') < ε. It follows from Corollary 3.4 that l^(λ)_M(x') ≤ a − δ for all x' ∈ K_ε(x) \ {x}. Since x is no isolated point of 𝒞M, there is an x' ∈ M ∩ (K_ε(x) \ {x}). Condition (c) and Corollary 3.10 imply that

    a > a − δ ≥ l^(λ)_M(x') ≥ λ({x'}) = λ({x}) ≥ λ(∅),

using the monotonicity of λ. Now we cover the open set K_{ε/2}(x) \ {x} by a countable family {K_i | i ∈ I} of balls in such a way that

    ⋃_{i∈I} K_i ⊆ K_ε(x)  and  x ∉ 𝒞K_i

for all i. By Corollary 3.10 one obtains λ(K_i ∩ M) ≤ a − δ for all i, hence, using assumption (b),

    λ(⋃_{i∈I} (K_i ∩ M)) ≤ a − δ.

Now

    K_{ε/2}(x) ∩ M ⊆ {x} ∪ ⋃_{i∈I} (K_i ∩ M)

implies that λ(K_{ε/2}(x) ∩ M) ≤ a − δ, contradicting l^(λ)_M(x) = a. Consequently, there is an infinite sequence x₁, x₂, ... ∈ X such that

    lim_{i→∞} x_i = x  and  lim_{i→∞} l^(λ)_M(x_i) = l^(λ)_M(x) = a.
Since x_i ∉ 𝒞M implies l^(λ)_M(x_i) = λ(∅) and since a > λ(∅), it follows that at most finitely many points x_i are not in 𝒞M. □

The following examples show that the assumptions (b) and (c) of Theorem 3.15 are actually necessary.

Example 3.16. Let X = Σ^ω where Σ is the alphabet {0, 1} and let λ be one of κ or κ̲. Let ξ ∈ Σ^ω be such that κ(ξ) = κ̲(ξ) = 1, and let M = Σ* · 0^ω ∪ {ξ}. The above condition (c) is not satisfied. One verifies that

    l^(λ)_M(β) = 0, if β ≠ ξ, and l^(λ)_M(β) = 1, if β = ξ,

for all β ∈ Σ^ω.

Example 3.17. Again let X = Σ^ω with Σ = {0, 1}. Let

    E = Π_{i=1}^{∞} (0^i · Σ),  M_n = 0^n · 1 · Σ^n · E,  and  M = ⋃_{n≥0} M_n.

M has no isolated points and 𝒞M = M ∪ {0^ω} with 0^ω ∉ M. With λ = H̲ one computes that H̲_M ≥ 1/2 and, since H̲_{M_n} = H̲_E = 0 (see [St1]),

    l^(H̲)_M(β) = 0, if β ≠ 0^ω, and l^(H̲)_M(β) = H̲_M ≥ 1/2, if β = 0^ω,

for β ∈ Σ^ω. In this case, the above condition (b) does not obtain, as H̲_{M_n} = H̲_E = 0 for all n.
We now discuss the relation between the localizations of κ, κ̲, τ, H, H̲, and dim in Σ^ω where Σ is an alphabet.

Theorem 3.18. Let ℳ be a family of ω-languages over Σ which is closed under intersections with sets which are finite-state and closed, and let λ and λ' be such that λ(M) = λ'(M) for all M ∈ ℳ. Then

    l^(λ)_M(ξ) = l^(λ')_M(ξ)

for every M ∈ ℳ and every ξ ∈ Σ^ω.

Proof. For ξ ∈ Σ^ω and n ∈ ℕ, the set K_{r^{-n}}(ξ) is finite-state and closed. Hence, for M ∈ ℳ, also the intersection K_{r^{-n}}(ξ) ∩ M is in ℳ. This implies that λ and λ' agree on these sets. Hence their limits for n → ∞ are the same. □

The family of finite-state closed ω-languages is closed under intersection. Hence, by the results reviewed in Sect. 2, the localized versions of the Hausdorff dimension, the entropy, the program complexity, and the subword complexity coincide on such ω-languages. This proves the following corollary.

Corollary 3.19. Let M ⊆ Σ^ω be finite-state and closed. Then

    l^(dim)_M(ξ) = l^(H̲)_M(ξ) = l^(H)_M(ξ) = l^(τ)_M(ξ) = l^(κ)_M(ξ) = l^(κ̲)_M(ξ)

for all ξ ∈ Σ^ω.

Aside from its intrinsic interest - establishing the relation between various natural local size-measures for finite-state closed ω-languages - Corollary 3.19 implies that, if any one of them can be computed, then all of them can. In the next section we show how to compute the local entropy of a finite-state closed ω-language. In general, that is, for an arbitrary ω-language M, one only has

    l^(dim)_M(ξ) ≤ l^(H̲)_M(ξ) ≤ l^(H)_M(ξ)

and

    l^(dim)_M(ξ) ≤ l^(κ̲)_M(ξ) ≤ l^(κ)_M(ξ) ≤ l^(τ)_M(ξ)

as noted before.
4. Computing the local entropy

In this section we provide an algorithm for computing the local entropy of an ω-word ξ ∈ Σ^ω with respect to a finite-state closed ω-language M. It is based on a generalization of ideas used in the proof of Theorem 2.5. As a consequence of Corollary 3.19 this algorithm also computes the localized versions of the size-measures introduced in Sect. 2 including, in particular, the local Hausdorff dimension.
Theorem 4.1. Let M be a finite-state closed ω-language and let ξ be an ω-word. If (pref_i ξ)^{[-1]}M is empty for some i ∈ ℕ then l^(H)_M(ξ) = 0. Otherwise, there is a state M_ξ occurring infinitely often in the sequence

    ((pref_i ξ)^{[-1]}M)_{i∈ℕ}

and l^(H)_M(ξ) = H_{M_ξ}.

Proof. One verifies that

    H_{(pref_i ξ)Σ^ω ∩ M} = H_{(pref_i ξ)^{[-1]}M} ≥ H_{(pref_{i+1} ξ)^{[-1]}M},

hence

    l^(H)_M(ξ) = lim_{i→∞} H_{(pref_i ξ)^{[-1]}M}.

Now one of two cases may apply: If (pref_i ξ)^{[-1]}M is empty for some index i then (pref_j ξ)^{[-1]}M = ∅ for all j ≥ i, hence l^(H)_M(ξ) = 0. On the other hand, if (pref_i ξ)^{[-1]}M is always non-empty, then, as M has only finitely many states, there is at least one state, M_ξ say, which occurs infinitely often among the states (pref_i ξ)^{[-1]}M. Consequently l^(H)_M(ξ) = H_{M_ξ}. Note that this statement is independent of the choice of M_ξ. □
Thus, the computation of the local entropy at ξ and with respect to the finite-state closed set M can be performed by the following two steps:

(1) Find a state M_ξ which occurs infinitely often among the states ((pref_i ξ)^{[-1]}M)_{i∈ℕ}.
(2) Compute H_{M_ξ}.

The feasibility and complexity of the first step depend on the structure of ξ. If ξ is known to be of the form uv^ω for some u, v ∈ Σ*, then it can be performed in finite time.

We now turn to discussing the second step. Consider the following directed multigraph Φ_M with edges labelled by elements of Σ: The set of vertices of Φ_M is the set S_M of non-empty states of M. If Q and Q' are vertices, there is an edge (Q, Q', x) from Q to Q' with label x if and only if Q' = x^{[-1]}Q. Let E_M be the set of edges of Φ_M. We write Q ⊢ Q' if there is an edge from Q to Q'. Let ⊢* be the reflexive and transitive closure of ⊢. A non-empty subset K of S_M is said to be a strongly connected component if, for any two states Q, Q' ∈ K, one has Q ⊢* Q' ⊢* Q. We extend the relation ⊢ to strongly connected components as follows: If K, K' are strongly connected components then K ⊢ K' if and only if Q ⊢ Q' for some Q ∈ K and Q' ∈ K'. Again, ⊢* is defined as the reflexive and transitive closure of ⊢.

We now number the non-empty states of M according to the following rules:

(a) M₁ = M.
(b) The states in a strongly connected component have consecutive numbers.
(c) If states Q and Q' are in different strongly connected components and satisfy Q ⊢* Q' then the number of Q is strictly less than that of Q'.

Note that this numbering of states induces a numbering of the strongly connected components. Using these rules, the adjacency matrix 𝔄_M of M will be in upper block triangular form, that is,
    𝔄_M = ( 𝔄₁  *   ...  *   )
          ( 0   𝔄₂  ...  *   )
          ( ... ... ... ...  )
          ( 0   0   ...  𝔄_k )

where k is the number of strongly connected components and 𝔄_i corresponds to the transitions within the i-th strongly connected component. For any strongly connected component K of states, let 𝔄_K be the submatrix of 𝔄_M corresponding to the states in K. Note that 𝔄_K is non-negative and irreducible. The value of H_{M_ξ} is now obtainable using the following theorem, which is a consequence of Theorem 2.5.

Theorem 4.2. Suppose that M is finite-state and closed, that K is a strongly connected component of Φ_M, and that Q ∈ K. Then

    H_Q = log max{α_{K'} | K ⊢* K'}

where α_{K'} is the maximal eigenvalue of 𝔄_{K'}.

This result reduces the computation of the entropy of the states of M to the computation of the eigenvalues of certain submatrices of the adjacency matrix 𝔄_M of M.

Corollary 4.3. Suppose that M is non-empty, finite-state, and closed. If n is the number of strongly connected components of Φ_M then the local entropy with respect to M has at most n different values.
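To make the two steps concrete, here is a sketch in Python (ours; the four-state example language and all identifiers are our assumptions, not the paper's). It follows the state sequence of an ultimately periodic ω-word ξ = uv^ω until a state recurs (Theorem 4.1), then applies Theorem 4.2, with power iteration standing in for an exact eigenvalue computation:

```python
# Sketch (ours) of the two-step algorithm for the hypothetical finite-state
# closed omega-language M = 0 Sigma^omega  U  1 G over Sigma = {0,1}, where
# G is the golden-mean shift (no factor '11').
from math import log2
from itertools import count

# transitions[Q][x] = x^[-1] Q; missing entries denote the empty state.
transitions = {
    "I": {"0": "F", "1": "B"},   # I = M itself
    "F": {"0": "F", "1": "F"},   # F = Sigma^omega (full shift)
    "B": {"0": "C"},             # B = golden-mean state, last symbol was 1
    "C": {"0": "C", "1": "B"},   # C = golden-mean state, last symbol was 0
}

def reachable(q):
    seen, stack = {q}, [q]
    while stack:
        for v in transitions[stack.pop()].values():
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

reach = {q: reachable(q) for q in transitions}
# strongly connected component of q = states mutually reachable with q
scc_of = {q: frozenset(p for p in transitions if p in reach[q] and q in reach[p])
          for q in transitions}

def max_eigenvalue(A, iterations=500):
    """Power iteration for the maximal eigenvalue of a non-negative matrix."""
    n = len(A)
    v, lam = [1.0] * n, 0.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)
        if lam == 0.0:
            return 0.0
        v = [x / lam for x in w]
    return lam

def entropy(q):
    """H_Q = log_2 max{alpha_K' | K |-* K'} (Theorem 4.2), base r = 2."""
    best = 0.0
    for p in reach[q]:                       # every SCC reachable from q
        states = sorted(scc_of[p])
        idx = {s: i for i, s in enumerate(states)}
        A = [[0] * len(states) for _ in states]
        for s in states:
            for t in transitions[s].values():
                if t in idx:
                    A[idx[s]][idx[t]] += 1
        best = max(best, max_eigenvalue(A))
    return log2(best) if best > 0.0 else 0.0

def local_entropy(u, v):
    """Local entropy l^(H)_M(xi) at xi = u v^omega (Theorem 4.1)."""
    q = "I"
    for x in u:
        q = transitions.get(q, {}).get(x)
        if q is None:
            return 0.0                       # some (pref_i xi)^[-1] M is empty
    seen = set()
    for _ in count():
        if q in seen:
            return entropy(q)                # this state recurs infinitely often
        seen.add(q)
        for x in v:
            q = transitions.get(q, {}).get(x)
            if q is None:
                return 0.0

assert abs(local_entropy("0", "1") - 1.0) < 1e-9
assert abs(local_entropy("1", "0") - log2((1 + 5 ** 0.5) / 2)) < 1e-9
```

For ξ = 0·1^ω the recurring state is the full shift (local entropy 1), while for ξ = 1·0^ω it lies in the golden-mean component (local entropy log₂((1+√5)/2) ≈ 0.694); as Corollary 4.3 predicts, the local entropy takes at most as many values as there are strongly connected components.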
5. Concluding remarks

We believe that the Hausdorff dimension, as well as other similar size-measures, is too global to yield meaningful statements about complicated topological objects. We have introduced a localization operator which, given a global size-measure on a metric space, produces the corresponding local size-measure, that is, a function from the space into the real numbers which, essentially, assigns to each point of the space the value of the size-measure in its neighbourhood. We have established some fundamental properties of local size-measures on metric spaces, fortunately in agreement with intuitive expectations, and, for the theoretically important special case of finite-state closed ω-languages, we have shown that the natural local size-measures coincide, as is known to hold for their global counterparts.

In [St3] an algorithm for calculating an approximation to the Hausdorff dimension of a black-and-white image given by a quadtree is proposed. In essence, the quadtree is interpreted as an ω-language over an alphabet with four elements corresponding to the four directions, and its Hausdorff dimension is computed when this language is finite-state and closed. One limitation of this idea, namely that the Hausdorff dimension only describes the worst-case situation in an image, provided the motivation for introducing local size-measures. The local Hausdorff dimension would allow one to distinguish textures and, thus, to determine object boundaries. Moreover, using the local Hausdorff dimension (or another local size-measure), one could draw a dimension map for a given image, that is, a gray-scale image in which every point is drawn with a gray-level corresponding to its local dimension. We are currently exploring this application of our theoretical work by experiments.
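The dimension map mentioned in the concluding remarks can be sketched as follows. This is a hypothetical illustration, not part of the paper: it assumes a grid of local-dimension values (computed elsewhere) and maps each value linearly to a gray level in 0..255; the choice of d_max = 2 for planar images is an assumption of this sketch.

```python
# Hypothetical sketch (not from the paper): render a dimension map as a
# gray-scale image.  dims is an assumed 2-D list of local-dimension values
# in [0, d_max]; each value is mapped linearly to a gray level 0..255.

def dimension_map(dims, d_max=2.0):
    """Return a 2-D list of gray levels, one per local-dimension value."""
    return [[round(255 * d / d_max) for d in row] for row in dims]
```

A point of maximal local dimension d_max is drawn white (255), a point of local dimension 0 black (0), so regions of different texture complexity appear as different shades of gray.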
References

[Ba1] M. Barnsley: Fractals Everywhere. Academic Press, Boston, 1988.
[Bi1] P. Billingsley: Ergodic Theory and Information. Wiley & Sons, New York, 1965.
[Bi2] P. Billingsley: Probability and Measure. 2nd edition. Wiley, New York, 1986.
[Ca1] C. Calude: Information and Randomness. An Algorithmic Perspective. Springer-Verlag, Berlin, 1994, to appear.
[Cha1] G. J. Chaitin: Information, Randomness & Incompleteness. Papers on Algorithmic Information Theory. World Scientific, Singapore, 1987.
[Ch1] N. Chomsky, G. A. Miller: Finite-State Languages. Information and Control 1 (1958), 91-112.
[Cr1] E. Creutzburg, L. Staiger: Zur Erkennung regelmäßiger Gesetzmäßigkeiten. In: Strukturerkennung diskreter kybernetischer Systeme. Seminarbericht 82, Sektion Mathematik, Humboldt-Universität, Berlin, 1986, Part II, 342-383.
[Cu1] K. Culik II, J. Kari: Image Compression Using Weighted Finite Automata. Computers & Graphics 17 (1993), 305-313.
[Fa1] K. J. Falconer: Fractal Geometry. John Wiley & Sons, Chichester, 1990.
[Ga1] L. Garnett: A Computer Algorithm for Determining the Hausdorff Dimension of Certain Fractals. Math. of Computation 51 (1988), 291-300.
[Ku1] W. Kuich: Entropy of Transformed Finite-State Automata and Associated Languages. In: Graph Theory and Computing. Academic Press, New York, 1972, 77-86.
[Li1] M. Li, P. M. B. Vitányi: An Introduction to Kolmogorov Complexity and Its Applications. Springer-Verlag, New York, 1993.
[Lin1] R. Lindner, L. Staiger: Algebraische Codierungstheorie - Theorie der sequentiellen Codierungen. Akademie-Verlag, Berlin, 1977.
[Ry1] B. Ya. Ryabko: Noiseless Coding of Combinatorial Sources, Hausdorff Dimension, and Kolmogorov Complexity. Problemy Peredachi Informatsii 22 (1986), 3, 16-26 (in Russian; English translation: Problems of Information Transmission 22 (1986), 3, 170-179).
[St1] L. Staiger: The Entropy of Finite-State ω-Languages. Problems of Control and Information Theory 14 (1985), 383-392.
[St2] L. Staiger: Research in the Theory of ω-Languages. J. Inf. Process. Cybern. EIK 23 (1987), 415-439.
[St3] L. Staiger: Quadtrees and the Hausdorff Dimension of Pictures. In: Geobild '89, Proceedings of the 4th Workshop on Geometrical Problems of Image Processing held in Georgenthal, March 13-17, 1989, edited by A. Hübler, W. Nagel, B. D. Ripley, G. Werner. Akademie-Verlag, Berlin, 1989, 173-178.
[St4] L. Staiger: Combinatorial Properties of the Hausdorff Dimension. J. Statistical Planning and Inference 23 (1989), 95-100.
[St5] L. Staiger: Kolmogorov Complexity and Hausdorff Dimension. Information and Computation 103 (1993), 159-194.
[Th1] W. Thomas: Automata on Infinite Objects. In: Handbook of Theoretical Computer Science, B, edited by J. van Leeuwen. Elsevier, Amsterdam, 1990, 133-191.
[Tr1] B. A. Trakhtenbrot, Ya. M. Barzdin': Finite Automata: Behavior and Synthesis. Mir, Moscow, 1970 (in Russian; English translation: North-Holland, Amsterdam, 1973).
[Wo1] D. Wood: Theory of Computation. Harper & Row, New York, 1987.