Journal of Shanghai University ( English Edition ), 2002, 6( 1) : 5 5 - 58 Article ID: 1007-6417(2002)01-0055-04
A Modified Vertex-Based Shape Coding Algorithm SHI X u - L i ( ~ ~ ~'] ), ZHANG Zhao- Yang ( ~. ~ ~ ) School of Communication and Information Engineering, Shanghai University, Shanghai 200072, China Abstract This paper proposes a modified shape coding algorithm called modified vertex-based shape coding (MVBSC) to encode the boundary of a visual object compactly by using a modified polygonal approximation approach which uses modified curvature scale space (CSS) theory to extract feature-points. Key words shape coding, MVBSC, CSS.
1 Introduction MPEG-4 is a new multimedia standard and contentbased coding method, in which arbitrary shape an important feature ElI. For realizing content-based coding, two methods, shape coding and texture coding for arbitrary shape objects should be used. T h e r e are two important shape coding algorithms which are verified in MPEG-4 core experiments. One is the contextbased arithmetical shape coding algorithm (CAE) [2] and the other is the vertex-based shape coding algorithm E3'4]. Both of the algorithms have shortcomings Annoying block effects of reconstructed shape in the intra mode and inter mode of CAE is obvious. As for vertex-based shape coding (VBSC), vertex selection algorithm is very complex. So, we use feature-points extraction algorithm instead of vertex selection algorithm of VBSC.
2 Modified Vertex-Based Shape Coding Method 2.1 Basic vertex-based shape c o d i n g a l g o r i t h m The basic VBSC algorithm consists of two main parts: vertex selection and vertex encoding. Using a repeated refinement method, vertex selection can obtain the vertex of a polygon that approximates the original shape contour. Using the octant algorithm, vertex encoding encodes these vertices.
Received Jun. 22, 2001; Revised Sep.28, 2001 Project supported by the National Natural Science Foundation of China (60172020) and the Science Foundation of Shanghai Municipal Commission of Education ($990123) SHI Xu-Li, Ph.D. Candidate, E-mail : shixuli@sina, corn
2 . 1 . 1 Vertex selection Assume a contour is described by the pixel list: If(s) = (x(s),y(s)):l~.s~N
(1)
where N is the number of contour pixels and f (s) is the s-th contour pixel consisting of horizontal and vertical coordinates x ( s ) and y ( s ), respectively. A polygonal approximation of a contour is an ordered list of dominant points along the contour, called vertices. The contour is reconstructed by tracing the straight lines joining each pair of vertices. Depending on the number of vertices used to fit the original contour, various approximate accuracy may be obtained. In the following, these straight lines joining the vertices are called edges and represented by the segment notation [ v k , ok. 11, where 'vk and vk + 1 are the edge bounding vertices that belong to the original contour curve. The edge is also denoted by ~k, as an approximation of the part of the original contour curve between vk and V k . l , denoted by ~k- The horizontal and vertical coordinates of a vertex Vk are denoted x (vk) and y ( 'vk ), respectively. Polygonal approximation is applied to each contour (list of corner points coordinates) separately. The algorithm is basically controlled by a maximum admissible error between the original and the reconstructed contours. Assuming Euclidean distance is defined to measure the e r r o r , the basic polygonal approximation algorithm is the following recursive refinement algorithm[3,41 : (1) Find the pair of contour points that are the most distant from one another along the contour. This defines the main axis of the contour. If the main axis is shorter than the targeted error tolerance, the contour
Journal of Shanghai University
56 is discarded. (2) For each contour curve between two vertices, find the pixel that is the most distant from the approximation segment. If the corresponding error is higher than the tolerance threshold, introduce a new vertex at that position, thus splitting the previous segment into two sub-segments that are recursively processed with the same way (see Fig. 1). If the corresponding error is lower than the tolerance threshold, the recursive splitting process is terminated and the next segment along the contour is processed. (3) The procedure is repeated until the entire contour has been processed, resulting in an approximate polygon within a maximum error from the original contour. The error is lower than the maximum toler-
04 Y~Xi the index of octant is 0 , a n d so on. The total indexes of octant are 0 , 1 , 2 , 3 , 4 , 5 , 6 and 7. • Encode the octant index using a 3-bit FLC. • Encode the major component: If the octant = 0 , 3 , 4 or 7 then, encode the x magnitude relative to octant staring point using FIgX bits. If octant = 1 , 2 , 5 or 6, then encode magnitude of y relative to octant starting point using Fig Y bits. • Encode the minor component. If octant = 0 , 3 , 4 or 7, encode magnitude of y relative to the octant starting point using min {FlgY + 1,log 2(1X~I + 1)} bits; if the octant = 1 , 2 , 5 or 6, then encode the magnitude of X relative to the octant starting point using {FlgX +l,log2(IY~ + 1 ) } bits.
30 ~
ance.
2
Vk
i,,
Y
5
Vk+l Fig. 1
Recursive polygonal approximation segment split: segment[ Vk, Vk÷1] is split in P if the computed distance dist exceeds the error toleranc
2 . 1 . 2 Vertex encoding Firstly, two indicators of each contour are determined and encoded in the following way. • Changing the value of k , the number of addresses is N - 1. DetermineX-max and Y-max,and the maximum absolute values of the X~ and Yi, respectively. • According to the table in Ref. [51, select two coordinate indicators, FlgX and Fig Y, which correspond to the smallest number of bits, which can be used to represent FIgX and FIgY, respectively. • Encode the selected FlgX and Flg Y using a 3-bit FLC (fix length code). Secondly, the following steps are performed to encode the relative address (X~, Y~). The major component and the minor component are defined in Ref. [ 5 J. • Determine the octant of (X~, Y~). The octants are illustrated in Fig. 2. If (X~, Y~) satisfy:
Fig.2
2.2
1
/
x 7
6 Index of octant
Modified vertex-based shape coding
2.2.1 Feature-points extraction using modified CSS theory F. Mokhtarian proposed the CSS theory to retrieve the similarity image [6~ . We can view in Eq. (1) as arc length parameter and smooth the contour equation (1) using Gaussian function, that is:
x(s,t): Xo(S) ~ g ( s , t ) , y ( s , t ) : yo(S) ~ g ( s , t )
(2)
where ~ denotes convolution and g ( s , t) is the Gaussian function with a width 3 = ~ / ~ . The smoothed contour has the following properties: (1) Shapes preserve their inclusion order. As a result, two shapes with one inside another will remain so during the smoothing process . ( 2 ) All simple shapes are converted to a convex curve at one stage and to a circular point at the final stage, without any self intersection.
Vol.6
SHI X. L. et al. • A Modified Vertex-Based Shape Coding Algorithm
No. 1 Mar. 2002
( 3 ) The number of curvature e x t r e m a ( c o r n e r s )
the
curvature
of each
point
in
the
smoothed contour at scale t using curvature function:
k(s,t)
=
x~( s, t ) y ~ ( s, t) - x ~ ( s, t )y~( s, t ) ((x~(s,t))Z+(ys(s,t))z),3/2
use the vertex encode algorithm to compress the location information of corners. We only use featurepoints extracted by the modified CSS algorithm instead
strictly decrease during the evolution. (4) Curves shrink under curvature deformation. Calculate
57
(3)
of vertex-points. The other steps are same as that in Section 2 . 1 . 2 .
3 Experimental Results In this section, We present some results obtained
where
using the above theory. Fig. 3 shows the shape con-
0__( x~(s,t)=os x ( s ) ~ g ( s , t ) ) ,
tour and the evolved contours of MPEG-4 standard 352 288 sequence bream by using Eq. ( 2 ) . Fig. 4 shows
02
the extraction and modification of the feature-points of
x~(s, t) = 5~8(x(s) ~ g(s, t))
the shape contours showed in Fig. 1 at different
T h e n the locations of absolute curvature maximum are found, which are the locations of corners.
scales. We present in Fig. 5 a comparison of the MVBSC algorithm proposed in this paper with the CAE of
the loca-
MPEG-4 VM. It shows that the proposed method re-
tions of the corners in smoothed contour are not the
duces coded bits by about 15 % compared with the in-
H o w e v e r , because of the property ( d ) ,
locations of corners in original contour that are needen
tra coding mode of CAE at the same Dn. Dn is the re-
in shape coding. So we modify CSS algorithm by using
sult of the number of mismatch pixels in VOP divided
a coarse to fine strategy. First ,we obtain the corners
by the number of object pixels in the original VOP.
at scale t , that is Pt. Then the corners at scale t - 1
Fig. 6 shows that the reconstructed shape using the
was extract to give Pt- 1. According to ( c ), the cor-
proposed encoder/decoder algorithm has better subjec-
ners of the refined locations satisfy CP t C CPt-1. As-
tive quality than the CAE with the same D , .
sume A ( x A, YA) E Pt. If a point B ( XB, YB) E Pt - 1 satisfiesD(A,B)-mini D ( A , B ~ ) : B i ~ P t _ x l , t h e n correct the location of A to egual the that of B , to give A ( x s , YB), where D ( A , Bi )is the Euclidean distance of A and Bi. Reduce the scale to t - 1 and re(a) Original
peat the tracking algorithm , finally we obtain the original location A ( x0, Y0) of the corner A in original
the points in CPt can be obtained. All corners in CPt form a polygon that approximate original shape contour at scale t. By changing scale to t , t - 1 , etc., we can obtain a series of polygonal approximation of an original contour with different error. With a larger scale, the e r r o r is larger, and the shape can be highly
2.2.2
thereverse is
Encode feature-points
In Section 2 . 1 . 2 , the location of vertex is encode.
(c) t = 2
Fig. 3 Smoothed shape by CSS
contour at scale 0. When all corners locations at scale t are corrected,
compressed, and with a small scale, true.
(b) t = I
(a) t = l Fig.4
4
(b) t = 2
Extract feature-point corner by modified CSS
Conclusion In this paper, We propose a modified shape coding
algorithm.
The compression rate is higher and the
To transmit the corners information to receiver, we
subjective quality of reconstructed shape is better than
must also encode the location of corners . So , we can
CAE in MPEG-4. Obviously, it is simpler than VBSC, and the result of MVBSC is almost same as that of VBSC.
Journal of Shanghai University
58
References
2000 CAE
~:~ 1500
]
"~" 1000 ~'~ 500 0
Fig.5
0
I
2
I
4
I
I
6 8 Dn×100
I
10
12
Comparison of compression rate of MVBSC and CAE with the same Dn
(a) CAE at Dn = 0.0200
(b) CAE at Dn = 0.0342
(c) MVBSC at Dn=0.0197 (d) MVBSC at Dn=0.0349 Fig.6
Compare Subjective Quality of CAE and MVBSC with same D,,
[11
JTC1/SC29/WGll, MPEG99/4960, Coding of Moving Pictures and Audio, ISO/IEC Melbourne, October 1999. E2] Brady N, et al. Shape compression of movingobjects using con-bastext-based arithmetic encoding [J ]. Signal Processing : Image Communication 2000,15 : 601 - 616. [3] Chang Jae-won, et al. A new vertex-based binary shape encoder for high coding efficiency[J]. Signal Processing: Image Communication 2000,15:555 - 584. [4] Kim Jong II, et al. Generalized predictive binary shape coding using polygon approximation [ J ]. Signal Processing : Image Communication,2000, 15:643-663. [5] O'Connell K J. Object-adaptive vertex-based shape coding method [ J ]. IEEE Trans. Circuits Syst. Video Technol. , 1997,7(1) :251 - 255. [6] Mokhtarian F, et al. Robust and efficient shape indexing through curvature scale space [A]. Proceedings of the Seventh British Machine Vision Conference, BMVC' 96 [C]. 1996, 1 : 3 5 - 42. ( Executive editor YAO Yue- Yuan )