A novel strong tracking finite-difference extended Kalman filter for nonlinear eye tracking
ZHANG ZuTao1,2† & ZHANG JiaShu1†
1 School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China;
2 Sichuan Key Lab of Signal and Information Processing, Southwest Jiaotong University, Chengdu 610031, China
Non-intrusive eye tracking is important for many applications of vision-based human-computer interaction. However, because eye motion is highly nonlinear, ensuring both robustness to external interference and tracking accuracy remains the primary obstacle to integrating eye movements into today's interfaces. In this paper, we present a strong tracking finite-difference extended Kalman filter algorithm that aims to overcome the difficulty of modeling nonlinear eye tracking. In the filtering calculation, a strong tracking factor is introduced to modify the a priori covariance matrix and improve the accuracy of the filter. The filter uses the finite-difference method to calculate the partial derivatives of the nonlinear functions for eye tracking. Experimental results show the validity of our method for eye tracking under realistic conditions.
Keywords: strong tracking finite-difference extended Kalman filter (STFDEKF), eye tracking, extended Kalman filter (EKF), suboptimal fading factor
1 Introduction
Eye tracking, first introduced by Mowrer in 1936, has significant potential to enhance the quality of everyday human-computer interaction. For instance, researchers have used eye tracking in such domains as driver fatigue detection[1−3], eye typing that helps users with movement disabilities interact with computers[4], analysis of user behavior in WWW search[5], the study of collaboration on physical tasks for medical research, VR systems for measuring inspection methods, and image scanning[6]. Across these applications, two types of human-computer interfaces utilize eye tracking: passive and active interfaces. Passive interfaces monitor the user's eye movements and automatically adapt themselves to the user. In driver fatigue detection, for example, fatigue is detected from eye tracking because the eyes show the most direct reaction when a person is dozing, inattentive or yawning. Active interfaces, on the other hand, allow users to explicitly control the interface through eye movements. Eye typing
Received April 19, 2008; accepted January 13, 2009; published online March 2, 2009 doi: 10.1007/s11432-009-0081-1 † Corresponding author (email:
[email protected],
[email protected]) Supported by the National Natural Science Foundation of China (Grant No. 60572027), the Outstanding Young Researchers Foundation of Sichuan Province (Grant No. 03ZQ026-033), the Program for New Century Excellent Talents in University of China (Grant No. NCET-05-0794), and the Young Teacher Foundation of Mechanical School (Grant No. MYF0806)
Sci China Ser F-Inf Sci | Apr. 2009 | vol. 52 | no. 4 | 688-694
has users look at keys on a virtual keyboard to type instead of manually pressing keys as on a traditional keyboard[4,7]. Such active interfaces have been quite effective at helping users with movement disabilities interact with computers. Not surprisingly, eye tracking has attracted the interest of many researchers, and eye trackers have been commercially available for many years[1,2,7,8].
In the past decades, many researchers have studied eye tracking for human-computer interaction. Following ref. [9], eye tracking algorithms can be classified into two approaches: feature-based and model-based. Feature-based approaches detect and localize image features related to the position of the eye. A commonality among feature-based approaches is that a criterion (e.g., a threshold) is needed to decide when a feature is present or absent. The determination of an appropriate threshold is typically left as a free parameter that is adjusted by the user. The tracked features vary widely across algorithms but most often rely on intensity levels or intensity gradients. For example, in infrared (IR) images created with the dark-pupil technique, an appropriately set intensity threshold can be used to extract the region corresponding to the pupil, and the pupil center can be taken as the geometric center of this region. The intensity gradient can be used to detect the limbus in visible-spectrum images[10] or the pupil contour in infrared-spectrum images[11]. In refs. [1,12–14], several active IR-based eye trackers were proposed. Their authors regard eye tracking based on active remote IR illumination as a simple and effective approach, but most of these trackers require a distinctive bright-pupil effect to work well because they track the eyes by tracking the bright pupils. Ji et al. have also significantly improved eye tracking over existing techniques[1,11]. However, their methods need an IR eye detector, or bright pupils and steady illumination. Moreover, their eye tracking relies on Kalman filtering, which is a linear estimation algorithm. In realistic driving environments, eye motion is highly nonlinear, so the standard Kalman filter is no longer optimal.
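The dark-pupil thresholding step described above can be illustrated with a minimal sketch. It is not the implementation of any cited tracker: the function name `pupil_center`, the toy image and the threshold value are hypothetical, and a real system would operate on camera frames rather than nested lists.

```python
def pupil_center(image, threshold):
    """Return the (row, col) centroid of pixels darker than `threshold`, or None."""
    row_sum = col_sum = count = 0
    for r, line in enumerate(image):
        for c, value in enumerate(line):
            if value < threshold:      # dark-pupil criterion: keep low intensities
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:                     # no pixel passed the threshold
        return None
    return row_sum / count, col_sum / count

# Hypothetical 5x5 IR image: a dark 2x2 pupil (10) on a bright background (200)
image = [[200] * 5 for _ in range(5)]
for r in (1, 2):
    for c in (2, 3):
        image[r][c] = 10

center = pupil_center(image, threshold=50)   # geometric center of the dark region
```

The threshold is exactly the free parameter mentioned in the text: set too low, no pixel is selected and the function returns `None`; set too high, background pixels pull the centroid away from the pupil.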
On the other hand, model-based approaches do not explicitly detect features but rather find the best-fitting model that is consistent with the image. For example, integrodifferential operators can be used to find the best-fitting circle[15] or ellipse[16] for the limbus and pupil contour. Chau and Betke[17] track the eyes by correlation with an online template. The authors of ref. [2] use dynamic templates: once the eye templates have been found, they are used for eye tracking by template matching, and the position with the minimum matching value within the search area is taken as the eye position. The model-based approach can provide a more precise estimate of the pupil center than a feature-based approach, given that a feature-defining criterion is not applied to the image data.
Eye tracking has not reached its full potential even though general-purpose eye tracking technology has been explored for decades. The first obstacle to integrating these techniques into human-computer interfaces is that they have been too expensive for routine use. Currently, a number of eye trackers are available on the market, at prices ranging from approximately 5 000 to 40 000 US dollars[9]. The second obstacle is that eye tracking is very difficult to model because eye motion is highly nonlinear. The third obstacle is robustness: fast head and eye movements, interference from external illumination and realistic lighting conditions all degrade tracking, so the accuracy of eye tracking cannot yet satisfy the practical requirements of HCI.
To tackle some of these problems, we propose a strong tracking finite-difference extended Kalman filter (STFDEKF) algorithm for eye tracking. A strong tracking factor is introduced to modify the a priori covariance matrix and improve the accuracy of the algorithm, and the finite-difference method is used to calculate the partial derivatives of the nonlinear functions for eye tracking. In this way, we also overcome the difficulty of modeling eye tracking as a nonlinear system. Experimental results show that the average correct rate of eye tracking reaches 99.4% over three videos.
The remainder of this paper is organized as follows. Section 2 gives the strong tracking finite-difference extended Kalman filter algorithm. Section 3 gives the STFDEKF-based eye tracking algorithm and experimental results. Some conclusions are drawn in section 4.
2 Strong tracking finite-difference extended Kalman filter
The extended Kalman filter (EKF) is one of the most common and popular filtering approaches in nonlinear target tracking and state estimation. Its applications include state estimation of a nonlinear dynamic system, parameter estimation for nonlinear system identification, and dual estimation, where both states and parameters are estimated simultaneously. The EKF simply linearizes all nonlinear functions to first order by using Taylor series expansions. As a result, the EKF may introduce large errors when estimating the state of a nonlinear system and its variance, and the linearization may lead to divergence of the filtering process. When the nonlinear model is mismatched, the state estimate of the EKF can likewise diverge, which limits its scope of application. For these reasons, two improved EKF algorithms are introduced to tackle some of these problems.
2.1 Suboptimal fading extended Kalman filter
In this section, an adaptive extended Kalman filter, the suboptimal fading extended Kalman filter (SFEKF), is presented. The derivation of the filter is given in detail in refs. [18, 19]. The SFEKF has the following good properties: 1) low sensitivity to the statistics of the initial states and of the system and/or measurement noise; 2) strong tracking ability for suddenly changing states and bias, whether the filter operates in dynamic or stationary fashion; 3) acceptable computational complexity. Consider a class of nonlinear discrete-time dynamical systems,
$$x_{k+1} = f(x_k, u_k, v_k), \tag{1}$$
$$y_k = g(x_k, w_k), \tag{2}$$
where $x_k$ is the state vector, $y_k$ is the measurement vector, $u_k$ is the control input vector, $v_k$ is the process noise, and $w_k$ is the measurement noise. $v_k$ and $w_k$ are statistically independent. Their means and covariances are
$$E[v_k] = q_k,\quad \mathrm{cov}[v_k, v_j] = Q_k\,\delta(k-j),\qquad E[w_k] = r_k,\quad \mathrm{cov}[w_k, w_j] = R_k\,\delta(k-j). \tag{3}$$
The extended Kalman filter is based on the assumption that sensor noises and propagation errors are driven by zero-mean, Gaussian-distributed, white random processes. Retaining only the first-order terms in the Taylor series expansion, one obtains
$$\begin{cases} x_{k+1} \approx f(\hat{x}_k, u_k, q_k) + F_x(k)(x_k - \hat{x}_k) + F_v(k)(v_k - q_k),\\ y_k \approx g(\bar{x}_k, r_k) + G_x(k)(x_k - \bar{x}_k) + G_w(k)(w_k - r_k), \end{cases} \tag{4}$$
where $F_x(k)$ and $F_v(k)$ are the partial derivatives of $f(\cdot)$ with respect to $x$ and $v$, and $G_x(k)$ and $G_w(k)$ are the partial derivatives of $g(\cdot)$ with respect to $x$ and $w$. The suboptimal fading extended Kalman filter (SFEKF) is then deduced as follows. The predicted state estimation equations are
$$\bar{x}_{k+1} = f(\hat{x}_k, u_k, q_k), \tag{5}$$
$$\bar{y}_k = g(\bar{x}_k, r_k). \tag{6}$$
The predicted covariance estimation equation is
$$\bar{P}_{k+1} = \lambda(k+1)F_x(k)\hat{P}_k F_x(k)^T + F_v(k)Q_k F_v(k)^T, \tag{7}$$
where $\lambda(k+1) \geq 1$ is the suboptimal fading factor, used to fade out past data and to adjust the predicted state covariance matrix. With the model in ref. [20], $\lambda(k+1)$ can be directly determined as
$$\lambda(k+1) = \begin{cases} \lambda_0, & \lambda_0 \geq 1;\\ 1, & \lambda_0 < 1; \end{cases} \tag{8}$$
where
$$\lambda_0 = \mathrm{tr}[N(k+1)]/\mathrm{tr}[M(k+1)], \tag{9}$$
$$N(k+1) = V_0(k+1) - G_x(k)F_v(k)\cdot F_v(k)^T - G_w(k)R_k G_w(k)^T, \tag{10}$$
$$M(k+1) = G_x(k)F_x(k)\hat{P}_k F_x^T(k)G_x^T(k), \tag{11}$$
$$V_0(k+1) = \frac{1}{k}\sum_{j=1}^{k}\gamma_j\gamma_j^T = \begin{cases} G_x(0)\hat{P}_0 G_x(0)^T + G_w(0), & k = 0;\\[4pt] \dfrac{\rho V_0(k) + \gamma_j\gamma_j^T}{1+\rho}, & k \geq 1. \end{cases} \tag{12}$$
Here $0 \leq \rho \leq 1$ is the preselected forgetting factor, which may be selected according to the real process. For fast changing processes, a smaller $\rho$ should be selected, and vice versa. As pointed out in ref. [20], $\lambda(k+1)$ is insensitive to the value of $\rho$.
2.2 Strong tracking finite-difference extended Kalman filter
Following the ideas in refs. [21, 22], we propose a finite-difference method to replace the partial derivatives of the nonlinear functions. By further improving the auto-covariance and cross-covariance, we obtain the strong tracking finite-difference extended Kalman filter. We use the Cholesky decomposition to factorize $Q_k$, $R_k$, $\bar{P}_k$ and $\hat{P}_k$:
$$Q_k = S_v S_v^T,\quad R_k = S_w S_w^T,\quad \bar{P}_k = \bar{S}_x \bar{S}_x^T,\quad \hat{P}_k = \hat{S}_x \hat{S}_x^T. \tag{13}$$
The central difference approximation of the partial derivative $F_x(k)$ of the nonlinear function is expressed as
$$F_x(k) = \{f_{ij}\} = \left\{\frac{f_i(x_{k,j} + \Delta x_{k,j}, u_k, q_k) - f_i(x_{k,j} - \Delta x_{k,j}, u_k, q_k)}{2\Delta x_{k,j}}\right\}, \tag{14}$$
where $\Delta x_{k,j} = h\hat{s}_{x,j}$, $h$ is the step adjustment coefficient, and $\hat{s}_{x,j}$ represents the $j$th column of $\hat{S}_x$. Then, we have
$$F_x(k)\hat{S}_x = S_{x\hat{x}} = \{(f_i(\hat{x}_k + h\hat{s}_{x,j}, u_k, q_k) - f_i(\hat{x}_k - h\hat{s}_{x,j}, u_k, q_k))/2h\}, \tag{15}$$
$$F_v(k)S_v = S_{xv} = \{(f_i(\hat{x}_k, u_k, q_k + hs_{v,j}) - f_i(\hat{x}_k, u_k, q_k - hs_{v,j}))/2h\}, \tag{16}$$
$$G_x(k)\bar{S}_x = S_{y\bar{x}} = \{(g_i(\bar{x}_k + h\bar{s}_{x,j}, r_k) - g_i(\bar{x}_k - h\bar{s}_{x,j}, r_k))/2h\}, \tag{17}$$
$$G_w(k)S_w = S_{yw} = \{(g_i(\bar{x}_k, r_k + hs_{w,j}) - g_i(\bar{x}_k, r_k - hs_{w,j}))/2h\}. \tag{18}$$
The predicted covariance matrix, gain matrix and covariance estimate of the suboptimal fading extended Kalman filter (SFEKF) are modified as follows:
$$\begin{aligned} \bar{P}_{k+1} &= \lambda(k+1)F_x(k)\hat{P}_k F_x(k)^T + F_v(k)Q_k F_v(k)^T\\ &= \lambda(k+1)F_x(k)\hat{S}_x\hat{S}_x^T F_x(k)^T + F_v(k)S_v S_v^T F_v(k)^T\\ &= \lambda(k+1)S_{x\hat{x}}S_{x\hat{x}}^T + S_{xv}S_{xv}^T; \end{aligned} \tag{19}$$
$$\begin{aligned} K_{k+1} &= \bar{P}_{k+1}G_x(k)^T[G_x(k)\bar{P}_{k+1}G_x(k)^T + G_w(k)R_k G_w^T(k)]^{-1}\\ &= \bar{S}_x\bar{S}_x^T(S_{y\bar{x}}\bar{S}_x^{-1})^T[S_{y\bar{x}}S_{y\bar{x}}^T + S_{yw}S_{yw}^T]^{-1}\\ &= \bar{S}_x S_{y\bar{x}}^T[S_{y\bar{x}}S_{y\bar{x}}^T + S_{yw}S_{yw}^T]^{-1}; \end{aligned} \tag{20}$$
$$\begin{aligned} \hat{P}_{k+1} &= [I - K_{k+1}G_x(k)]\bar{P}_{k+1}\\ &= \bar{S}_x\bar{S}_x^T - K_{k+1}G_x(k)\bar{S}_x\bar{S}_x^T\\ &= \bar{S}_x\bar{S}_x^T - \bar{S}_x S_{y\bar{x}}^T K_{k+1}^T - K_{k+1}S_{y\bar{x}}\bar{S}_x^T + \bar{S}_x S_{y\bar{x}}^T K_{k+1}^T\\ &= \bar{S}_x\bar{S}_x^T - \bar{S}_x S_{y\bar{x}}^T K_{k+1}^T - K_{k+1}S_{y\bar{x}}\bar{S}_x^T + K_{k+1}S_{y\bar{x}}S_{y\bar{x}}^T K_{k+1}^T + K_{k+1}S_{yw}S_{yw}^T K_{k+1}^T\\ &= [\bar{S}_x - K_{k+1}S_{y\bar{x}}\quad K_{k+1}S_{yw}] \times [\bar{S}_x - K_{k+1}S_{y\bar{x}}\quad K_{k+1}S_{yw}]^T. \end{aligned} \tag{21}$$
From the above derivation, all of these calculations account for the impact of process noise and the error of model linearization. The step used in linearizing the nonlinear functions also changes with the latest covariance matrix, the process noise and the observation noise. The filter becomes much simpler because the calculation of partial derivatives is replaced by finite differences. The new strong tracking finite-difference extended Kalman filter (STFDEKF) has high estimation accuracy and good covariance estimation, and improves the robustness of target tracking. Experimental results show that the STFDEKF can be used in highly nonlinear stochastic systems such as eye tracking.
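To make the sequence of operations in eqs. (5)–(21) concrete, the following is a minimal scalar sketch of one filter step, not the authors' implementation. Several choices are assumptions made only for illustration: the state and measurement are scalars, so Cholesky factors reduce to square roots; the control input is omitted; $\gamma$ is taken to be the measurement residual; the process-noise term of eq. (10) is used in the dimensionally consistent form $(G_x F_v)Q(G_x F_v)^T$; only the recursive branch of eq. (12) is used, with the initial $V_0$ supplied by the caller; and $G_x$ for the fading factor is estimated from the unfaded prediction. The names `central_diff` and `stfdekf_step` are hypothetical.

```python
import math

def central_diff(func, center, s, h):
    """Divided difference (func(center + h*s) - func(center - h*s)) / (2h)."""
    return (func(center + h * s) - func(center - h * s)) / (2.0 * h)

def stfdekf_step(x_hat, P_hat, y, V0_prev, f, g, Q, R,
                 q=0.0, r=0.0, h=1.0, rho=0.95):
    """One scalar step; returns (x_new, P_new, lam, V0). Assumes P_hat, Q, R > 0."""
    # Square roots play the role of the Cholesky factors of eq. (13)
    s_x, s_v, s_w = math.sqrt(P_hat), math.sqrt(Q), math.sqrt(R)

    # Divided differences standing in for Fx*Sx and Fv*Sv, eqs. (15)-(16)
    S_xx = central_diff(lambda x: f(x, q), x_hat, s_x, h)
    S_xv = central_diff(lambda v: f(x_hat, v), q, s_v, h)

    # Predicted state and measurement residual, eqs. (5)-(6)
    x_bar = f(x_hat, q)
    gamma = y - g(x_bar, r)

    # Residual covariance estimate, recursive branch of eq. (12)
    V0 = (rho * V0_prev + gamma * gamma) / (1.0 + rho)

    # Measurement sensitivities; G_x from the unfaded prediction (assumption)
    s_bar0 = math.sqrt(S_xx ** 2 + S_xv ** 2)
    G_x = central_diff(lambda x: g(x, r), x_bar, s_bar0, h) / s_bar0
    S_yw = central_diff(lambda w: g(x_bar, w), r, s_w, h)        # eq. (18)

    # Suboptimal fading factor, eqs. (8)-(11) in scalar form (assumes M > 0)
    N = V0 - (G_x * S_xv) ** 2 - S_yw ** 2
    M = (G_x * S_xx) ** 2
    lam = max(1.0, N / M)

    # Faded a priori covariance, eq. (19)
    P_bar = lam * S_xx ** 2 + S_xv ** 2
    s_bar = math.sqrt(P_bar)

    # Gain from divided differences, eqs. (17) and (20)
    S_yx = central_diff(lambda x: g(x, r), x_bar, s_bar, h)
    K = s_bar * S_yx / (S_yx ** 2 + S_yw ** 2)

    # Standard Kalman state update and factored covariance update, eq. (21)
    x_new = x_bar + K * gamma
    P_new = (s_bar - K * S_yx) ** 2 + (K * S_yw) ** 2
    return x_new, P_new, lam, V0

# Hypothetical toy model: mildly nonlinear dynamics, linear measurement
f = lambda x, v: x + 0.1 * math.sin(x) + v
g = lambda x, w: 2.0 * x + w
x_new, P_new, lam, V0 = stfdekf_step(0.5, 1.0, y=1.2, V0_prev=1.0,
                                     f=f, g=g, Q=0.01, R=0.04)
```

For a linear model and a small residual the fading factor stays at 1 and the step reduces to an ordinary Kalman update; a large residual raises $\lambda(k+1)$ above 1, which inflates the a priori covariance and lets the filter follow a sudden change.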
3 STFDEKF-based eye tracking and results
In this section, we develop eye tracking based on the STFDEKF. Because eye motion is highly nonlinear in the likelihood model, it is very difficult to model human eye movement dynamics. In our tracking system, the following nonlinear equations are used to model the eye movement dynamics:
$$x = x_0 + vt + \frac{1}{2}at^2, \tag{22}$$
$$\dot{x}_{k+1} = v_0 + A_k\sin(\omega_k t), \tag{23}$$
$$a_{k+1} = \ddot{x}_{k+1} = A_k\omega_k\cos(\omega_k t), \tag{24}$$
where the initial values $x_0$ and $v_0$ are zero. The acceleration $a$ varies sinusoidally and is considered as process noise ($v_k$), with $A_k = 0.08$ m/s and $\omega_k = \pi$ rad/s.
The proposed eye tracking experiment is developed on the OpenCV platform. Our system uses a ViewQuest VQ680 video camera to capture human images. The experiment is run on a Pentium III 1.7 GHz CPU with 128 MByte RAM. Eye tracking based on the proposed method reaches 10 frame/s. The input video format is 352×288. Figure 1 shows eye tracking by the STFDEKF algorithm. The correct rate of eye tracking is shown in Table 1, and Table 2 compares the performance of our proposed method with that of the other eye tracking methods in refs. [1–3, 12]. The correct rate of eye tracking is defined as
$$\text{Correct rate} = \frac{\text{Total frames} - \text{Tracking failure}}{\text{Total frames}}. \tag{25}$$
To gauge performance quantitatively and discuss the resulting issues, we also use two traditional measures of performance, the RMSE and the MSE.
Figure 1 Eye tracking by STFDEKF algorithm.
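The motion model of eqs. (22)–(24) can be simulated with a short sketch. The values of $A_k$, $\omega_k$, the zero initial conditions and the 10 frame/s rate come from the text; applying eq. (22) over one frame interval at a time, as well as the function name `simulate_eye_motion`, are assumptions made for illustration.

```python
import math

A = 0.08          # velocity amplitude A_k in m/s (value from the text)
OMEGA = math.pi   # angular frequency omega_k in rad/s (value from the text)

def simulate_eye_motion(n_steps, dt=0.1, x0=0.0, v0=0.0):
    """Return a list of (t, x, v, a) samples of the sinusoidal motion model."""
    samples = []
    x = x0
    for k in range(n_steps + 1):
        t = k * dt
        v = v0 + A * math.sin(OMEGA * t)       # velocity, eq. (23)
        a = A * OMEGA * math.cos(OMEGA * t)    # acceleration, eq. (24)
        samples.append((t, x, v, a))
        x = x + v * dt + 0.5 * a * dt ** 2     # eq. (22) applied over one frame
    return samples

trajectory = simulate_eye_motion(n_steps=20)   # 2 s of motion at 10 frame/s
```

Such a trajectory could serve as synthetic ground truth when comparing filters, since the true position, velocity and acceleration are known at every frame.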
Table 1  Result of eye tracking using STFDEKF algorithm
                       Video 1 (without glasses)   Video 2 (glasses)   Video 3 (long hair)
Total frames           1999                        2941                2889
Tracking failure       9                           16                  18
Correct rate           99.45%                      99.35%              99.4%
Average correct rate   99.4%

Table 2  Comparison of eye tracking algorithms
Algorithm                           Correct rate (%)   Remark
Template matching                   99.1               Ref. [2]
Kalman and mean shift algorithm     99.1               Ref. [12]
EKF tracking algorithm              99.0
STFDEKF algorithm in this paper     99.4               Refer to Table 1

Table 3  RMSE and MSE of eye tracking filtering algorithms
Algorithm                           RMSE      MSE
Kalman filter algorithm             0.13155   0.164661
EKF tracking algorithm              0.1222    0.0904
STFDEKF algorithm in this paper     0.0989    0.0780
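The correct rate of eq. (25) and error measures of the kind reported in Table 3 can be computed as in the sketch below. The RMSE and MSE functions use the standard textbook definitions, which is an assumption, since the text does not spell out the exact procedure behind Table 3. All input numbers are hypothetical and are not taken from the experiments.

```python
import math

def correct_rate(total_frames, tracking_failures):
    """Fraction of frames tracked correctly, as in eq. (25)."""
    return (total_frames - tracking_failures) / total_frames

def mse(estimates, truths):
    """Mean square error between estimated and true positions."""
    return sum((e - t) ** 2 for e, t in zip(estimates, truths)) / len(truths)

def rmse(estimates, truths):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(estimates, truths))

# Hypothetical data: 1000 frames with 6 failures, and three position estimates
rate = correct_rate(1000, 6)
estimated = [1.0, 2.0, 4.0]
reference = [1.0, 2.5, 3.0]
errors = (mse(estimated, reference), rmse(estimated, reference))
```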
The RMSE (root mean square error) and MSE (mean square error) obtained in simulation are listed in Table 3. The results of the above experiments indicate that the proposed method performs well.
4 Conclusion
This paper proposes a new eye tracking method using the strong tracking finite-difference extended Kalman filter. Firstly, a strong tracking factor is introduced to modify the a priori covariance matrix and improve the accuracy of the eye tracking algorithm. Secondly, the finite-difference method is used to replace the partial derivatives of the nonlinear functions in eye tracking. As the above derivation shows, the new filter becomes much simpler because partial derivative calculations are replaced by finite differences. Experimental results show that the STFDEKF has high estimation accuracy and good covariance estimation, thus improving the robustness of target tracking, and can be applied to highly nonlinear stochastic systems such as eye tracking.
References
1 Ji Q, Zhu Z W, Lan P L. Real-time nonintrusive monitoring and prediction of driver fatigue. IEEE Trans Veh Technol, 2004, 53(4): 1052–1068
2 Horng W B, Chen C Y, Chang Y, et al. Driver fatigue detection based on eye tracking and dynamic template matching. In: Proceeding of the 2004 IEEE International Conference on Networking, Sensing & Control. Taipei: IEEE Press, 2004. 7–12
3 Dong W H, Wu X J. Driver fatigue detection based on the distance of eyelid. In: Proceedings of the VLSI Design & Video Technology. Suzhou: IEEE Press, 2005. 365–368
4 Majaranta P, Raiha K. Twenty year of eye typing: system and design issues. In: Proceedings of ACM Eye Tracking Research and Applications Symposium. New Orleans: ACM, 2002. 15–22
5 Laura A, Joachims T. Eye-tracking analysis of user behavior in WWW search. In: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval. Sheffield: ACM, 2004. 478–479
6 Noton D, Stark L. Scanpaths in saccadic eye movements while viewing and recognizing patterns. Vision Res, 1971, 11(9): 929–942
7 Takehiko O, Naoki M, Shinjiro K. Just blink your eyes: a headfree gaze tracking system. In: Conference on Human Factors in Computing Systems. Lauderdale: ACM, 2003. 115–122
8 McCarthy D, Riegelsberger J, Sasse M A. Commercial uses of eye tracking. HCI Technical Report, 2005
9 Li D H, Winfield D, Parkhurst D J. A hybrid algorithm for video-based eye tracking combining feature-based and model-based approaches. In: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05). San Diego: IEEE Press, 2005. 79–86
10 Zhu J, Yang J. Subpixel eye gaze tracking. In: Proceedings of 2002 IEEE International Conference on Automatic Face and Gesture Recognition. Washington: IEEE Press, 2002. 124–129
11 Ohno T, Mukawa N, Yoshikawa A. Freegaze: a gaze tracking system for everyday gaze interaction. In: Proceedings of Eye Tracking Research and Applications Symposium. Louisiana: ACM, 2002. 15–22
12 Zhu Z W, Ji Q, Fujimura K. Combining Kalman filtering and mean shift for real time eye tracking under active IR illumination. In: Proceedings of International Conference on Pattern Recognition. Canada: IEEE Press, 2002. 318–321
13 Ebisawa Y. Unconstrained pupil detecting technique using two light sources and the image difference method. In: Proceedings of Visualization and Intelligent Design in Engineering and Architecture. Southampton: Computational Mechanics Publications, 1995. 79–89
14 Morimoto C H, Flickner M. Real-time multiple face detection using active illumination. In: Proceedings of the 4th IEEE International Conference on Automatic Face and Gesture Recognition. Grenoble: IEEE Press, 2000. 8–13
15 Daugman J. High confidence visual recognition of persons by a test of statistical independence. IEEE Trans Pattern Anal, 1993, 15(11): 1148–1161
16 Nishino K, Nayar S K. Eyes for relighting. In: Proceeding of ACM SIGGRAPH 2004. Orlando: ACM, 2004. 704–711
17 Chau M, Betke M. Real Time Eye Tracking and Blink Detection with USB Cameras. Boston University Computer Science Technical Report, 2005
18 Zhou D H, Xi Y G, Zhang Z J. A suboptimal multiple fading extended Kalman filter. Acta Autom Sinica, 1991, 17(6): 689–695
19 Zhou D H. Fault detection and diagnostics for a class of nonlinear systems. Dissertation for the Doctoral Degree. Shanghai: Shanghai Jiao Tong University, 1990
20 Zhou D H, Sun Y X, Xi Y Z, et al. Extension of Friedland's separate-bias estimation to randomly time-varying bias of nonlinear systems. IEEE Trans Autom Contr, 1993, 38(8): 1270–1273
21 Fan W B, Liu C F, Zhang S Z. Improved method of Strong tracking extended Kalman filter. Contr and Dec, 2006, 21(1): 73–76
22 Zhou D H, Wang Q L. Strong tracking filter of nonlinear systems with colored noise. J Beijing Inst Tech, 1997, 17(3): 321–326