Analog Integrated Circuits and Signal Processing, 45, 131–141, 2005. © 2005 Springer Science + Business Media, Inc. Manufactured in The Netherlands.
Single-Chip Eye Tracker Using Smart CMOS Image Sensor Pixels

DONGSOO KIM, SEUNGHYUN LIM AND GUNHEE HAN
Department of Electrical and Electronic Engineering, Yonsei University, 134 Shinchon-dong, Seodaemoon-gu, Seoul, Korea
E-mail: [email protected]; [email protected]; [email protected]
Received January 8, 2004; Revised July 6, 2004; Accepted October 19, 2004
Abstract. The eye tracker is a system that detects the point at which the user gazes. A conventional eye tracker using a Charge-Coupled Device (CCD) camera needs many peripherals and software computation, causing high cost, long computation time and high power consumption. This paper proposes a single-chip eye tracker using smart CMOS Image Sensor (CIS) pixels. The proposed eye tracker does not require additional peripherals and operates at higher speed than the conventional approach. A prototype chip with a 32 × 32 smart CIS pixel array was designed and fabricated in a 0.35-µm CMOS process. The test results show ±1 pixel error at a rate of 125 frames per second. The power consumption is 260 mW with a 3.3 V supply voltage and the silicon area is 3.8 mm2.

Key Words: eye tracker, CMOS image sensor (CIS), smart CIS, shrink operation, winner-take-all (WTA)

1. Introduction
The common pointing devices in a computer system are the mouse and the tablet pen. However, these interfaces are difficult to use in a mobile environment or with a Head Mounted Display (HMD). An alternative pointing device is an eye tracker, which acquires the point on the screen at which the user gazes [1–3]. Infrared light is commonly used in eye trackers because it eliminates the influence of ambient illumination and improves the contrast between the pupil and the white of the eye [4]. Under infrared illumination, the pupil is the biggest black region in the eye image. Therefore, the point at which the eye gazes can be obtained by finding the center point of the pupil in the eye image.

A common eye tracker uses a CCD camera and an image processing algorithm [3–5]. However, this system needs many peripherals, causing high cost, and its AD conversion and image processing limit its speed to around 30 frames per second [1]. The major advantages of the CMOS image sensor (CIS) are low power consumption and compatibility with mainstream silicon technologies. Its most important property, however, is that it allows smart pixels that contain analog, digital, or mixed-signal processing circuits on the same silicon chip as the photosensor array [6–11]. A number of smart-pixel cellular neural networks [12, 13] that perform image processing and pattern recognition have been reported in the literature.

This paper proposes a single-chip eye tracker that generates the address of the center point of the pupil. Section 2 proposes the eye tracker architecture and operation principle. Section 3 describes the circuit implementation. Section 4 presents the simulation and experimental results. Finally, Section 5 provides the conclusion.
2. Proposed Eye Tracker Architecture
The proposed eye tracker is composed of a smart CIS pixel array, Winner-Take-All (WTA) circuits, address encoders and a stop criterion circuit, as shown in Fig. 1. The pixel array includes the photosensors and the inter-pixel feedback circuit that captures the image and performs the shrink operation described below. The WTA circuits find the winning current and its location from the horizontally and vertically summed currents from the pixel array. The winning location is encoded by the encoders and latched by the trigger signal VSTOP from the stop criterion circuit.
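The interplay of the blocks in Fig. 1 can be sketched behaviorally as follows. This is a sketch, not the chip's analog operation: the shrink step is replaced by a simple binary erosion of the dark region, and the 0.5 darkness threshold is an arbitrary stand-in for the pixel's black/white decision level.

```python
import numpy as np

def erode_dark(x):
    """Stand-in shrink step: a dark pixel turns white if any of its
    four edge neighbours is white (binary erosion of the dark region)."""
    dark = x < 0.5
    p = np.pad(dark, 1, constant_values=False)
    interior = (p[:-2, 1:-1] & p[2:, 1:-1] &
                p[1:-1, :-2] & p[1:-1, 2:])
    return np.where(dark & interior, 0.0, 1.0)

def track_pupil(image, max_iters=1000):
    """One frame: shrink until fewer than two dark pixels remain,
    then latch the row/column picked by the (behavioural) WTAs."""
    x = image.astype(float).copy()
    row, col = 0, 0
    for _ in range(max_iters):
        dark = x < 0.5
        i_row = dark.sum(axis=1)      # I_i: dark pixels per row
        i_col = dark.sum(axis=0)      # I_j: dark pixels per column
        # WTA: the winner is the row/column with the largest summed current
        row, col = int(np.argmax(i_row)), int(np.argmax(i_col))
        if dark.sum() < 2:            # stop criterion -> latch address
            break
        x = erode_dark(x)
    return row, col

# A 9x9 white frame with a 5x5 dark square centred at (4, 4):
img = np.ones((9, 9))
img[2:7, 2:7] = 0.0
```

For this symmetric test image the loop shrinks the square to its single centre pixel before the stop criterion fires, so `track_pupil(img)` returns the centre coordinates (4, 4).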
Fig. 1. Block diagram of the proposed eye tracker.
Figure 2 shows the flow chart of the operating principle, which is explained below. The image is captured by the photodiode and then transferred as an initial state voltage. During the shrink phase the black region of the image is shrunk by inter-pixel positive feedback. Each pixel generates a current Iij that is inversely proportional to its state voltage, and these currents are summed column-wise and row-wise. The summed currents (Ii, Ij) represent how many black pixels are in each column and row, respectively. The WTA circuit detects the column and row with the largest number of black pixels during the shrink operation. Operations 1, 2, 3 and 4 in Fig. 2 are performed simultaneously by the analog circuits. This procedure repeats continuously until the number of black pixels is less than 2, at which point the winning column and row addresses are latched and read out. These addresses are the coordinates of the remaining black pixel, which corresponds to the center of the pupil. Once the address is read out, a new image is captured and the above procedure is repeated with the new image. The smart CIS pixel array performs the shrink operation with continuous-time inter-pixel feedback as shown in Fig. 3. The dynamics of the proposed smart
pixel can be expressed as

C_F \frac{dx_{ij}}{dt} = \sum_{k=-1}^{1} \sum_{l=-1}^{1} \varepsilon_{kl}\, y_{(i+k,\,j+l)}, \qquad x_{ij}(0) = u_{ij}, \qquad y_{ij} = f(x_{ij}) \tag{1}

where CF is the integration capacitor and uij is the initial state of pixel (i, j), captured from the photosensor. xij and yij represent the state voltage and the output current of pixel (i, j), respectively. f(xij) is a transconductance with saturation, and εkl are the inter-pixel feedback gain coefficients shown in Fig. 3. The feedback coefficients should satisfy 0 < ε1 < ε2 < 1, and the self-feedback should be zero for an isotropic shrink operation. During the shrink operation, each pixel generates a current Iij = g(xij). Iij is generated when the pixel is considered black, and it is inversely proportional to xij. The currents Iij are summed column-wise and row-wise as shown in Fig. 3, and denoted Ii and Ij, where i and j are the row and column numbers, respectively. Since a black pixel (low xij)
generates a high Iij, the summed currents represent how many black pixels remain in each column and row. Since each pixel gives positive feedback to its surrounding pixels, a white pixel forces its neighboring pixels toward white, while a dark pixel provides only a small amount of feedback to its neighbors. This inter-pixel feedback turns the boundary black pixels white continuously, so the black regions shrink as shown in Fig. 4. The nonlinear function f(xij) gradually converts the gray image into a black-and-white one. The shrink operation continues until eventually all pixels become white.

Fig. 2. Flow chart of the operational principle.

Figure 5 shows the block diagram of the WTAs, stop criterion circuit, address encoder and address latch. Each WTA continuously detects the column/row with the highest summed current (Ii or Ij) during the shrink operation. The WTA generates the winner-location logic outputs A1,···, of which only one node is logic high, representing the winning column/row. The WTA circuit not only identifies the location of the winner but also selects the winning current, IWIN. IWIN is used by the stop criterion circuit to decide whether the shrink operation is complete. The address encoder continuously encodes the winning row/column during the shrink operation. When the winning current (IWIN) falls below a limit (ISTOP) that corresponds to two black pixels, the largest black region has been completely shrunk down to the 2 pixels located at its center. At that point, the winner-location logic outputs represent the location of the center of the remaining pixels. The stop criterion circuit generates a logic-high signal when either the horizontal or the vertical winning current reaches ISTOP, causing the address registers to latch the winning addresses from the encoders. This latched address corresponds to the center of the pupil.

3. Circuit Implementation
The schematic diagram of the proposed smart CIS pixel is shown in Fig. 6(a), and Fig. 6(b) shows the timing diagram for one frame. First, the photodiode (PD) capacitor CS is precharged to VREF1 and CF is reset during the reset phase by turning on the TX and RST switches. Second, the TX switch is turned off and CS is discharged according to the incident light during the capture phase. In the transfer phase, the charge on CS is transferred to CF by turning on the TX switch and turning off the RST switch, and CS is reset to VREF1 again at this moment. Once the PD charge is transferred onto CF, xij represents the light intensity (uij) of pixel (i, j). The transistors MY generate feedback currents εyij corresponding to the pixel state voltage xij. In the shrink phase, the STR switch is turned on and the currents from the surrounding pixels are summed and integrated on CF, causing xij to increase. Here, CF is chosen 4 times larger than CS, because CS is in the range of several tens of femtofarads and using such a small capacitance for CF would cause high sensitivity
Fig. 3. Block diagram of the smart CIS pixels array: interconnections are depicted only for the pixel (i, j).

Fig. 4. System level simulation results of the shrink operation: image size is 100 × 100, n is the iteration number, ε1 = 0.1 and ε2 = 0.15.
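The shrink dynamics of Eq. (1) can be sketched at system level by discretizing it with an Euler step. The kernel layout (ε2 for the four edge neighbours, ε1 for the diagonals, zero self-feedback) and the clipped-linear form of f(·) are assumptions inferred from Fig. 3 and the text; the ε values follow Fig. 4:

```python
import numpy as np

def shrink_step(x, eps1=0.1, eps2=0.15):
    """One Euler step of C_F dx/dt = sum_kl eps_kl y(i+k, j+l),
    with y = f(x) modelled as a clipped-linear saturation."""
    y = np.clip(x, 0.0, 1.0)                 # f(x): saturating output
    kernel = np.array([[eps1, eps2, eps1],   # assumed 3x3 coefficient
                       [eps2, 0.0,  eps2],   # pattern, zero self-feedback
                       [eps1, eps2, eps1]])
    yp = np.pad(y, 1, mode='edge')
    fb = np.zeros_like(x)
    for k in range(3):
        for l in range(3):
            fb += kernel[k, l] * yp[k:k + x.shape[0], l:l + x.shape[1]]
    return x + 0.1 * fb                      # dt / C_F lumped into 0.1

# Dark disc (x = 0) of radius 5 on a white (x = 1) background:
n = 21
yy, xx = np.mgrid[0:n, 0:n]
x = np.where((yy - 10) ** 2 + (xx - 10) ** 2 <= 25, 0.0, 1.0)

for _ in range(300):
    x = shrink_step(x)
center = np.unravel_index(np.argmin(x), x.shape)
```

Because white neighbours pull a pixel's state upward while an all-dark neighbourhood contributes nothing, the dark region whitens from its boundary inward, and by symmetry the darkest surviving pixel (`center`) is the centre of the original disc.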
Fig. 5. Block diagram of WTA, address encoder and stop criterion.
to parasitic capacitance. This ratio causes xij to be one-fourth of uij. Figure 7 shows the implementation of f(xij) and g(xij) by careful arrangement of VREF1 and VREF2, considering the operating regions of MY and MS. Since the dynamic range of uij is bounded as 0 < uij < VREF1 and xij = uij/4, VREF1 is chosen at about 4/5 VDD, considering the dynamic range and the implementation of the saturation function f(xij). VREF2 is chosen so that MY is in cut-off when xij is low. g(xij) is realized by MS's threshold voltage: MS turns on when xij is sufficiently low to be considered a black pixel. Figure 8 presents a schematic diagram of the WTA [16, 17] and stop criterion circuit. The common source voltage VS settles so that only the winning transistor out of M1···H, the one with the highest input current, is turned on, and all others are turned off. The bias current, IBIAS

Fig. 6. Block and timing diagram of the smart CIS pixel.
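The one-quarter charge-sharing ratio described above can be checked with a quick calculation, assuming ideal switches, complete transfer and no parasitics; the 50 fF value for CS is an assumed example consistent with the "several tens of femtofarads" in the text:

```python
# Charge captured on C_S is transferred onto C_F (C_F = 4 * C_S),
# so the resulting state voltage is one quarter of the PD voltage.
c_s = 50e-15          # assumed PD capacitance (tens of fF per the text)
c_f = 4 * c_s         # integration capacitor, chosen 4x larger
u_ij = 1.6            # example PD signal voltage (within the 1.9 V range)
q = c_s * u_ij        # charge accumulated during the capture phase
x_ij = q / c_f        # state voltage after complete transfer onto C_F
# x_ij == u_ij / 4 == 0.4 V, matching the one-fourth ratio in the text
```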
Fig. 7. Implementation of f(xij) and g(xij).
Fig. 8. Circuit diagram of the WTA circuit and stop criterion.
is chosen to be twice the maximum Iij generated by one black pixel. The winner current IWIN equals the highest input current and is copied to the stop criterion circuit. Since only the winning transistor is turned on and allows current to flow, only the winner generates a logic-high winner-location output (A1···H) and all others are low. The A1···H outputs are connected to the address encoder. Although the WTA deals with a large number of inputs, the mismatch among transistors is not critical, because the WTA eventually identifies the winner from two adjacent inputs that correspond to the remaining two black pixels of the shrink operation. In the stop criterion circuit, the address latching signal VSTOP is generated by comparing IWIN and ISTOP: VSTOP goes to logic high when IWIN becomes smaller than ISTOP.
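The combined behavior of the WTA and the stop criterion can be summarized in a small behavioral model; this is a functional sketch of what the analog circuit computes, not a device-level description:

```python
import numpy as np

def wta_with_stop(currents, i_stop):
    """Behavioural model of the current-mode WTA and stop criterion.

    Only the branch carrying the largest input current 'wins': its
    location output A_k goes high, and the winner current I_WIN is
    copied to the stop-criterion comparator, which raises V_STOP when
    I_WIN drops below I_STOP (fewer than two black pixels remain in
    the winning column/row).
    """
    winner = int(np.argmax(currents))
    a = np.zeros(len(currents), dtype=bool)
    a[winner] = True                 # one-hot winner-location outputs
    i_win = float(currents[winner])  # winning current, copied out
    v_stop = i_win < i_stop          # address-latch trigger
    return a, i_win, v_stop
```

During the shrink operation the winner simply tracks the strongest column/row current; once that current falls below `i_stop`, `v_stop` goes high and the encoder outputs would be latched.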
4. Simulation and Experimental Results
Figure 9 shows the transistor-level simulation results for a 32 × 32 array with the test image shown in Fig. 9(a). Figure 9(b) shows the row-wise and column-wise summed currents (Ii s and Ij s), the outputs of the row and column WTAs (A1···V and A1···H) and VSTOP. As the shrink operation starts at 55 µsec, the Ii s and Ij s begin to decrease continuously, as shown in Fig. 9(b). The currents that correspond to the smaller circle located at (21, 22) are completely reduced to zero at 60 µsec, which means that the smaller circle has disappeared from the image. The logic outputs of the WTA change to low as time elapses, and only the winning row and column are kept logic high. When the current of the 11th row and 9th column, which corresponds to the center of the bigger circle, falls below ISTOP, VSTOP changes to
Fig. 9. Transistor level simulation results.
Fig. 10. Chip microphotograph of the fabricated eye tracker.
high, causing the winning address to be latched at 67 µsec. The shrink operation time is about 12 µsec and is determined automatically by the stop criterion circuit. The proposed single-chip eye tracker was fabricated in a 0.35-µm CMOS process. Figure 10 shows the microphotograph of the fabricated chip. The fabricated pixel array is 32 × 32 and one pixel occupies 50 × 50 µm2. The photodiode is realized with an n+/p−-substrate junction whose area is 10 × 10 µm2. Figure 11 shows the testing setup for the error performance measurement with the fabricated prototype chip. The eye image captured under infrared illumination is projected on the prototype chip through a beam projector and microscope. The light intensity of
Fig. 11. Testing environment for error measurement with the fabricated chip.
the image is decreased to a realistic intensity level by optical attenuation filters. The projected image on the chip is monitored instantaneously by a digital camera. The image on the pixel array is aligned precisely with the image displayed on the computer. Figure 12 shows the microphotograph of the projected test images on the fabricated chip. The measurement results read out from the fabricated chip are marked as × and the expected results from the simulation are marked as . Figure 13 shows the measured error map of the fabricated eye tracker. The error of the fabricated eye tracker was tested with test images that

Table 1.
Performance summary.

Process: 0.35-µm CMOS 2-poly 4-metal
Power supply: 3.3 V
Power consumption: 260 mW
Chip size: 1.95 × 1.95 mm2
Smart CIS cells array: 32 × 32
Smart CIS cell size: 50 × 50 µm2
Photodiode size: 10 × 10 µm2
Quantum efficiency (QE): 0.19 @ 1000 nm
PD signal range: 1.9 V
Frame rate: 125 fps
Integration time: 1 msec
Error: ±1 pixel
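A quick consistency check on the timing numbers reported above (the shrink time is taken from the simulation section; readout and reset overheads are not modelled here):

```python
# Timing sanity check for the reported frame rate and integration time.
frame_period = 1.0 / 125       # 8 ms per frame at 125 fps
t_integration = 1e-3           # 1 ms capture (integration) time
t_shrink = 12e-6               # ~12 us shrink time from simulation
spare = frame_period - (t_integration + t_shrink)
# Capture plus shrink occupy only a small fraction of the frame
# period, leaving most of it for reset, transfer and readout.
```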
Fig. 12. Image on the prototype chip and measurement results: the measurement results are marked as ×, the expected results are marked as .
are similar to the test image shown in Fig. 9(a). The center of the black circle is moved pixel-by-pixel and the output address from the chip is compared with the ideal one. The test results show that the error is within ±1 pixel. The error distribution in the upper-right side of the array is slightly higher than that in the lower-left side due to process variation. The frame rate is 125 frames per second (fps) and the power consumption is 260 mW with a 3.3 V power supply. The specifications of the prototype eye tracker are summarized in Table 1.

5. Conclusion
This paper proposed a single-chip eye tracker using smart CIS pixels that does not require additional peripherals. The proposed system detects the center of the pupil using the shrink operation and WTA circuits. The shrink operation is performed by on-chip pixel-level interaction. The prototype 32 × 32 eye tracker was designed and fabricated in a 0.35-µm CMOS process. The prototype chip test results demonstrated that the proposed system can generate the address indicating the center of the pupil with ±1 pixel error at a 125 fps rate. The pixel size and power consumption of the prototype eye tracker are still large, and there is room to reduce the pixel size further. The other problem encountered with the prototype chip is the low QE at infrared light, which can be improved by implementing an n−-well/p−-substrate diode instead of the n+/p−-substrate one.
Fig. 13. The measured error map of the fabricated eye tracker ( : no error, : ±1 pixel error).

The proposed eye tracker is intended to be used as a pointing device in conjunction with a Head Mounted Display.

Acknowledgment

This work has been supported by the Ministry of Information & Communications, Korea, under the Information Technology Research Center Support Program.

References

1. T. Miyoshi and A. Murata, "Input device using eye tracker in human-computer interaction," in Proc. 10th IEEE Int. Workshop Robot and Human Interactive Communication, Sept. 2001, pp. 18–21.
2. G. Beach, C. J. Cohen, J. Braun, and G. Moody, "Eye tracker system for use with head mounted displays," in IEEE Int. Conf. Systems, Man, and Cybernetics, vol. 5, Oct. 1998, pp. 4348–4352.
3. K. Iwamoto, S. Katsumata, and K. Tanie, "An eye movement tracking type head mounted display for virtual reality system: evaluation experiments of a prototype system," in IEEE Int. Conf. Systems, Man, and Cybernetics, vol. 1, Oct. 1994, pp. 13–18.
4. Z. Zhiwei, J. Qiang, K. Fujimura, and L. Kuangchih, "Combining Kalman filtering and mean shift for real time eye tracking under active IR illumination," in Proc. 16th Int. Conf. Pattern Recognition, vol. 4, 2002, pp. 318–321.
5. T. Oya, H. Hashimoto, and F. Harashima, "Active eye sensing system-predictive filtering for visual tracking," in Proc. Int. Conf. Industrial Electronics, Control, and Instrumentation, vol. 3, Nov. 1993, pp. 1718–1723.
6. A. Graupner, J. Schreiter, S. Getzlaff, and R. Schüffny, "CMOS image sensor with mixed-signal processor array." IEEE J. Solid-State Circuits, vol. 38, pp. 948–957, 2003.
7. Y. Muramatsu, S. Kurosawa, M. Furumiya, H. Ohkubo, and Y. Nakashiba, "A signal-processing CMOS image sensor using a simple analog operation." IEEE J. Solid-State Circuits, vol. 38, pp. 101–106, 2003.
8. M. Schanz, W. Brochherde, R. Hauschild, B. J. Hosticka, and M. Schwarz, "Smart CMOS image sensor arrays." IEEE Trans. Electron Devices, vol. 44, pp. 1699–1705, Oct. 1997.
9. Y. Ni and J. Guan, "A 256 × 256 pixel smart CMOS image sensor for line-based stereo vision applications." IEEE J. Solid-State Circuits, vol. 35, pp. 1055–1061, 2000.
10. S. Espejo, A. Rodríguez-Vázquez, R. Domínguez-Castro, J. L. Huertas, and E. Sánchez-Sinencio, "Smart-pixel cellular neural networks in analog current-mode CMOS technology." IEEE J. Solid-State Circuits, vol. 29, pp. 895–905, 1994.
11. M. Schwarz, R. Hauschild, B. J. Hosticka, J. Huppertz, T. Kneip, S. Kolnsberg, L. Ewe, and H. K. Trieu, "Single-chip CMOS image sensors for a retina implant system." IEEE Trans. Circuits Syst., vol. 46, pp. 870–877, 1999.
12. L. O. Chua and L. Yang, "Cellular neural networks: theory." IEEE Trans. Circuits Syst., vol. 35, pp. 1257–1272, 1988.
13. L. O. Chua and L. Yang, "Cellular neural networks: applications." IEEE Trans. Circuits Syst., vol. 35, pp. 1273–1290, 1988.
14. A. Fish, D. Turchin, and O. Yadid-Pecht, "An APS with 2D winner-take-all selection employing adaptive spatial filtering and false alarm reduction." IEEE Trans. Electron Devices, vol. 50, pp. 159–165, 2003.
15. T. Serrano-Gotarredona and B. Linares-Barranco, "A high-precision current-mode WTA-MAX circuit with multichip capability." IEEE J. Solid-State Circuits, vol. 33, pp. 280–286, 1998.
16. I. E. Opris, "Analog rank extractors." IEEE Trans. Circuits Syst., vol. 44, no. 12, pp. 1114–1121, 1997.
17. J. Choi and B. J. Sheu, "A high-precision VLSI winner-take-all circuit for self-organizing neural networks." IEEE J. Solid-State Circuits, vol. 28, no. 5, pp. 576–584, 1993.
Dongsoo Kim was born in 1976. He received the B.S. and M.S. degrees in electrical and electronic engineering from Yonsei University, Seoul, Korea, in 2001 and 2004. He is currently working toward the Ph.D. degree at Yonsei University. His research interest includes CMOS image sensors.
Seunghyun Lim was born in 1976. He received the B.S. degree in electrical and electronic engineering from Yonsei University, Seoul, Korea, in 2003. He is currently working toward the M.S. degree at Yonsei University. His research interest includes CMOS image sensors.
Gunhee Han was born in 1965. He received the B.S. degree from Yonsei University, Seoul, Korea, in 1990 and the Ph.D. degree from Texas A&M University, College Station, in 1997. He was with Texas A&M until 1998. He is currently an Associate Professor in the Department of Electrical and Electronic Engineering at Yonsei University. His research interests include CMOS image sensors, high-speed serial communication and ΣΔ modulators.