Facial pose and gaze point are fundamental to any visually directed human-machine interface. In this paper we propose a system capable of tracking a face and estimating both the 3-D pose and the gaze point in real time from a video stream of the head. This is achieved by combining a 3-D model with multiple-triplet triangulation of feature positions under an assumed affine projection. Feature-based tracking makes it possible to calculate a 3-D eye-gaze direction vector even under head rotation, using only a monocular camera. The system is also able to initialise the feature tracking automatically and to recover from total tracking failures, which can occur when the person becomes occluded or temporarily leaves the image.
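As a rough illustration of the final step, a minimal sketch of how a 3-D gaze direction might be obtained once head pose and eye features are known: the iris position relative to the eyeball centre gives a gaze vector in the head frame, which the estimated head rotation maps into camera coordinates. The function name, inputs, and the simple eyeball model here are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def gaze_direction(R, eye_center, iris_pos):
    """Hypothetical sketch of gaze estimation from tracked features.

    R          : 3x3 head rotation matrix (head frame -> camera frame),
                 e.g. recovered by pose estimation from tracked features
    eye_center : 3-vector, eyeball centre in head coordinates
    iris_pos   : 3-vector, iris centre in head coordinates
    Returns a unit 3-D gaze direction vector in camera coordinates.
    """
    g = iris_pos - eye_center          # gaze vector in the head frame
    g = g / np.linalg.norm(g)          # normalise to unit length
    return R @ g                       # rotate into the camera frame

# With an identity head pose and the iris directly in front of the
# eyeball centre, the gaze points straight along -z toward the camera:
g = gaze_direction(np.eye(3), np.zeros(3), np.array([0.0, 0.0, -1.0]))
```

Because the gaze vector is computed in the head frame first, head rotation is factored out before the direction is expressed in camera coordinates, which is what allows gaze estimation to remain valid as the head moves.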