Producing facial expressions is essential for realistically animating a virtual actor, yet their subtlety makes them difficult to create interactively. Realistic facial expressions are therefore often produced automatically by capturing the expressions of a puppeteer with a video camera and mapping them onto the actor. Capturing facial expressions requires not only extracting feature points but also tracking head motion from the input image. Because the image is strongly affected by lighting conditions and noise, an important issue is to improve both the stability and the performance of the capturing process. This thesis describes a real-time facial expression capture method that tracks both the movements of facial feature points and the head motion directly from the input image.
In the CMYK and HSV color spaces, the M and H values of the lips differ from those of the skin, and the K and V values of the eyes and eyebrows also differ from those of the skin. We therefore transform the input image into a color space composed of the M, K, H, and V values, which makes facial feature segmentation easier. To represent the shape of each facial feature, we use a contour model based on a cubic Bezier curve, a simple parameterized curve. The model fits the curve to the contour of the feature, which captures the shape of each facial feature robustly even in a noisy image.
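As an illustration, a minimal NumPy sketch of such a channel transform might look as follows. The thesis does not spell out its conversion formulas, so the standard CMYK and HSV definitions are assumed here:

```python
import numpy as np

def rgb_to_mkhv(rgb):
    """Convert an RGB image (floats in [0, 1], shape HxWx3) into the
    M, K, H, V channels used for facial feature segmentation.
    M and K follow the standard CMYK conversion; H and V follow HSV.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]

    v = rgb.max(axis=-1)                      # HSV value V (max channel)
    k = 1.0 - v                               # K of CMYK (darkness)
    safe_v = np.maximum(v, 1e-8)
    m = np.where(v > 0, (v - g) / safe_v, 0.0)  # M of CMYK: (1-G-K)/(1-K)

    # HSV hue H in [0, 1): piecewise formula on whichever channel is max.
    c = v - rgb.min(axis=-1)                  # chroma
    safe_c = np.maximum(c, 1e-8)
    h = np.where(v == r, ((g - b) / safe_c) % 6.0,
        np.where(v == g, (b - r) / safe_c + 2.0,
                         (r - g) / safe_c + 4.0)) / 6.0
    h = np.where(c == 0, 0.0, h)              # gray pixels: hue undefined

    return np.stack([m, k, h, v], axis=-1)
```

Lips could then be segmented by thresholding the M channel, and eyes or eyebrows by thresholding K or V; the thresholds themselves depend on the subject and lighting and are not given here.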
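The sketch below shows one standard way to fit a cubic Bezier curve to ordered contour points: assign parameters by chord length and solve for the four control points by linear least squares. The thesis's actual fitting procedure, for example how the curve is iteratively snapped to segmented feature boundaries, may differ:

```python
import numpy as np

def bernstein3(t):
    """Cubic Bernstein basis evaluated at parameters t (shape N -> Nx4)."""
    t = np.asarray(t)
    return np.stack([(1 - t) ** 3,
                     3 * (1 - t) ** 2 * t,
                     3 * (1 - t) * t ** 2,
                     t ** 3], axis=-1)

def fit_cubic_bezier(points):
    """Least-squares fit of a single cubic Bezier curve to ordered
    2D contour points (Nx2). Returns the 4x2 control points.
    """
    points = np.asarray(points, dtype=float)
    # Chord-length parameterization in [0, 1].
    d = np.r_[0.0, np.cumsum(np.linalg.norm(np.diff(points, axis=0), axis=1))]
    t = d / d[-1]
    A = bernstein3(t)                       # N x 4 design matrix
    ctrl, *_ = np.linalg.lstsq(A, points, rcond=None)
    return ctrl
```

Because the four control points enter the curve linearly, the fit reduces to a small linear system per feature, which is cheap enough for real-time use and insensitive to a few outlier contour pixels.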
To track head motion, we choose a number of points on the face that remain fixed under changes of facial expression. We select, in turn, every triple of these points that is not collinear and track the triple's 3D positions. It is well known that the 3D positions of such points are not uniquely determined from a single image; we use the coherency of head motion to choose the correct solution. Finally, we combine the tracking results of all triples for robust head tracking.
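Each triple's pose can be recovered with, for example, a weak-perspective three-point solver, which yields two depth-reflected candidate solutions. The sketch below covers only the coherency-based disambiguation and one plausible combination of the per-triple rotations (a chordal mean); both are assumptions about how the steps might be implemented, not the thesis's exact method:

```python
import numpy as np

def rotation_distance(R1, R2):
    """Angle (radians) of the relative rotation between R1 and R2."""
    cos = (np.trace(R1.T @ R2) - 1.0) / 2.0
    return np.arccos(np.clip(cos, -1.0, 1.0))

def select_coherent(candidates, prev_R):
    """From the two ambiguous pose solutions of one triple, keep the
    one closest to the previous frame's head rotation (coherency)."""
    return min(candidates, key=lambda R: rotation_distance(R, prev_R))

def combine_rotations(rotations):
    """Combine per-triple rotations into a single robust estimate via
    the chordal mean: average the matrices, reproject onto SO(3)."""
    M = sum(rotations) / len(rotations)
    U, _, Vt = np.linalg.svd(M)
    R = U @ Vt
    if np.linalg.det(R) < 0:            # keep a proper rotation
        U[:, -1] *= -1
        R = U @ Vt
    return R
```

Averaging over many triples suppresses the error contributed by any single noisy triple, which is what makes the combined head tracking robust.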