This thesis presents new approaches to three essential techniques in facial image analysis: facial feature localization, face recognition, and facial expression recognition using local appearance descriptors. Such techniques have a large number of applications, including security, person verification, internet communication, and computer entertainment. Recent advances in automated face detection and tracking, facial feature extraction, pattern recognition, and machine learning have made it possible to develop automatic face analysis systems that address these applications. However, successful application under real-world conditions remains a challenge, since face images are subject to a wide range of variations. These include pose or view angle, illumination, occlusion, facial expression, time delay between image acquisitions, and individual differences.

To make face image processing systems robust to such practical conditions, this thesis develops novel methods for facial feature localization, face recognition, and facial expression recognition, and tests them on a large number of face images from publicly available databases to verify their efficiency and superiority. This work does not follow conventional approaches such as subspace-based methods. Instead, it modifies local appearance descriptors, which have been used successfully in object recognition in recent years, and applies them to facial feature localization and extraction. It is the first attempt to systematically use local appearance descriptors in facial feature localization, face recognition, and facial expression recognition at the same time.

Two new algorithms based on the statistical shape model and local appearance descriptors (the SIFT descriptor and LGBP features) are proposed: the SIFT-ASM algorithm and the LGBP-ASM algorithm.
The SIFT descriptor and LGBP features are introduced to describe the local appearance features around facial landmark points for the statistical shape model. Moreover, GentleBoost classifiers are trained on the local appearance features to avoid the least-squares minimization based on a Gaussian assumption in the original statistical shape model method. Experimental results on more than 2000 face images in the XM2VTS and Softpia Japan databases show that the proposed methods significantly improve the accuracy and robustness of facial feature localization.

For face recognition, this work focuses on solving practical problems in real-world face identification systems: face images under varying facial expressions, strong non-uniform lighting, partial occlusion, time delay between image acquisitions, and so on. A block-based bag of words method is proposed for robust face recognition. This work applies the bag of words method to face recognition for the first time, using it to extract discriminative local facial features. Moreover, the proposed method provides holistic spatial information about the different local facial regions at the same time. Using only a single frame with neutral expression per person for training, the proposed method successfully deals with the difficult conditions mentioned above and achieves the best face recognition performance to date on the AR database. The average face recognition rates on the standard set and darkened set of the XM2VTS database also outperform other recent works.

Facial expression recognition requires more subtle and discriminative facial feature extraction than face recognition. A novel framework for extracting appearance and shape information is developed for facial expression recognition.
A facial-component-based bag of words method is presented to extract local facial appearance changes while maintaining the holistic characteristics; similarly, a facial-component-based PHOG descriptor is proposed to extract local face shape while enhancing the spatial information. This method makes it possible, for the first time, to use bag of words methods and local descriptors in facial expression recognition. The decision-level fusion of the extracted appearance and shape information achieves an average recognition rate of 96.33% on the Cohn-Kanade database, which outperforms state-of-the-art research works.

The thesis is organized as follows. Chapter 1 first gives an overview of the research background of face image analysis, especially facial feature localization, face recognition, and facial expression recognition, and then introduces the research work of the thesis. Chapter 2 reviews the literature on facial feature localization, specifically the Active Shape Model (ASM) method, and on local appearance descriptors. Chapter 3 presents two novel ASM-based algorithms using the SIFT descriptor and LGBP features. GentleBoost classifiers are also applied to replace the least-squares minimization based on a Gaussian assumption in the original ASM method. The two methods are tested on the XM2VTS and Softpia Japan databases, and the results show that they significantly outperform the original ASM method for facial feature localization. Chapter 4 first gives an overview of previous work on face recognition and analyzes the problems in practical face recognition systems. A block-based bag of words (BBoW) method for robust face recognition under different real-world conditions is then proposed. Experimental results on the AR and XM2VTS databases are given to show the efficiency and superiority of the proposed method. In Chapter 5, previous work on facial expression recognition is first introduced.
Then the appearance and shape information extraction method is presented, and experimental results on the Cohn-Kanade database are given to show that the proposed method outperforms state-of-the-art works. Finally, the conclusions and future work of this thesis are given.