In this paper, we develop new techniques for model-based reconstruction of a human face that can be animated in real time using both video and audio inputs. We address three issues: (1) a new semiautomatic method for reconstructing a 3D facial model from three pictures (one front view and two side views, each side view orthogonal to the front view) using our Contractive Deformation Model; (2) a new multi-directional technique for texture mapping; (3) animation of the newly constructed face driven by both video and audio inputs. Our system is part of an effort to develop STODE, an embodied conversational agent for Chinese speech training for children with prelingual deafness, built in collaboration with the Nanjing Oral School under grants from the National Natural Science Foundation of China. It integrates audio and video inputs and produces synchronized visual and acoustic output in real time, helping deaf children overcome two major difficulties in speech learning: confusion between phonemes, and timing within words and syllables.