We present an analysis/synthesis coding scheme for visual telephony, based on a wire-frame and simulated-muscle model of the human head. Anatomically accurate muscles control facial synthesis from a stored texture map, providing economical representation of human facial expressions. Our main contribution is the development of a steepest-descent analysis algorithm that accurately and robustly tracks head movement and facial expression in a moving sequence, in terms of muscle parameters. Coding of the head part of the "Miss America" sequence is achieved at below 1000 bits/frame, with most of the data allocated to texture updates for the eyes and mouth.