An endpoint detection algorithm is used to segment audio-visual Malay utterances. An audio-visual Malay speech database of subjects uttering numerical digits is used. Synchronization between the video frames and the audio signal is taken into consideration for audio-visual speech processing. The proposed system is able to group together the individual syllables that make up each uttered Malay digit.
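
The abstract does not say which endpoint detection algorithm is used, so the following is only a minimal sketch under stated assumptions: short-time energy thresholding stands in for the detector, active regions separated by a short pause are merged so that the syllables of one digit end up in a single segment, and audio sample indices are mapped to video frame indices at an assumed fixed frame rate. The function names, the 20 ms frame length, the 10% energy threshold, the 100 ms maximum pause and the 25 fps video rate are all hypothetical choices made for illustration, not values from the paper.

```python
import numpy as np


def detect_endpoints(signal, sample_rate, frame_ms=20.0,
                     threshold_ratio=0.1, max_gap_ms=100.0):
    """Return (start_sample, end_sample) pairs for detected utterances.

    Assumption: short-time energy thresholding, with the threshold set
    relative to the loudest frame. The paper's detector may differ.
    """
    frame_len = int(sample_rate * frame_ms / 1000.0)
    n_frames = len(signal) // frame_len
    if frame_len == 0 or n_frames == 0:
        return []

    # Split the signal into non-overlapping frames and compute mean energy.
    samples = np.asarray(signal[:n_frames * frame_len], dtype=float)
    frames = samples.reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)
    active = energy > threshold_ratio * energy.max()

    # Collect runs of consecutive active frames as [start, end) pairs.
    runs = []
    start = None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, n_frames))

    # Merge runs separated by a short pause, so that the syllables of
    # one uttered digit are grouped into a single segment.
    max_gap = int(round(max_gap_ms / frame_ms))
    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], run[1])
        else:
            merged.append(run)

    return [(s * frame_len, e * frame_len) for s, e in merged]


def samples_to_video_frames(start_sample, end_sample, sample_rate, fps=25.0):
    """Map an audio segment to the video frame range that covers it.

    Assumption: audio and video share a common time origin and the
    video has a constant frame rate (25 fps is a placeholder).
    """
    first = int(np.floor(start_sample * fps / sample_rate))
    last = int(np.ceil(end_sample * fps / sample_rate))
    return first, last
```

In this sketch, `max_gap_ms` is the parameter that controls syllable grouping: a pause shorter than this value is treated as an intra-word gap, while a longer pause separates two digits. A suitable value would have to be tuned on the actual recordings.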