A two-stage recognition scheme is established, in which phonetic recognition is followed by prosodic recognition. The phonetic recognizer is built from 21 initial and 37 final context-independent HMMs. The prosodic recognizer uses 175 context-dependent prosodic HMMs to model the complicated tone behavior across all possible tone concatenations. Five anti-prosodic HMMs, each corresponding to one lexical tone, are constructed to enhance the discrimination among the prosodic HMMs. The system was evaluated in speaker-dependent mode on a vocabulary of 30,000 words. The experimental results show that using the prosodic information improved the recognition rate from 80.3% to 86.7%.
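To put the reported gain in perspective, the two recognition rates can be converted to error rates and compared. The short sketch below is only an illustration of that arithmetic; the function name `relative_error_reduction` is hypothetical and is not taken from the paper, and it assumes the recognition rates are plain accuracy percentages.

```python
def relative_error_reduction(acc_before: float, acc_after: float) -> float:
    """Return the fraction of errors removed when accuracy rises.

    Both arguments are accuracies in percent (assumed interpretation of
    the "recognition rate" reported in the abstract).
    """
    err_before = 100.0 - acc_before  # error rate without prosodic information
    err_after = 100.0 - acc_after    # error rate with prosodic information
    if err_before <= 0:
        raise ValueError("baseline accuracy must be below 100%")
    return (err_before - err_after) / err_before


# Figures reported in the abstract: 80.3% -> 86.7%.
rer = relative_error_reduction(80.3, 86.7)
print(f"{rer:.1%}")  # prints "32.5%"
```

Under this reading, the error rate falls from 19.7% to 13.3%, an absolute gain of 6.4 points, which corresponds to removing roughly a third of the baseline errors.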