Many experimental results in Chinese speech recognition show that recognizer performance improves markedly when the effect of tone is taken into account. A listener can perceive the tone independently of the syllable when different syllables are uttered with the same tone pattern and, likewise, can perceive the syllable independently of the tone when the same syllable is uttered with different tone patterns. Because syllable and tone are independent in auditory perception and in phonetics, the speech features of the syllable should in principle be separable from those of the tone. In practice, however, it is extremely difficult to extract these features separately from the speech signal with existing algorithms, so the speech representations commonly used in Chinese speech recognition contain the features of both syllables and tones. Chinese speech recognition therefore requires models with a new structure, in which the syllable features and the tone features of a toned syllable are modeled by two distinct models, one for the syllable and one for the tone, and in which these two models are intercoupled. In this paper, building on the basic framework of hidden Markov models (HMMs), the authors describe the relations between syllable HMMs and tone HMMs by introducing intercoupling probabilities, and propose a two-level intercoupling HMM. A Baum-Welch algorithm was developed to re-estimate the model parameters. To reduce the amount of computation in recognition, several approaches are suggested for simplifying the recognition scheme. The proposed approach increased recognition accuracy from 86.7% to 92.2% on the training set and from 82.9% to 87.3% on an independent test set.
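The reported accuracy gains can also be read as relative reductions in error rate, which makes the training-set and test-set improvements easier to compare. The short sketch below does that arithmetic. It assumes the error rate is simply 100% minus the reported accuracy; the function name `relative_error_reduction` is illustrative, and the resulting percentages are derived here, not figures stated by the authors.

```python
def relative_error_reduction(acc_before, acc_after):
    """Relative drop in error rate, given two accuracies in percent.

    Assumption: error rate = 100 - accuracy (no rejections or other outcomes).
    """
    err_before = 100.0 - acc_before
    err_after = 100.0 - acc_after
    return (err_before - err_after) / err_before


# Accuracies quoted in the abstract (baseline -> proposed approach).
train = relative_error_reduction(86.7, 92.2)
test = relative_error_reduction(82.9, 87.3)

print(f"training set: {train:.1%} fewer errors")
print(f"test set:     {test:.1%} fewer errors")
```

Under that assumption, the error rate falls from 13.3% to 7.8% on the training set and from 17.1% to 12.7% on the test set, so the relative gain is noticeably smaller on unseen data.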