Studies the problem of phonetic modeling for continuous Mandarin speech recognition by providing a systematic performance comparison of systems based on the following primitive speech units: syllables, demi-syllables (initials and finals), context-independent phones, left- or right-context-dependent phones (diphones), and left-and-right context-dependent phones (triphones). In our speaker-dependent continuous speech recognition experiments, a generalized triphone system achieved the best performance of all the systems compared. This result contrasts with most other Mandarin speech recognition systems, which have been based on demi-syllable units.
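To make the difference between the phone-based unit types concrete, the following minimal sketch derives context-independent, left-context, right-context, and triphone labels from a single phone sequence. It is an illustration only: the function name `context_units`, the `sil` boundary symbol, the `l-p+r` label notation, and the phone split of the syllable "zhong" are all assumptions, not taken from the systems compared here. It also does not cover the clustering step that a generalized triphone system would apply to merge similar contexts.

```python
def context_units(phones, boundary="sil"):
    """Expand one phone sequence into four hypothetical unit inventories.

    Returns (mono, left, right, tri):
      mono  - context-independent phones
      left  - left-context-dependent phones, written "l-p"
      right - right-context-dependent phones, written "p+r"
      tri   - left-and-right context-dependent phones, written "l-p+r"
    The boundary symbol pads both ends of the utterance (an assumption).
    """
    padded = [boundary] + list(phones) + [boundary]
    inner = range(1, len(padded) - 1)  # indices of the real phones

    mono = list(phones)
    left = [f"{padded[i - 1]}-{padded[i]}" for i in inner]
    right = [f"{padded[i]}+{padded[i + 1]}" for i in inner]
    tri = [f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}" for i in inner]
    return mono, left, right, tri


# Hypothetical phone split of the syllable "zhong" (initial "zh", final "ong").
mono, left, right, tri = context_units(["zh", "o", "ng"])
print(tri)
```

The number of distinct labels grows with the amount of context each unit carries, which is why triphone inventories are usually reduced by sharing models across similar contexts.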