We propose a phoneme-based speech unit model, called the semi-continuous stochastic trajectory model (SC-STM), which generalizes our stochastic trajectory models (STMs). Like STMs, SC-STMs model speech segments (called trajectories) in their parameter space, and can therefore handle segmental information, which is critical for large-vocabulary continuous speech recognition. Compared to STMs, SC-STMs improve the resolution of the trajectory modeling while keeping the number of free parameters moderate by sharing state probability density functions. The SC-STM can therefore maintain a good trade-off between detailed acoustic modeling and limited training data. We tested the idea on a 2010-word, speaker-dependent, continuous speech database. Preliminary results show that the SC-STM gives a word accuracy close to that of the STM, without using the heuristic techniques that enhanced the STM.