Prosodic-syntactic boundaries are an information source that can improve both the efficiency and the accuracy of Large Vocabulary Continuous Speech Recognition (LVCSR). This paper presents a study of two effective methods that exploit prosodic boundary information in a multi-pass decoder. We address how the language model affects the setting of the pruning beam width, and how prosodic boundary information can be used to control Cross-word Context Dependent (CCD) models. In the first decoding pass, a dynamic beam search strategy for inner-word and cross-word paths is proposed to reduce the search space efficiently; in the second pass, the CCD models are optimized using prosodic boundary information. Recognition experiments on the Japanese Newspaper Article Sentences (JNAS) 20k-word task, carried out with a multi-pass decoder, demonstrated that the proposed method significantly reduces the search space while improving accuracy.