Effective Acoustic Modeling for Rate-of-Speech Variation in Large Vocabulary Conversational Speech Recognition

Abstract

We investigate several variants of speech-rate-dependent acoustic models for large-vocabulary conversational speech recognition, in the framework of combining rate-specific models in decoding to compensate for speech rate variation. We study two basic approaches to combining rate-specific models: one combines models at the pronunciation level and the other at the HMM state level. Furthermore, we investigate the influence of different numbers of rate-of-speech classes and different parameter tying schemes. Experiments on the Switchboard database, using SRI's DECIPHER recognition system, show that rate-dependent acoustic modeling resulted in a 2% relative word error rate reduction over a rate-independent baseline, and that the pronunciation-level constraint, Gaussian sharing between rate-specific models, and a well-chosen number of rate-of-speech classes are all important for best performance.

Bibliographic record

Source: International Conference on Spoken Language Processing, October 4-8, 2004, Jeju (KR), pp. 401-404 (4 pages)
Conference location: Jeju (KR)
Authors: Jing Zheng; Horacio Franco; Andreas Stolcke
Affiliation: Speech Technology and Research Lab, SRI International, 333 Ravenswood Ave., Menlo Park, CA 94025
Original format: PDF
Language: English
CLC classification: Applied linguistics

