IEEE Transactions on Speech and Audio Processing

Automatic transcription of conversational telephone speech

Abstract

This paper discusses the Cambridge University HTK (CU-HTK) system for the automatic transcription of conversational telephone speech. A detailed discussion of the most important techniques in front-end processing, acoustic modeling and model training, and language and pronunciation modeling is presented. These include conversation-side-based cepstral normalization, vocal tract length normalization, heteroscedastic linear discriminant analysis for feature projection, minimum phone error training and speaker adaptive training, lattice-based model adaptation, confusion-network-based decoding and confidence score estimation, pronunciation selection, language model interpolation, and class-based language models. The transcription system developed for participation in the 2002 NIST Rich Transcription evaluations of English conversational telephone speech data is presented in detail. In this evaluation the CU-HTK system gave an overall word error rate of 23.9%, which was the best performance by a statistically significant margin. Further details on the derivation of faster systems with moderate performance degradation are discussed in the context of the 2002 CU-HTK 10 × RT conversational speech transcription system.
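The headline figure in the abstract is a word error rate (WER), conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the reference transcript into the system output, divided by the number of reference words. The following is a minimal sketch of that definition using a standard Levenshtein alignment over whitespace-separated words. The function name `word_error_rate` is hypothetical, and the text normalization and scoring rules applied in the official NIST evaluation are not reproduced here, so this illustrates the metric rather than the evaluation procedure itself.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Assumption: words are split on whitespace with no text normalization,
    which is simpler than the scoring used in formal evaluations.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One reference word ("the") is dropped: 1 error over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because the denominator is the reference length, insertions can push the value above 1.0, which is why WER is reported as an error rate rather than as an accuracy.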