Journal: IEICE Transactions on Information and Systems

Incremental Language Modeling for Automatic Transcription of Broadcast News

Abstract

In this paper, we address the task of incremental language modeling for automatic transcription of broadcast news speech. Daily broadcast news naturally contains new words that are not in the lexicon of the speech recognition system but are important for downstream applications such as information retrieval or machine translation. To recognize those new words, the lexicon and the language model of the speech recognition system need to be updated periodically. We propose a method of estimating a list of words to be added to the lexicon based on time-series text data. The experimental results on the RT04 Broadcast News data and other TV audio data showed that this method provided an impressive and stable reduction in both out-of-vocabulary rates and speech recognition word error rates.
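
To make the out-of-vocabulary (OOV) rate mentioned in the abstract concrete, the following is a minimal Python sketch. The abstract does not describe how the authors estimate the word list, so the selection rule here (a simple count threshold over recent daily texts) is purely an illustrative assumption, and the function names `oov_rate` and `candidate_new_words` as well as the `min_count` parameter are hypothetical.

```python
from collections import Counter


def oov_rate(tokens, lexicon):
    """Fraction of running tokens that are not in the lexicon."""
    if not tokens:
        return 0.0
    missing = sum(1 for tok in tokens if tok not in lexicon)
    return missing / len(tokens)


def candidate_new_words(daily_texts, lexicon, min_count=2):
    """Assumed selection rule (not the paper's method): return words that are
    absent from the lexicon and occur at least `min_count` times across the
    recent daily texts, given as a list of token lists ordered by day."""
    counts = Counter(
        tok for day in daily_texts for tok in day if tok not in lexicon
    )
    return sorted(word for word, n in counts.items() if n >= min_count)


# Toy data, invented for illustration only.
lexicon = {"the", "news", "today", "is"}
recent_days = [
    ["the", "tsunami", "news"],
    ["tsunami", "warning", "today"],
]
transcript = ["the", "tsunami", "is", "news"]

rate_before = oov_rate(transcript, lexicon)
new_words = candidate_new_words(recent_days, lexicon)
updated_lexicon = lexicon | set(new_words)
rate_after = oov_rate(transcript, updated_lexicon)
```

In a real system, adding words to the lexicon would also require pronunciations and a re-estimated language model, as the abstract notes; this sketch only shows how a periodic lexicon update can lower the OOV rate on new transcripts.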
