European Conference on Speech Communication and Technology - EUROSPEECH 2003 (INTERSPEECH 2003), vol. 1; 1-4 September 2003; Geneva (CH)

Using Untranscribed User Utterances for Improving Language Models based on Confidence Scoring


Abstract

This paper presents a method for reducing the effort of transcribing user utterances to develop language models for conversational speech recognition when a small number of transcribed and a large number of untranscribed utterances are available. The recognition hypotheses for untranscribed utterances are classified according to their confidence scores such that hypotheses with high confidence are used to enhance language model training. The utterances that receive low confidence can be scheduled to be manually transcribed first to improve the language model. The results of experiments using automatic transcription of the untranscribed user utterances show the proposed methods are effective in achieving improvements in recognition accuracy while reducing the effort required from manual transcription.
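
The confidence-based split described in the abstract can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the `Hypothesis` type, the `split_by_confidence` function, the 0.8 threshold, and the sample utterances are all assumptions made up for this example, and the abstract does not say how confidence scores are computed or which threshold is used.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hypothesis:
    """Hypothetical container for one recognition result."""
    utterance_id: str
    text: str          # recognizer output for an untranscribed utterance
    confidence: float  # assumed to lie in [0, 1]; higher means more reliable


def split_by_confidence(
    hypotheses: List[Hypothesis], threshold: float = 0.8  # assumed value
) -> Tuple[List[str], List[str]]:
    """Return (texts to add to LM training data, utterance ids to transcribe by hand).

    Low-confidence utterances are ordered from least to most confident,
    so the least reliable ones are scheduled for manual transcription first.
    """
    accepted = [h.text for h in hypotheses if h.confidence >= threshold]
    rejected = sorted(
        (h for h in hypotheses if h.confidence < threshold),
        key=lambda h: h.confidence,
    )
    return accepted, [h.utterance_id for h in rejected]


# Made-up sample data, for illustration only.
hyps = [
    Hypothesis("u1", "show me flights to geneva", 0.93),
    Hypothesis("u2", "i want to go to um boston", 0.41),
    Hypothesis("u3", "what is the fare", 0.85),
    Hypothesis("u4", "leaving on monday", 0.62),
]
lm_texts, to_transcribe = split_by_confidence(hyps)
```

In this sketch the accepted texts would be appended to the transcribed corpus before retraining the language model, while the returned ids form the manual transcription queue.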
