Using Untranscribed User Utterances for Improving Language Models based on Confidence Scoring

Abstract

This paper presents a method for reducing the effort of transcribing user utterances to develop language models for conversational speech recognition when a small number of transcribed and a large number of untranscribed utterances are available. The recognition hypotheses for untranscribed utterances are classified according to their confidence scores such that hypotheses with high confidence are used to enhance language model training. The utterances that receive low confidence can be scheduled to be manually transcribed first to improve the language model. The results of experiments using automatic transcription of the untranscribed user utterances show the proposed methods are effective in achieving improvements in recognition accuracy while reducing the effort required from manual transcription.
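
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how confidence scores are computed or which cutoff is used, so the `Hypothesis` record, the `threshold` value of 0.8, and the function names below are assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    utterance_id: str
    text: str          # recognizer's top hypothesis for the utterance
    confidence: float  # utterance-level confidence, assumed to lie in [0, 1]


def split_by_confidence(hyps, threshold=0.8):
    """Partition hypotheses into (accepted, transcription_queue).

    accepted: confidence >= threshold, kept in input order.
    transcription_queue: the rest, lowest confidence first, so the least
    reliable utterances are manually transcribed first.
    The threshold is a hypothetical value, not one taken from the paper.
    """
    accepted = [h for h in hyps if h.confidence >= threshold]
    queue = sorted(
        (h for h in hyps if h.confidence < threshold),
        key=lambda h: h.confidence,
    )
    return accepted, queue


def build_training_corpus(transcribed, accepted):
    """Combine manual transcripts with high-confidence hypothesis texts."""
    return list(transcribed) + [h.text for h in accepted]


# Invented example data, for illustration only.
hyps = [
    Hypothesis("u1", "what is the weather in boston", 0.93),
    Hypothesis("u2", "wet her in austin", 0.41),
    Hypothesis("u3", "show me flights to geneva", 0.85),
    Hypothesis("u4", "uh the um", 0.12),
]
accepted, queue = split_by_confidence(hyps, threshold=0.8)
corpus = build_training_corpus(["will it rain tomorrow"], accepted)
```

Here `corpus` would feed whatever language model trainer is in use, while `queue` lists the utterances to transcribe by hand, starting with the lowest-confidence one.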

Bibliographic details

Source
European Conference on Speech Communication and Technology - EUROSPEECH 2003 (INTERSPEECH 2003), vol. 1; September 1-4, 2003; Geneva (CH) | 2003 | pp. 417-420 | 4 pages
Conference location: Geneva (CH)
Authors
Mikio Nakano; Timothy J. Hazen
Affiliation

NTT Communication Science Laboratories, NTT Corporation, 3-1 Morinosato-Wakamiya, Atsugi, Kanagawa 243-0198, Japan

Original format: PDF
Language: English (eng)
Chinese Library Classification: automatic information theory

Similar literature

1. Speaker identification based on Gaussian mixture model - experiments with Polish language utterances [J]. Adam Dąbrowski, Szymon Drgas, Damian Cetnarowicz. Elektronika. 2008, No. 4.
2. Modelling self-confidence in users of a computer-based system showing unrepresentative design [J]. Briggs P., Dracup C., Burford B. International Journal of Human-Computer Studies. 1998, No. 5.
3. Post-dialogue confidence scoring for unsupervised statistical language model training [J]. Sudoh K., Nakano M. Speech Communication. 2005, No. 4.
4. Using Untranscribed User Utterances for Improving Language Models based on Confidence Scoring [C]. Mikio Nakano, Timothy J. Hazen. International Speech Communication Association (ISCA), European Conference on Speech Communication and Technology. 2003.
5. Will the Introduction of a Critical Questioning Technique and the Toulmin Model Improve the Argumentative Essay Writing Scores of Students in an Eighth Grade English Language Arts Class? [D]. Zimmerbaum, Kate Z. 2014.
6. Fusion with Language Models Improves Spelling Accuracy for ERP-based Brain Computer Interface Spellers [O]. Umut Orhan, Deniz Erdogmus, Brian Roark.
7. Improving Word Alignment with Language Model Based Confidence Scores [O]. Nguyen Bach, Qin Gao, Stephan Vogel. 2009.
