Efficient Discriminative Training of Error Corrective Models Using High-WER Competitors


Abstract

We focus on error corrective models for speech recognition, which select a more accurate word sequence from the word N-best list produced by a speech recognizer. In general, an error corrective model is trained to discriminate the correct word sequence from all the other sequences in each N-best list. However, we show that a more accurate model can be obtained by discriminating between only two word sequences: the correct word sequence and the sequence with the highest word error rate (WER) in the list. We think there are two reasons for this effect: (1) many important error patterns can be obtained, since typical errors frequently appear in word sequences with high WERs, and (2) training with fewer word sequences alleviates the difficulty of parameter estimation. In addition, with our method the model becomes more compact and the training time is significantly reduced. Experiments using the Corpus of Spontaneous Japanese (CSJ) show that our proposed method generates accurate and compact models and, in particular, performs robustly across different tasks and linguistic features.
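The pair selection described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the learning algorithm or the feature set, so the perceptron update, the unigram/bigram feature map, and all function names here are assumptions made for the example. The reference transcript stands in for the "correct word sequence", and the recognizer score that a real error corrective model would typically combine with the correction score is omitted.

```python
from collections import Counter

def word_errors(ref, hyp):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]

def wer(ref, hyp):
    """Word error rate of hyp against ref."""
    return word_errors(ref, hyp) / max(len(ref), 1)

def highest_wer_competitor(ref, nbest):
    """Pick the single N-best entry with the highest WER."""
    return max(nbest, key=lambda hyp: wer(ref, hyp))

def features(words):
    """Hypothetical feature map: unigram and bigram counts."""
    padded = ["<s>"] + list(words) + ["</s>"]
    feats = Counter(("1g", w) for w in words)
    feats.update(("2g", a, b) for a, b in zip(padded, padded[1:]))
    return feats

def score(weights, words):
    return sum(weights.get(f, 0.0) * v for f, v in features(words).items())

def train(data, epochs=5):
    """Perceptron over (reference, highest-WER competitor) pairs only.

    data is a list of (reference, nbest) tuples of word lists.
    The perceptron is an assumed stand-in for the paper's trainer.
    """
    weights = {}
    for _ in range(epochs):
        for ref, nbest in data:
            worst = highest_wer_competitor(ref, nbest)
            if worst == ref:
                continue  # nothing to discriminate against
            if score(weights, ref) <= score(weights, worst):
                for f, v in features(ref).items():
                    weights[f] = weights.get(f, 0.0) + v
                for f, v in features(worst).items():
                    weights[f] = weights.get(f, 0.0) - v
    return weights

def rerank(weights, nbest):
    """Select the highest-scoring word sequence from an N-best list."""
    return max(nbest, key=lambda hyp: score(weights, hyp))

# Toy usage with made-up data
ref = "the cat sat on the mat".split()
nbest = [ref,
         "the cat sat on a mat".split(),
         "a cat sad on a hat".split()]
weights = train([(ref, nbest)])
best = rerank(weights, nbest)
```

Because only one competitor per utterance contributes features, the weight vector touches far fewer n-grams than training against the whole list would, which is consistent with the abstract's claim of a more compact model and shorter training time.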

Bibliographic record

  • Source
    電子情報通信学会技術研究報告 (IEICE Technical Report) | 2008, No. 551 | pp. 99-104 | 6 pages
  • Author affiliations

    NTT Communication Science Laboratories, NTT Corporation Hikaridai 2-4, Seika-cho, Soraku-gun, Kyoto, 619-0237 Japan;

    NTT Communication Science Laboratories, NTT Corporation Hikaridai 2-4, Seika-cho, Soraku-gun, Kyoto, 619-0237 Japan;

    NTT Communication Science Laboratories, NTT Corporation Hikaridai 2-4, Seika-cho, Soraku-gun, Kyoto, 619-0237 Japan;

  • Original format: PDF
  • Text language: English
  • Keywords

    error corrective model; high-WER competitors
  • Date indexed: 2022-08-18 00:37:15
