European Signal Processing Conference

A curriculum learning method for improved noise robustness in automatic speech recognition


Abstract

The performance of automatic speech recognition systems in noisy environments still leaves room for improvement. Speech enhancement or feature enhancement techniques for increasing the noise robustness of these systems usually add components to the recognition system that need careful optimization. In this work, we propose the use of a relatively simple curriculum training strategy called accordion annealing (ACCAN). It uses a multi-stage training schedule where samples at signal-to-noise ratio (SNR) values as low as 0 dB are added first and samples at increasingly higher SNR values are gradually added, up to an SNR value of 50 dB. We also use a method called per-epoch noise mixing (PEM) that generates noisy training samples online during training and thus enables dynamically changing the SNR of our training data. Both the ACCAN and the PEM methods are evaluated on an end-to-end speech recognition pipeline on the Wall Street Journal corpus. ACCAN decreases the average word error rate (WER) on the 20 dB to -10 dB SNR range by up to 31.4% when compared to a conventional multi-condition training method.
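As a rough illustration of the two ideas in the abstract, the following Python sketch mixes noise into clean speech at a chosen SNR on the fly and builds a staged schedule that starts at 0 dB and widens up to 50 dB. It is not the authors' implementation: the function names (`mix_at_snr`, `accan_stages`, `noisy_epoch`), the 5 dB step between stages, and the uniform sampling of SNR levels within a stage are all assumptions made for the example, since the abstract does not specify them.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Target noise power follows from SNR_dB = 10 * log10(P_speech / P_noise).
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    scale = np.sqrt(target_noise_power / noise_power)
    return speech + scale * noise

def accan_stages(low_db=0, high_db=50, step_db=5):
    """Return one list of allowed SNR values per stage (step size is an assumption)."""
    levels = list(range(low_db, high_db + 1, step_db))
    # Stage i keeps every level seen so far and adds the next higher one.
    return [levels[: i + 1] for i in range(len(levels))]

def noisy_epoch(utterances, noise, snr_levels, rng):
    """Per-epoch mixing: draw a fresh SNR and noise segment for every utterance."""
    mixed = []
    for speech in utterances:
        snr_db = rng.choice(snr_levels)
        # Pick a random noise segment with the same length as the utterance.
        start = rng.integers(0, len(noise) - len(speech) + 1)
        segment = noise[start:start + len(speech)]
        mixed.append(mix_at_snr(speech, segment, snr_db))
    return mixed
```

In a training loop under these assumptions, one would iterate over `accan_stages()` and call `noisy_epoch` at the start of every epoch within a stage, so the model never sees the same noisy rendering of an utterance twice.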
