A curriculum learning method for improved noise robustness in automatic speech recognition

Abstract

The performance of automatic speech recognition systems in noisy environments still leaves room for improvement. Speech enhancement and feature enhancement techniques for increasing the noise robustness of these systems usually add components to the recognition system that need careful optimization. In this work, we propose the use of a relatively simple curriculum training strategy called accordion annealing (ACCAN). It uses a multi-stage training schedule in which samples at signal-to-noise ratio (SNR) values as low as 0 dB are added first, and samples at increasingly higher SNR values are gradually added, up to an SNR value of 50 dB. We also use a method called per-epoch noise mixing (PEM) that generates noisy training samples online during training and thus makes it possible to change the SNR of the training data dynamically. Both the ACCAN and the PEM methods are evaluated on an end-to-end speech recognition pipeline on the Wall Street Journal corpus. ACCAN decreases the average word error rate (WER) over the 20 dB to -10 dB SNR range by up to 31.4% compared to a conventional multi-condition training method.
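The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 5 dB step between stages, the uniform draw of an SNR from the current pool, and the random choice of a noise segment are all assumptions made for illustration. The abstract only states that training starts at 0 dB, that higher SNR values are added gradually up to 50 dB, and that noisy samples are regenerated online. The noise gain follows from the definition SNR_dB = 10 log10(P_speech / (g^2 P_noise)), solved for g.

```python
import math
import random


def power(signal):
    """Mean power of a signal given as a list of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the requested SNR in dB.

    Assumes both signals have the same length and non-zero power.
    """
    # SNR_dB = 10*log10(P_speech / (g^2 * P_noise))  =>  solve for the gain g.
    gain = math.sqrt(power(speech) / (power(noise) * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]


def accan_stages(min_snr_db=0, max_snr_db=50, step_db=5):
    """SNR pool for each training stage: lowest SNR first, then widen upward.

    The 5 dB step is an assumed value; the abstract gives only the end points.
    """
    levels = list(range(min_snr_db, max_snr_db + 1, step_db))
    return [levels[: i + 1] for i in range(len(levels))]


def pem_epoch(clean_utterances, noise, snr_pool, rng):
    """Remix every clean utterance for one epoch (hypothetical PEM helper).

    Each utterance gets a freshly chosen noise segment and a freshly drawn SNR,
    so the noisy training set differs from epoch to epoch.
    """
    noisy = []
    for speech in clean_utterances:
        start = rng.randint(0, len(noise) - len(speech))  # inclusive bounds
        segment = noise[start:start + len(speech)]
        snr_db = rng.choice(snr_pool)
        noisy.append(mix_at_snr(speech, segment, snr_db))
    return noisy
```

A training loop under these assumptions would iterate over `accan_stages()` and call `pem_epoch` once per epoch with the current stage's pool, so that early stages see only 0 dB mixtures and the final stage draws from the full 0 dB to 50 dB range.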


Bibliographic details

Source
European Signal Processing Conference, 2017, pp. 548-552 (5 pages)
Authors
Stefan Braun; Daniel Neil; Shih-Chii Liu
Original format
PDF
Keywords
Training; Signal to noise ratio; Noise robustness; Training data; Noise measurement; Feature extraction; Neural networks

