Pattern Recognition Letters

Increasing the robustness of CNN acoustic models using autoregressive moving average spectrogram features and channel dropout



Abstract

Developing automatic speech recognition systems that are robust to mismatched and noisy channel conditions is a challenging problem, especially when the training and the test conditions are different. Here, we seek to increase the robustness of convolutional neural network (CNN) acoustic models under such circumstances by combining two methods. Firstly, we propose an improved version of input dropout, which exploits the special structure of the input time-frequency representation. Instead of just dropping out random 'pixels' of the spectrogram, the proposed channel dropout approach discards whole spectral channels. We expect that this dropout strategy will force the network to rely less on the whole spectrum, and make it more robust to channel mismatches and narrow-band noise. Secondly, we replace the standard mel-spectrogram input representation with the autoregressive moving average (ARMA) spectrogram, which was recently shown to outperform the former under mismatched train-test conditions. In our experiments on the Aurora-4 database, the proposed channel dropout method attained relative word error rate reductions of 16% with ARMA features (an absolute improvement of 3%), and 20% with FBANK features (an absolute improvement of 7%) over the baseline CNN, when using the clean training scenario. (C) 2017 Elsevier B.V. All rights reserved.
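The core idea of channel dropout described in the abstract — zeroing out entire frequency channels of the time-frequency input rather than independent "pixels" — can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function name `channel_dropout`, the per-channel drop probability, and the layout (channels × frames) are illustrative choices, not the paper's exact implementation details (e.g., any rescaling of kept channels or train-only application is not specified here).

```python
import numpy as np

def channel_dropout(spectrogram, drop_prob=0.2, rng=None):
    """Zero out whole spectral channels of a time-frequency input.

    spectrogram: array of shape (n_channels, n_frames), e.g. a mel
                 or ARMA spectrogram with frequency channels as rows.
    drop_prob:   probability that each channel is dropped entirely.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_channels = spectrogram.shape[0]
    # One Bernoulli draw per frequency channel, broadcast over all
    # frames, so a dropped channel is zeroed across the whole utterance
    # window rather than at scattered time-frequency points.
    keep = rng.random(n_channels) >= drop_prob
    return spectrogram * keep[:, None]

# Example: a 40-channel mel-style input with 100 frames.
spec = np.ones((40, 100))
dropped = channel_dropout(spec, drop_prob=0.25,
                          rng=np.random.default_rng(0))
```

Because each channel is either fully kept or fully zeroed, the network cannot rely on any single narrow frequency band, which is the stated motivation for robustness to channel mismatch and narrow-band noise.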


