Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Bridging the Gap Between Monaural Speech Enhancement and Recognition With Distortion-Independent Acoustic Modeling

Abstract

Monaural speech enhancement has made dramatic advances since the introduction of deep learning a few years ago. Although enhanced speech has been demonstrated to have better intelligibility and quality for human listeners, feeding it directly to automatic speech recognition (ASR) systems trained with noisy speech has not produced expected improvements in ASR performance. The lack of an enhancement benefit on recognition, or the gap between monaural speech enhancement and recognition, is often attributed to speech distortions introduced in the enhancement process. In this article, we analyze the distortion problem, compare different acoustic models, and investigate a distortion-independent training scheme for monaural speech recognition. Experimental results suggest that distortion-independent acoustic modeling is able to overcome the distortion problem. Such an acoustic model can also work with speech enhancement models different from the one used during training. Moreover, the models investigated in this paper outperform the previous best system on the CHiME-2 corpus.