Conference: European Signal Processing Conference
Selective Adaptation of End-to-End Speech Recognition using Hybrid CTC/Attention Architecture for Noise Robustness

Abstract

This paper investigates supervised adaptation of end-to-end speech recognition based on the hybrid connectionist temporal classification (CTC)/Attention architecture, with the goal of noise robustness. The components of the architecture, namely the shared encoder, the attention decoder’s long short-term memory (LSTM) layers, and the softmax layers of the CTC part and the attention part, are adapted separately or together using a limited amount of adaptation data. When adapting the shared encoder, we propose to adapt only the connections of the memory cells in the memory blocks of the bidirectional LSTM (BLSTM) layers, both to improve performance and to reduce the time needed to adapt the models. In within-domain and cross-domain adaptation scenarios, experimental results show that adapting end-to-end speech recognition with the hybrid CTC/Attention architecture is effective even when the amount of adaptation data is limited. In cross-domain adaptation, a substantial performance improvement can be achieved with only 2.4 minutes of adaptation data. In both scenarios, adapting only the memory cells of the BLSTM layers in the shared encoder yields comparable or slightly better performance, and requires less adaptation time, than adapting other components or the whole architecture, especially when the amount of adaptation data is 10 minutes or less.
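
The idea of adapting only the memory-cell connections can be illustrated with a small sketch. It is not the authors' implementation: it assumes that the four gate blocks of an LSTM layer are stacked row-wise in the order input, forget, cell, output (the layout used by PyTorch's `nn.LSTM` weights), and it reads "connections of the memory cells" as the rows of the cell block. The names `cell_row_mask` and `masked_sgd_step` are hypothetical helpers written for this illustration.

```python
# Assumed gate stacking order (input, forget, cell, output); other toolkits
# may stack the blocks differently, so this order is an assumption.
GATE_ORDER = ("input", "forget", "cell", "output")


def cell_row_mask(hidden_size, gate_order=GATE_ORDER):
    """Return a 0/1 list over the stacked weight rows, 1 for cell-block rows."""
    start = gate_order.index("cell") * hidden_size
    total_rows = len(gate_order) * hidden_size
    return [1 if start <= r < start + hidden_size else 0 for r in range(total_rows)]


def masked_sgd_step(weight, grad, mask, lr):
    """Plain SGD update applied only to rows whose mask entry is 1.

    Rows with mask 0 (the gate blocks) keep their original values, so only
    the memory-cell connections are adapted.
    """
    return [
        [w - lr * g * m for w, g in zip(w_row, g_row)]
        for w_row, g_row, m in zip(weight, grad, mask)
    ]


if __name__ == "__main__":
    hidden_size = 2
    mask = cell_row_mask(hidden_size)
    # Toy stacked weight matrix: 4 * hidden_size rows, one input column.
    weight = [[1.0] for _ in range(4 * hidden_size)]
    grad = [[1.0] for _ in range(4 * hidden_size)]
    adapted = masked_sgd_step(weight, grad, mask, lr=0.5)
    print(mask)
    print(adapted)
```

Because only one of the four blocks receives updates, roughly a quarter of each adapted weight matrix changes per step, which is one plausible reason such selective adaptation could be faster than updating the whole encoder.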
