IEEE Automatic Speech Recognition and Understanding Workshop

End-to-End Training of a Large Vocabulary End-to-End Speech Recognition System



Abstract

In this paper, we present an end-to-end training framework for building state-of-the-art end-to-end speech recognition systems. Our training system utilizes a cluster of Central Processing Units (CPUs) and Graphics Processing Units (GPUs). Data reading, large-scale data augmentation, and neural network parameter updates are all performed "on-the-fly". We use vocal tract length perturbation [1] and an acoustic simulator [2] for data augmentation. The processed features and labels are sent to the GPU cluster, where the Horovod allreduce approach is employed to train the neural network parameters. We evaluated the effectiveness of our system on the standard LibriSpeech corpus [3] and the 10,000-hour anonymized Bixby English dataset. Our end-to-end speech recognition system built using this training infrastructure showed a 2.44% WER on the test-clean portion of the LibriSpeech test set after applying shallow fusion with a Transformer language model (LM). For the proprietary English Bixby open domain test set, we obtained a WER of 7.92% using a Bidirectional Full Attention (BFA) end-to-end model after applying shallow fusion with an RNN-LM. When the monotonic chunkwise attention (MoChA) based approach is employed for streaming speech recognition, we obtained a WER of 9.95% on the same Bixby open domain test set.
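The Horovod allreduce approach used for parameter updates amounts to averaging each parameter's gradient across all GPU workers before the optimizer step. The sketch below simulates only the mathematical result of that collective operation with NumPy; the actual Horovod library performs a ring-allreduce over the network and overlaps communication with backpropagation, neither of which is modeled here.

```python
import numpy as np

def allreduce_average(worker_grads):
    """Return the element-wise mean of one parameter's gradients
    across workers -- the value every worker holds after allreduce."""
    return np.stack(worker_grads).mean(axis=0)

# Gradients for one parameter tensor from 4 hypothetical GPU workers.
grads = [np.full(3, float(i)) for i in range(4)]  # [0,0,0], [1,1,1], ...

avg = allreduce_average(grads)  # every worker receives [1.5, 1.5, 1.5]
```

After the allreduce, every worker applies the same averaged gradient, so model replicas stay synchronized without a central parameter server.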
