Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Multi-speaker Sequence-to-sequence Speech Synthesis for Data Augmentation in Acoustic-to-word Speech Recognition


Abstract

Acoustic-to-word (A2W) automatic speech recognition (ASR) decodes very quickly with a simple architecture and achieves state-of-the-art performance. However, the A2W model suffers from the out-of-vocabulary (OOV) word problem and cannot use text-only data to improve its language modeling capability. Meanwhile, sequence-to-sequence neural speech synthesis has advanced to the point of achieving naturalness comparable to human speech. We investigate leveraging sequence-to-sequence neural speech synthesis to augment the training data of an ASR system in a target domain. While a speech synthesis model is usually trained with single-speaker data, ASR needs to cover a variety of speakers. In this work, we extend the speech synthesizer so that it can output the speech of many speakers. The multi-speaker speech synthesizer is trained with a large corpus in the source domain and then used to generate acoustic features from texts of the target domain. These synthesized speech features are combined with real speech features of the source domain to train an attention-based A2W model. Experimental results show that the A2W model trained with data from the multi-speaker synthesizer achieved a significant improvement over both the baseline and the model trained with single-speaker synthesis.
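The data flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `Utterance`, `augment_with_synthesis`, and `dummy_synthesize` are hypothetical names, the 40-dimensional frames are an arbitrary placeholder, and a real system would call a trained multi-speaker synthesizer where the stub appears.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence

Features = List[List[float]]  # frames x feature dimensions


@dataclass
class Utterance:
    features: Features
    words: List[str]   # word-level targets for the A2W model
    speaker: str
    synthetic: bool    # True if the features came from the synthesizer


def augment_with_synthesis(
    real_source: Sequence[Utterance],
    target_texts: Sequence[str],
    speakers: Sequence[str],
    synthesize: Callable[[str, str], Features],
    seed: int = 0,
) -> List[Utterance]:
    """Combine real source-domain utterances with synthesized target-domain ones."""
    rng = random.Random(seed)
    synthetic = []
    for text in target_texts:
        # Assumption: one randomly chosen speaker per target-domain sentence.
        speaker = rng.choice(list(speakers))
        feats = synthesize(text, speaker)
        synthetic.append(Utterance(feats, text.split(), speaker, True))
    combined = list(real_source) + synthetic
    rng.shuffle(combined)  # interleave real and synthesized examples
    return combined


def dummy_synthesize(text: str, speaker: str) -> Features:
    """Hypothetical stand-in for a trained multi-speaker synthesizer.

    Returns 5 all-zero frames per word; the sizes carry no meaning.
    """
    n_frames = 5 * len(text.split())
    return [[0.0] * 40 for _ in range(n_frames)]
```

The combined list would then be fed to whatever training loop the attention-based A2W model uses. Sampling a different speaker for each sentence is one simple way to expose the recognizer to varied voices; the abstract does not say how speakers were assigned.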
