
Chinese Speech Synthesis System Based on End to End


Abstract

This paper describes an end-to-end Chinese speech synthesis scheme: a neural network architecture, built on an improved Tacotron 2, that synthesizes speech directly from text. To improve the original seq2seq model, a multi-attention mechanism replaces the original position-based attention mechanism, which improves the sound quality of the synthesized speech. A prosodic style encoding module is added so that the synthesized speech better conveys prosodic and stylistic characteristics. An LPCNet vocoder then replaces the original WaveNet vocoder for generating the time-domain waveform, which improves synthesis efficiency and reduces synthesis time. Experiments show that the system is effective and fast, and that the synthesized speech is natural with good prosodic style.
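
The abstract does not say how the multi-attention mechanism is built. As a rough illustration only, the sketch below assumes it resembles standard multi-head scaled dot-product attention, in which a decoder query attends over encoder frames separately in several feature subspaces. The function names, dimensions, and random inputs are hypothetical, and the learned projection matrices of a real model are omitted for brevity.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector over a memory."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights


def multi_head_attention(query, memory, num_heads):
    """Split features into heads, attend per head, concatenate the contexts.

    Assumption: identity projections stand in for the learned query, key,
    value, and output projections of a trained model.
    """
    d_model = len(query)
    assert d_model % num_heads == 0, "feature size must divide evenly into heads"
    d_head = d_model // num_heads
    context, all_weights = [], []
    for head in range(num_heads):
        lo, hi = head * d_head, (head + 1) * d_head
        keys = [frame[lo:hi] for frame in memory]
        # Keys double as values here because projections are omitted.
        head_context, head_weights = attend(query[lo:hi], keys, keys)
        context.extend(head_context)
        all_weights.append(head_weights)
    return context, all_weights


# Hypothetical example: 6 encoder frames and one decoder query, 8 features each.
random.seed(0)
memory = [[random.gauss(0, 1) for _ in range(8)] for _ in range(6)]
query = [random.gauss(0, 1) for _ in range(8)]
context, weights = multi_head_attention(query, memory, num_heads=2)
```

Each head produces its own alignment over the encoder frames, so `weights` holds one distribution per head and `context` has the same size as the query.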
