IEEE International Conference on Acoustics, Speech and Signal Processing

Utterance-Level Sequential Modeling for Deep Gaussian Process Based Speech Synthesis Using Simple Recurrent Unit



Abstract

This paper presents a deep Gaussian process (DGP) model with a recurrent architecture for speech sequence modeling. DGP is a Bayesian deep model that can be trained effectively while accounting for model complexity, and it is a kernel regression model with high expressiveness. Previous studies showed that DGP-based speech synthesis outperformed a neural network-based counterpart when both models used a feed-forward architecture. To improve the naturalness of synthetic speech, in this paper we show that DGP can be applied to utterance-level modeling using recurrent architecture models. We adopt a simple recurrent unit (SRU) for the proposed model to achieve a recurrent architecture, in which fast speech parameter generation is possible thanks to the highly parallelizable nature of SRU. The objective and subjective evaluation results show that the proposed SRU-DGP-based speech synthesis outperforms not only feed-forward DGP but also automatically tuned SRU- and long short-term memory (LSTM)-based neural networks.
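The parallelization property claimed above comes from the SRU design itself: every matrix multiplication depends only on the input sequence, so it can be computed for all time steps at once, leaving only cheap element-wise operations inside the sequential loop. Below is a minimal NumPy sketch of the standard SRU cell (following Lei et al.'s formulation, not the authors' DGP implementation); for simplicity the highway term reuses the projected input, since the raw input dimension may differ from the hidden dimension.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sru_forward(X, W, Wf, bf, Wr, br):
    """Run one SRU layer over an input sequence.

    X: (T, d_in) input sequence; W, Wf, Wr: (d_in, d_out) weights.
    The three matrix products below touch the whole sequence at once,
    which is where SRU gains its speed; only the light element-wise
    recurrence over the state c_t remains sequential.
    """
    Xt = X @ W                 # candidate values x~_t for all t at once
    F = sigmoid(X @ Wf + bf)   # forget gates f_t for all t at once
    R = sigmoid(X @ Wr + br)   # reset gates r_t for all t at once

    T, d_out = Xt.shape
    H = np.zeros((T, d_out))
    c = np.zeros(d_out)        # internal state c_0 = 0
    for t in range(T):         # only element-wise ops in the loop
        c = F[t] * c + (1.0 - F[t]) * Xt[t]
        # highway connection (simplified to use the projected input)
        H[t] = R[t] * np.tanh(c) + (1.0 - R[t]) * Xt[t]
    return H

# toy usage: 5 frames of 4-dim features -> 3-dim hidden sequence
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))
H = sru_forward(X,
                rng.standard_normal((4, 3)),
                rng.standard_normal((4, 3)), np.zeros(3),
                rng.standard_normal((4, 3)), np.zeros(3))
print(H.shape)  # (5, 3)
```

Because the `@` products are batched over the whole utterance, an SRU layer keeps most of the speed of a feed-forward layer while still carrying state across frames, which is what makes utterance-level generation fast in the proposed model.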
