Conference: IEEE International Work Conference on Bioinspired Intelligence

Pre-training Long Short-term Memory Neural Networks for Efficient Regression in Artificial Speech Postfiltering

Abstract

Several attempts to enhance statistical parametric speech synthesis have considered deep-learning-based postfilters, which learn a mapping from synthetic speech parameters to natural ones and thereby reduce the gap between them. In this paper, we introduce a new pre-training approach for neural networks, applied to LSTM-based postfilters for speech synthesis, with the objective of enhancing the quality of the synthesized speech more efficiently. Our approach begins with auto-regressive training of a single LSTM network, which is then used as the initialization for postfilters based on a denoising autoencoder architecture. We show the advantages of this initialization on a set of multi-stream postfilters, which comprise a collection of denoising autoencoders for the MFCC and fundamental frequency parameters of the artificial voice. Results show that the initialization lowers the training time of the LSTM networks and, in most cases, achieves better results in enhancing the statistical parametric speech than the common approach of randomly initializing the networks.
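
The two-phase scheme described in the abstract can be illustrated with a minimal PyTorch sketch. Everything here is an assumption made for illustration and not the authors' implementation: the class `LSTMRegressor`, the helper `train`, the layer sizes, the optimizer, and the random placeholder tensors standing in for one parameter stream are all hypothetical. In particular, the abstract does not define "auto-regressive training"; the sketch reads it as next-frame prediction on a single stream.

```python
import torch
from torch import nn


class LSTMRegressor(nn.Module):
    """Hypothetical frame-wise regressor: one LSTM layer plus a linear output."""

    def __init__(self, n_features, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, n_features)

    def forward(self, x):
        hidden, _ = self.lstm(x)   # (batch, frames, hidden_size)
        return self.out(hidden)    # (batch, frames, n_features)


def train(model, inputs, targets, epochs=5, lr=1e-2):
    """Full-batch MSE regression; returns the loss recorded at each epoch."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    losses = []
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


torch.manual_seed(0)
n_features = 4  # placeholder stream width, not taken from the paper

# Random placeholder data: 8 utterances of 20 frames each.
natural = torch.randn(8, 20, n_features)
synthetic = natural + 0.3 * torch.randn_like(natural)

# Phase 1 (assumed reading of "auto-regressive"): predict frame t+1 from frames up to t.
pretrained = LSTMRegressor(n_features)
train(pretrained, natural[:, :-1], natural[:, 1:])

# Phase 2: start the postfilter from the pre-trained weights instead of random
# ones, then train it to map synthetic parameters to natural ones.
postfilter = LSTMRegressor(n_features)
postfilter.load_state_dict(pretrained.state_dict())
losses = train(postfilter, synthetic, natural)
```

In a multi-stream setting, one such postfilter would be built per parameter stream (for example, one for MFCC and one for fundamental frequency), each started from pre-trained weights of matching shape. Comparing the `losses` curve against that of a randomly initialized `LSTMRegressor` is one way to examine the training-time effect the abstract reports.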
