Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis



Abstract

This paper proposes a novel approach for directly modeling speech at the waveform level using a neural network. The approach uses the neural network-based statistical parametric speech synthesis framework with a specially designed output layer. Because acoustic feature extraction is integrated into acoustic model training, it can overcome the limitations of conventional approaches, such as two-step (feature extraction and acoustic modeling) optimization, the use of spectra rather than waveforms as targets, the use of overlapping and shifting frames as the unit, and a fixed decision tree structure. Experimental results show that the proposed approach can directly maximize the likelihood defined in the waveform domain.
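
The abstract does not spell out the output layer or the exact form of the waveform-domain likelihood. As a purely illustrative sketch, assume a waveform segment is modeled as a zero-mean Gaussian vector whose covariance is a symmetric Toeplitz matrix built from an autocorrelation sequence. The function name `waveform_log_likelihood` and the `autocorr` parameter are hypothetical, and this modeling choice is an assumption rather than the paper's stated method. Under those assumptions, a likelihood defined directly on waveform samples could be evaluated as follows:

```python
import math

def waveform_log_likelihood(x, autocorr):
    """Log-density of waveform segment x under N(0, Sigma).

    Assumption (not from the abstract): Sigma[i][j] = autocorr[|i - j|],
    i.e. a stationary zero-mean Gaussian model of the samples.
    autocorr must contain at least len(x) lags.
    """
    n = len(x)
    sigma = [[autocorr[abs(i - j)] for j in range(n)] for i in range(n)]

    # Cholesky factorization Sigma = L L^T (L lower triangular).
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sigma[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]

    # Forward substitution solves L z = x, so z.z equals x^T Sigma^{-1} x.
    z = []
    for i in range(n):
        z.append((x[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])

    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    quad = sum(v * v for v in z)
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)
```

In a training setup of the kind the abstract describes, a network would predict the parameters that determine the covariance, and the negative of this quantity would serve as the loss, so that the waveform samples themselves, rather than pre-extracted spectral features, are the training targets.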
