Journal: Circuits, Systems, and Signal Processing

Parameterization of Excitation Signal for Improving the Quality of HMM-Based Speech Synthesis System



Abstract

This paper proposes a new approach to parameterizing the excitation signal in order to improve the quality of an HMM-based speech synthesis system. The proposed method aims to model the excitation (residual) signal by separating regions of the residual signal according to their perceptual importance. First, the characteristics of the residual signal around the glottal closure instant (GCI) are studied using principal component analysis (PCA). Based on this study and on previous literature (Adiga and Prasanna in Proceedings of Interspeech, pp 1677-1681, 2013; Cabral in Proceedings of Interspeech, pp 1082-1086, 2013), the segment of the residual signal around the GCI, which carries perceptually important information, is treated as the deterministic component, and the remaining part of the residual signal is treated as the noise component. The deterministic component is compactly represented using PCA coefficients (with about 95% accuracy), and the noise component is parameterized in terms of spectral and amplitude envelopes. The proposed excitation modeling approach is incorporated into the HMM-based speech synthesis system. Subjective evaluation results show a significant improvement in the quality of speech synthesized by the proposed method, for both female and male speakers, compared to three existing excitation modeling methods. Accurate parameterization of the segment of the residual signal around the GCI resulted in the improved quality of the synthesized speech. Synthesized speech samples of the proposed and existing source models are available online at http://www.sit.iitkgp.ernet.in/~ksrao/parametric-hts/pcd-hts.html.
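The deterministic part of this scheme, representing GCI-centered residual segments by a small number of PCA coefficients, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the residual signal and the GCI locations are already available, it interprets "about 95% accuracy" as retaining 95% of the segment variance (an assumption), and all function names, the window half-width, and the synthetic test signal are hypothetical. The noise component (spectral and amplitude envelopes) is not covered.

```python
import numpy as np


def extract_gci_segments(residual, gci_indices, half_width):
    """Cut fixed-length residual windows centered on each GCI."""
    segments = [
        residual[gci - half_width : gci + half_width + 1]
        for gci in gci_indices
        if gci - half_width >= 0 and gci + half_width + 1 <= len(residual)
    ]
    return np.asarray(segments, dtype=float)


def fit_pca(segments, variance_to_keep=0.95):
    """Return the mean segment and the fewest principal axes whose
    cumulative explained variance reaches `variance_to_keep`.
    The 0.95 default is an assumed reading of "about 95% accuracy"."""
    mean = segments.mean(axis=0)
    # Rows of `vt` are the principal axes, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(segments - mean, full_matrices=False)
    explained = np.cumsum(singular_values**2) / np.sum(singular_values**2)
    n_components = min(int(np.searchsorted(explained, variance_to_keep)) + 1, len(vt))
    return mean, vt[:n_components]


def encode(segments, mean, axes):
    """Project one segment or a batch of segments onto the retained axes."""
    return (segments - mean) @ axes.T


def decode(coeffs, mean, axes):
    """Rebuild segments from their PCA coefficients."""
    return mean + coeffs @ axes


# Synthetic stand-in for a residual signal (hypothetical data): two fixed
# pulse shapes mixed with random weights at evenly spaced "GCIs", plus
# low-level noise between them.
rng = np.random.default_rng(0)
half_width = 20
t = np.arange(-half_width, half_width + 1)
shapes = np.stack([np.exp(-0.5 * (t / 2.0) ** 2), t * np.exp(-0.5 * (t / 4.0) ** 2)])
residual = 0.01 * rng.standard_normal(8000)
gci_indices = np.arange(100, 7900, 80)
for gci in gci_indices:
    residual[gci - half_width : gci + half_width + 1] += rng.standard_normal(2) @ shapes

segments = extract_gci_segments(residual, gci_indices, half_width)
mean, axes = fit_pca(segments)
coeffs = encode(segments, mean, axes)
reconstructed = decode(coeffs, mean, axes)
print(axes.shape[0], "components retained for", segments.shape[1], "samples per segment")
```

In a synthesis system of the kind described, the per-segment coefficients (here `coeffs`) would be the quantities modeled statistically, and `decode` would regenerate the deterministic pulse at synthesis time. How the paper actually integrates these into HMM training is not stated in the abstract.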
