Conference: International Conference on Statistical Language and Speech Processing

Residual-Based Excitation with Continuous F0 Modeling in HMM-Based Speech Synthesis

Abstract

In statistical parametric speech synthesis, creaky voice can cause disturbing artifacts. The reason is that standard pitch tracking algorithms tend to measure F0 erroneously in regions of creaky voice. This pattern is learned during the training of hidden Markov models (HMMs). In the synthesis phase, false voiced/unvoiced decisions caused by creaky voice result in audible quality degradation. To eliminate this phenomenon, we use a simple continuous F0 tracker that does not apply a strict voiced/unvoiced decision. In the proposed residual-based vocoder, Maximum Voiced Frequency (MVF) is used for mixed voiced and unvoiced excitation. As all parameters of the vocoder are continuous, Multi-Space Distribution (MSD) is not necessary when training the HMMs, which has been shown to be advantageous. Artifacts caused by creaky voice are eliminated with this speech synthesis system. A subjective listening test on English utterances showed an improvement over traditional excitation.
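
As a rough illustration of the mixed-excitation idea mentioned in the abstract, the sketch below builds one frame whose spectrum is periodic below the Maximum Voiced Frequency and noise-like above it. It is not the authors' vocoder: a plain impulse train stands in for the residual-based voiced excitation, the brick-wall spectral split is an assumption made for brevity, and the function name `mixed_excitation` and all of its parameters are hypothetical.

```python
import numpy as np


def mixed_excitation(f0_hz, mvf_hz, fs=16000, n=1024, seed=0):
    """Return one frame of mixed excitation (hypothetical helper).

    Below mvf_hz the frame follows an impulse train at f0_hz;
    above mvf_hz it follows white Gaussian noise.
    """
    rng = np.random.default_rng(seed)

    # Voiced component: unit impulses spaced one pitch period apart.
    # (Assumption: stands in for the residual-based excitation.)
    period = int(round(fs / f0_hz))
    pulses = np.zeros(n)
    pulses[::period] = 1.0

    # Unvoiced component: white Gaussian noise.
    noise = rng.standard_normal(n)

    # Split the spectrum at the MVF with an ideal (brick-wall) mask.
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    below_mvf = freqs <= mvf_hz
    spectrum = np.where(below_mvf, np.fft.rfft(pulses), np.fft.rfft(noise))

    return np.fft.irfft(spectrum, n=n)


# Example: 100 Hz pitch, harmonics kept up to 4 kHz, noise above.
frame = mixed_excitation(f0_hz=100.0, mvf_hz=4000.0)
```

Because both F0 and MVF are ordinary continuous numbers in this sketch, no separate voiced/unvoiced flag is needed: lowering `mvf_hz` simply makes the frame noisier.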
