Journal: IEEE Transactions on Audio, Speech, and Language Processing

Improved Prosody Generation by Maximizing Joint Probability of State and Longer Units



Abstract

Current state-of-the-art hidden Markov model (HMM)-based text-to-speech (TTS) systems can produce highly intelligible synthesized speech with decent segmental quality. However, the prosody, especially at the phrase or sentence level, still tends to be bland. This blandness is partly because the state-based HMM is inadequate for capturing global, hierarchical suprasegmental information in speech signals. In this paper, to improve TTS prosody, longer units are first explicitly modeled with appropriate parametric distributions. The resulting models are then integrated with the state-based baseline models to generate better prosody by maximizing the joint probability. Experimental results in both Mandarin and English show consistent improvements over our baseline system, which uses only a state-based prosody model. The improvements are both objectively measurable and subjectively perceivable.
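
The idea of combining a state-level model with a longer-unit model can be illustrated with a toy example. The sketch below is not the paper's formulation: it assumes, purely for illustration, that each state has a Gaussian over one prosodic value (for example log-F0), that the longer unit has a Gaussian over the mean of those values, and that the two log-probabilities are combined with a hypothetical `weight`. Under those assumptions the joint log-probability is quadratic, so its maximizer is the solution of a linear system. The function name and all parameters are hypothetical.

```python
import numpy as np


def generate_contour(state_mean, state_var, unit_mean, unit_var, weight=1.0):
    """Maximize an assumed joint log-probability of state and longer-unit models.

    Objective (illustrative assumption, not taken from the paper):
        sum_i log N(x_i; state_mean_i, state_var_i)
        + weight * log N(mean(x); unit_mean, unit_var)
    """
    mu = np.asarray(state_mean, dtype=float)
    prec = 1.0 / np.asarray(state_var, dtype=float)  # per-state precisions
    n = mu.size
    a = np.full(n, 1.0 / n)  # averaging vector: a @ x == mean(x)

    # Setting the gradient of the quadratic objective to zero gives A x = b.
    A = np.diag(prec) + (weight / unit_var) * np.outer(a, a)
    b = prec * mu + (weight / unit_var) * unit_mean * a
    return np.linalg.solve(A, b)


if __name__ == "__main__":
    # Three states whose means average 2.0; the longer unit prefers a mean of 4.0.
    x = generate_contour([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], unit_mean=4.0, unit_var=1.0)
    print(x, x.mean())
```

With `weight=0` the result equals the state means, which corresponds to a state-only baseline in this toy setting. With a positive weight the whole contour shifts toward the longer-unit target while keeping the state-level shape, which is the qualitative effect the abstract describes.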
