Conference: 2010 7th International Symposium on Chinese Spoken Language Processing

Modeling prosody patterns for Chinese expressive text-to-speech synthesis

Abstract

This paper proposes an approach to modeling the prosody patterns of acoustic features for Chinese expressive text-to-speech (TTS) synthesis. Based on the observation that a speaker usually tends to put more emphasis on one particular syllable within a multi-syllabic prosodic word, we identify this syllable as the core syllable, which can be derived from the semantic stress and tone information of the text prompt. We then classify the syllables in speech into four classes based on their relations with the core syllable in a prosodic word. We analyze contrastive (neutral versus expressive) speech recordings for each of the four classes and develop a perturbation model that takes the prosody pattern into account to transform neutral speech into expressive speech. Perceptual experiments on both neutral speech recordings and neutral TTS outputs, involving 13 subjects, indicate that the proposed approach can significantly enhance the expressivity of the synthesized speech.