Conference: 2012 Annual IEEE India Conference

Clustering of duration patterns in speech for Text-to-Speech Synthesis

Abstract

Synthesizing natural-sounding speech is the greatest challenge in a Text-to-Speech Synthesis (TTS) system. In natural speech, duration, intensity and pitch vary dynamically, and this variation is perceived as the rhythm, or prosody, of speech. If these variations are not recreated, the synthesized speech sounds robotic. The quality of synthesized speech therefore depends on how well duration and intonation patterns are imposed on the speech segments. The best way to improve naturalness is to mimic the way the human brain imposes rhythm. We speak in a particular style by varying the duration of speech segments in words and phrases according to specific duration patterns. The brain may retrieve the corresponding patterns at the time of speaking in order to generate discourse in a particular style (news reading, Bible reading, storytelling, etc.). The main objective of this work is to investigate the existence of duration patterns in natural speech using cluster analysis. Speech uttered in Malayalam, an Indian language, was taken for analysis. Cluster analysis was performed on isolated words as well as on words and phrases in continuous speech. The clustering results, examined using silhouette plots, showed the existence of duration patterns in speech.
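As a rough illustration of the kind of analysis the abstract describes, the sketch below clusters word-level duration vectors and scores the result with the mean silhouette coefficient. The abstract does not say which clustering algorithm or features were used, so everything here is an assumption: k-means with a farthest-point initialization stands in for the clustering method, each word is represented by the durations (in milliseconds) of three syllables, and the `durations` values are invented for the example rather than taken from the paper.

```python
from math import dist  # Euclidean distance, Python 3.8+

def kmeans(points, k, iters=50):
    """Lloyd's k-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def mean_silhouette(points, labels):
    """Mean silhouette coefficient over all points (range -1 to 1)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1 or len(clusters) < 2:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical data: syllable durations (ms) for eight three-syllable words.
# The first four lengthen the final syllable, the last four the first one.
durations = [
    (82, 118, 205), (78, 125, 198), (85, 121, 210), (80, 115, 195),
    (201, 122, 79), (196, 117, 84), (207, 126, 81), (199, 120, 77),
]

scores = {k: mean_silhouette(durations, kmeans(durations, k)) for k in (2, 3)}
for k, s in scores.items():
    print(f"k={k}: mean silhouette = {s:.3f}")
```

A mean silhouette close to 1 indicates compact, well-separated clusters, which is the sort of evidence a silhouette plot would give for distinct duration patterns. Comparing the score across several values of `k` is one common way to choose the number of patterns.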
