Clustering of duration patterns in speech for Text-to-Speech Synthesis

Abstract

Synthesizing natural-sounding speech is the greatest challenge in a Text-to-Speech Synthesis (TTS) system. In natural speech, duration, intensity, and pitch vary dynamically, and this variation is perceived as the rhythm, or prosody, of speech. If these variations are not recreated, the synthesized speech sounds robotic. The quality of synthesized speech depends on how well duration and intonation patterns are imposed on the speech segments. The best way to improve naturalness is to mimic the way the human brain imposes rhythm. We speak in a particular style by varying the duration of speech segments in words and phrases according to specific duration patterns. The brain may retrieve the corresponding patterns at the time of speaking to generate discourse in a particular style (news reading, Bible reading, storytelling, etc.). The main objective of this work is to investigate the existence of duration patterns in natural speech using cluster analysis. Speech in Malayalam, an Indian language, was analyzed. Cluster analysis was performed on isolated words, as well as on words and phrases in continuous speech. The clustering results, examined using silhouette plots, showed the existence of duration patterns in speech.
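
The kind of analysis the abstract describes can be illustrated with a minimal sketch: k-means clustering of duration vectors, followed by the per-point silhouette coefficients that a silhouette plot displays. This is not the authors' implementation. The feature representation (one tuple of syllable durations per word), the duration values, and the names `kmeans`, `silhouette_values`, and `durations` are all assumptions made for illustration.

```python
import random
from math import dist


def kmeans(points, k, max_iters=100, seed=0):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k of the data points
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(centroids[c])  # leave an empty cluster's centroid alone
        if updated == centroids:  # no centroid moved: converged
            break
        centroids = updated
    return labels, centroids


def silhouette_values(points, labels):
    """Silhouette coefficient s = (b - a) / max(a, b) for every point."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    values = []
    for i, p in enumerate(points):
        # Mean distance from p to the members of each cluster, excluding p itself.
        mean_dist = {}
        for c in clusters:
            d = [dist(p, q) for j, q in enumerate(points) if labels[j] == c and j != i]
            mean_dist[c] = sum(d) / len(d) if d else None
        a = mean_dist[labels[i]]  # cohesion: own cluster
        if a is None:
            values.append(0.0)  # usual convention for a single-member cluster
            continue
        b = min(v for c, v in mean_dist.items() if c != labels[i])  # nearest other cluster
        denom = max(a, b)
        values.append((b - a) / denom if denom else 0.0)
    return values


# Hypothetical syllable durations in milliseconds for three-syllable words.
# These numbers are invented for illustration and are not data from the paper.
durations = [
    (120, 80, 200), (125, 85, 195), (118, 78, 205), (122, 82, 198),
    (200, 150, 90), (205, 145, 95), (195, 155, 88), (198, 152, 92),
]
labels, centroids = kmeans(durations, k=2)
scores = silhouette_values(durations, labels)
print("labels:", labels)
print("mean silhouette: %.3f" % (sum(scores) / len(scores)))
```

Silhouette values near 1 indicate compact, well-separated clusters, which is the kind of evidence the abstract cites for the existence of duration patterns. Values near 0 or below would suggest that no clear grouping exists for the chosen number of clusters.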

Bibliographic details

Source
2012 Annual IEEE India Conference | 2012 | pp. 1122-1127 | 6 pages
Conference location: Kochi (IN)
Authors
Sreelekshmi K.S.; Gopinath Deepa P.;
Author affiliation

Department of Electronics and Communication, College of Engineering, Trivandrum, Kerala, India - 695017;

Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
Speech synthesis; cluster analysis; duration models; k-means clustering; silhouette plot;

Similar literature

1. Optimal state duration assignment in hidden Markov model-based text-to-speech synthesis system [J]. Khan Najeeb Ullah, Jung-Chul Lee. Electronics Letters. 2015, No. 12
2. Durational Evidence for Syllable Boundary of /n/ and /1/ in Text-to-Speech Synthesis [J]. Fang Tian. Journal of Multimedia. 2013, No. 2
3. A fuzzy decision tree-based duration model for Standard Yorùbá text-to-speech synthesis [J]. Odetunji A. Odejobi, Shun Ha Sylvia Wong, Anthony J. Beaumont. Computer Speech and Language. 2007, No. 2
4. Acoustic Durational Properties of Sonorant as Syllable Boundaries in Text-to-Speech Synthesis [C]. Tian Fang. International Conference on Green Communications and Networks. 2013
5. Investigating the patterns of text-to-speech software use by adolescent struggling readers: An embedded multiple case study [D]. Takahashi, Kiriko. 2015
6. Fixed temporal patterns in children’s speech despite variable vowel durations [O]. Melissa A. Redford, Grace E. Oh
7. Bayesian modelling of vowel segment duration for text-to-speech synthesis using distinctive features [O]. Goubanova Olga V. 2003
