Minimum Text Corpus Selection for Limited Domain Speech Synthesis


Abstract

This paper concerns limited domain TTS systems based on the concatenative method and presents an algorithm that extracts a minimal domain-oriented text corpus from real data of a given domain while still reaching maximum coverage of that domain. The proposed approach ensures that the smallest amount of text is extracted, containing the most common phrases and (possibly) all the words from the domain. At the same time, it keeps appropriate phrase overlaps, so that smooth concatenation points can be found in the overlapped regions and high-quality synthesized speech can be achieved. In addition, several recommendations that allow a speaker to record the corpus more fluently and comfortably are presented and discussed. The corpus building is tested and evaluated on several domains differing in size and nature; the authors present the results of the algorithm and demonstrate the advantages of using a domain-oriented corpus for speech synthesis.
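
The abstract does not spell out the selection algorithm, so the following is only a minimal sketch of the general idea, assuming a plain greedy set-cover heuristic rather than the authors' actual method. Each sentence is reduced to its words plus overlapping word bigrams (a stand-in for the paper's "text chunks"), and sentences are picked until every chunk seen in the domain data is covered. The function names `chunks` and `select_corpus`, the bigram size, and the frequency-based score are all illustrative assumptions.

```python
from collections import Counter


def chunks(sentence, n=2):
    """Return the sentence's words plus its overlapping word n-grams."""
    words = sentence.lower().split()
    grams = {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    return {(w,) for w in words} | grams


def select_corpus(sentences, n=2):
    """Greedily pick sentences until every chunk in the data is covered."""
    sentence_chunks = [chunks(s, n) for s in sentences]
    # Number of sentences containing each chunk: common phrases weigh more.
    freq = Counter(c for cs in sentence_chunks for c in cs)
    uncovered = set(freq)
    selected = []
    while uncovered:
        # Score = summed frequency of the still-uncovered chunks a sentence adds.
        best = max(
            range(len(sentences)),
            key=lambda i: sum(freq[c] for c in sentence_chunks[i] & uncovered),
        )
        gain = sentence_chunks[best] & uncovered
        if not gain:  # defensive: cannot happen while chunks remain uncovered
            break
        selected.append(sentences[best])
        uncovered -= gain
    return selected


domain_data = [
    "the train to prague departs at five",
    "the train to brno departs at six",
    "the train to prague departs at six",
    "the train to brno departs at five",
]
corpus = select_corpus(domain_data)
```

In this toy domain the first two sentences already contain every word and every bigram of all four, so the other two add nothing new. Adjacent bigrams share a word, which is a rough analogue of the phrase overlap the abstract relies on for finding smooth concatenation points.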


Bibliographic details

Source
International Conference on Text, Speech and Dialogue | 2014 | 10 pages
Authors
Marketa Juzova; Daniel Tihelka
Original format: PDF
Chinese Library Classification: TP391.1-53
Keywords
Limited domain speech synthesis; Concatenative speech synthesis; Text corpus; Speech units; Text chunks; Unit concatenation


Related literature


1. Gradient-Descent Based Unit-Selection Optimization Algorithm Used for Corpus-Based Text-to-Speech Synthesis [J]. Matej Rojc, Zdravko Kacic. Applied Artificial Intelligence. 2011, No. 5a7
2. Emilia: a speech corpus for Argentine Spanish text to speech synthesis [J]. Torres Humberto M., Gurlekian Jorge A., Evin Diego A. Language Resources and Evaluation. 2019, No. 3
3. Minimum Text Corpus Selection for Limited Domain Speech Synthesis [C]. Marketa Juzova, Daniel Tihelka. International Conference on Text, Speech and Dialogue. 2014
4. Hidden Markov models for visual speech synthesis in limited data environments [D]. Arb, Harold Allan. 2001
5. A fine-grained Chinese word segmentation and part-of-speech tagging corpus for clinical text [O]. Ying Xiong, Zhongmin Wang, Dehuan Jiang. 2019
6. Recording and Annotation of Speech Corpus for Czech Unit Selection Speech Synthesis [O]. Jan Romportl. 2008
