International Journal of Speech Technology

Speech Database Design for a Concatenative Text-to-Speech Synthesis System for Individuals with Communication Disorders

Abstract

ATR's CHATR is a corpus-based text-to-speech (TTS) synthesis system that selects concatenation units from a natural speech database. This approach makes it possible to create a voice output communication aid (VOCA) from the voices of individuals who anticipate losing phonatory function, so that they can keep communicating in their own voice after vocal loss. This paper reports a case study on developing a VOCA from recordings of Japanese read speech (i.e., oral reading) by an individual with amyotrophic lateral sclerosis (ALS). In addition to using the individual's speech, we designed a speech database that could reproduce the characteristics of natural utterances in both general and specific situations. We created three Japanese speech corpora for synthesizing ordinary daily speech (i.e., in a normal speaking style): (1) a phonetically balanced sentence set, to ensure that the system could synthesize all speech sounds; (2) readings of manuscripts written by the same individual, for synthesizing regularly given talks, serving as a source of natural intonation, articulation, and voice quality; and (3) words and short phrases, to provide daily vocabulary entries for reproducing natural utterances in predictable situations. By combining one or more of these corpora, we created four kinds of source database for CHATR synthesis. Using each source database, we synthesized speech from six test sentences. We chose among the source databases by inspecting the units selected for the synthesized speech and by running perceptual experiments in which the speech was presented to 20 native speakers of Japanese. Based on the results of both the observations and the evaluations, we selected the source database compiled from all corpora. Incorporating CHATR, the selected source database, and an input acceleration function, we developed a VOCA for the individual to use in his daily life. We also created emotional speech source databases that can be loaded into the VOCA separately, in addition to the compiled speech database.