Foreign-language journal: Soil mechanics and foundation engineering

A corpus-based speech synthesis system with emotion

Abstract

We propose a new approach to synthesizing emotional speech with a corpus-based concatenative speech synthesis system (ATR CHATR) using corpora of emotional speech. In this study, neither emotion-dependent prosody prediction nor signal processing per se is performed for emotional speech. Instead, a large speech corpus is created per emotion, and speech with the appropriate emotion is synthesized by simply switching between the emotional corpora. This is made possible by the normalization procedure incorporated in CHATR, which transforms its standard predicted prosody range according to the source database in use. We evaluate our approach by creating three emotional speech corpora (anger, joy, and sadness) from recordings of a male and a female speaker of Japanese. The acoustic characteristics of each corpus are different and the emotions are identifiable. The acoustic characteristics of each emotional utterance synthesized by our method show clear correlations to those of the corresponding corpus. Perceptual experiments using synthesized speech confirmed that our method can synthesize recognizably emotional speech. We further evaluated the intelligibility of the synthesized speech and the overall impression it gives to listeners. The results show that the proposed method can synthesize speech with high intelligibility and gives a favorable impression. With these encouraging results, we have developed a workable text-to-speech system with emotion to support the immediate needs of nonspeaking individuals. This paper describes the proposed method, the design and acoustic characteristics of the corpora, and the results of the perceptual evaluations.
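
The corpus switching and prosody normalization described in the abstract can be illustrated with a small sketch. This is not CHATR's implementation: it assumes the predicted prosody is an emotion-independent F0 contour expressed as z-scores, and that each corpus is summarized by an F0 mean and standard deviation. The `CorpusStats` type, the function names, and every numeric value are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorpusStats:
    """Hypothetical per-corpus F0 statistics in Hz (illustrative values only)."""
    f0_mean: float
    f0_std: float


# Assumed statistics for illustration; the paper's actual values are not given here.
CORPORA = {
    "anger": CorpusStats(f0_mean=150.0, f0_std=35.0),
    "joy": CorpusStats(f0_mean=170.0, f0_std=40.0),
    "sadness": CorpusStats(f0_mean=120.0, f0_std=15.0),
}


def to_corpus_range(z_contour, stats):
    """Rescale a standardized (z-score) F0 contour to one corpus's F0 range."""
    return [stats.f0_mean + z * stats.f0_std for z in z_contour]


def prosody_targets(z_contour, emotion):
    """Select a corpus by emotion name and return F0 targets in Hz.

    The predicted contour is the same for every emotion; only the corpus
    statistics used for rescaling change.
    """
    return to_corpus_range(z_contour, CORPORA[emotion])


if __name__ == "__main__":
    contour = [-1.0, 0.0, 1.0]  # emotion-independent prediction (assumed z-scores)
    for emotion in CORPORA:
        print(emotion, prosody_targets(contour, emotion))
```

Under these assumptions the same predicted contour yields different F0 targets for each emotion, and unit selection would then draw only from the selected corpus. That matches the idea of switching corpora without emotion-dependent prosody prediction.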