Electronics and Communications in Japan. Part 2, Electronics

Multiple-Prosody Speech Databases and Their Effectiveness in High-Quality Speech Synthesis at Arbitrary Rates




This paper discusses a method of high-quality speech synthesis in which the speech rate can be controlled in various ways. When the prosody is adjusted by the PSOLA method or by the synthesis-by-analysis method in the waveform segment connection process, the quality declines as the extent of modification increases. To deal with this problem, this paper proposes a method that uses a separate speech database for each speech rate, which reduces the modification of segment duration and alleviates quality degradation. The proposed method has the following features. (1) Synthesized speech with the target speech rate is produced for each utterance, and is recorded. (2) Speech databases of the same text at different speech rates are constructed. In this study, speech databases at three different speech rates (fast, medium, and slow) were acquired. Speech at two different speech rates (fast and slow) was synthesized both by using the acquired speech databases and by the conventional method (using a speech database at the standard speech rate). Listening experiments showed that the proposed method can synthesize higher-quality speech than the conventional method. When speech databases with different speech rates are combined, there is a danger that the speech quality may be degraded due to differences in voice quality among the databases. The effect of these voice quality differences was investigated in a listening experiment, and was found to be within the tolerable range.
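The core idea of the abstract, choosing a database recorded near the target rate so that segment durations need less modification, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and does not come from the paper: the `RateDatabase` type, the rate values (in arbitrary units such as morae per second), and the use of the absolute log of the duration scaling factor as the measure of modification extent.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RateDatabase:
    """Hypothetical description of one speech database recorded at one rate."""
    name: str
    rate: float  # assumed nominal speech rate, e.g. morae per second


def stretch_factor(db_rate: float, target_rate: float) -> float:
    """Duration scaling needed to bring segments recorded at db_rate to target_rate.

    A value above 1 lengthens segments (slower output); below 1 shortens them.
    """
    return db_rate / target_rate


def modification_extent(db_rate: float, target_rate: float) -> float:
    """Assumed symmetric measure of how far durations must be modified."""
    return abs(math.log(stretch_factor(db_rate, target_rate)))


def select_database(databases, target_rate: float) -> RateDatabase:
    """Pick the database that needs the least duration modification."""
    return min(databases, key=lambda db: modification_extent(db.rate, target_rate))


# Illustrative rates only; the paper's actual rates are not given in the abstract.
DATABASES = [
    RateDatabase("slow", 5.0),
    RateDatabase("medium", 8.0),
    RateDatabase("fast", 11.0),
]

if __name__ == "__main__":
    for target in (5.5, 8.0, 10.5):
        chosen = select_database(DATABASES, target)
        factor = stretch_factor(chosen.rate, target)
        print(f"target={target}: use '{chosen.name}' database, stretch x{factor:.2f}")
```

Under these assumed rates, a fast target is served from the fast database with a stretch factor close to 1, whereas a conventional single-database system would have to compress medium-rate segments much more strongly.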




