Journal: IEEE Transactions on Audio, Speech, and Language Processing

Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis

Abstract

This paper presents an expressive voice conversion model (DeBi-HMM) used as a post-processing stage of a text-to-speech (TTS) system for expressive speech synthesis. DeBi-HMM is named for the duration-embedded characteristic of its two HMMs, which model the source and target speech signals, respectively. Joint estimation of the source and target HMMs is exploited for spectrum conversion from neutral to expressive speech. A gamma distribution is embedded as the duration model for each state in the source and target HMMs. Expressive style-dependent decision trees perform prosodic conversion. The STRAIGHT algorithm is adopted for analysis and synthesis. A set of small-sized speech databases, one for each expressive style, is designed and collected to train the DeBi-HMM voice conversion models. Several experiments with statistical hypothesis testing are conducted to evaluate the quality of the synthetic speech as perceived by human subjects. Compared with previous voice conversion methods, the proposed method exhibits encouraging potential in expressive speech synthesis.
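The abstract mentions a gamma distribution as the per-state duration model. As a rough illustration of that idea only, the sketch below fits a gamma distribution to the observed durations of a single HMM state and scores candidate durations under it. The function names, the toy duration values, and the use of moment matching are assumptions made for this example; the abstract does not say how the paper estimates the duration parameters.

```python
import math


def fit_gamma_moments(durations):
    """Estimate gamma (shape, scale) from state durations by moment matching.

    Assumption: moment matching is used here for simplicity; it is not
    necessarily the estimator used in the paper.
    """
    n = len(durations)
    mean = sum(durations) / n
    var = sum((d - mean) ** 2 for d in durations) / n
    if var <= 0:
        raise ValueError("durations must not all be identical")
    shape = mean * mean / var  # k = mean^2 / variance
    scale = var / mean         # theta = variance / mean
    return shape, scale


def gamma_log_pdf(d, shape, scale):
    """Log-density of a gamma distribution at duration d (in frames)."""
    if d <= 0:
        return float("-inf")
    return ((shape - 1) * math.log(d) - d / scale
            - math.lgamma(shape) - shape * math.log(scale))


# Hypothetical frame counts spent in one state across training utterances.
durations = [3, 4, 5, 4, 6, 5, 4, 5]
shape, scale = fit_gamma_moments(durations)

# A duration near the training mean should score higher than an outlier.
typical_score = gamma_log_pdf(4.5, shape, scale)
outlier_score = gamma_log_pdf(10, shape, scale)
```

In a duration-embedded HMM, a score of this kind would be combined with the state's observation likelihood during decoding, so that implausibly short or long state occupancies are penalized.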