Conference: Asia Modelling Symposium; Asia International Conference on Mathematical Modelling and Computer Simulation

Comprehensive Voice Conversion Analysis Based on DGMM and Feature Combination



Abstract

A voice conversion system modifies a source speaker's voice so that it is perceived as having been uttered by another (target) speaker, and such systems are now widely used in real applications. However, most research focuses on a single aspect of voice conversion performance; theoretical analysis and experimental comparison of the whole source-target conversion process are rare. This paper therefore presents a comprehensive analysis of source-target speaker voice conversion organized around three key steps: acoustic feature selection and extraction, voice conversion model construction, and target speech synthesis. On that basis, a complete and optimal source-target speaker voice conversion process is proposed. First, a comprehensive feature combination consisting of prosodic features, spectrum parameters, and spectral envelope characteristics is proposed. Then, to avoid discontinuity and spectral distortion in the converted speech, a DGMM (Dynamic Gaussian Mixture Model) that takes dynamic information between frames into account is presented. Next, for speech synthesis, the STRAIGHT synthesizer is modified to work with the feature combination. Finally, an objective comparison experiment shows that the new conversion process achieves better performance than conventional methods. In addition, a speaker recognition system is used to evaluate the converted speech, and the results show that it has higher target-speaker individuality and speech quality.
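The abstract does not give the DGMM formulation, so the following is only an illustrative sketch of one common way to expose inter-frame dynamics to a GMM-based converter: appending first-order delta (frame-difference) features to each static feature vector. The function name `add_delta`, the central-difference window, and the toy input are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def add_delta(static: np.ndarray) -> np.ndarray:
    """Append first-order delta features to a (frames, dims) matrix.

    Hypothetical helper: the paper's actual dynamic features may differ.
    """
    # Repeat the first and last frames so the central difference
    # is defined at the sequence boundaries.
    padded = np.pad(static, ((1, 1), (0, 0)), mode="edge")
    # Central difference: 0.5 * (x[t+1] - x[t-1]) for every frame t.
    delta = 0.5 * (padded[2:] - padded[:-2])
    # Each output row is [static features, delta features].
    return np.hstack([static, delta])

# Toy input: 3 frames, 2 feature dimensions (made-up values).
frames = np.array([[0.0, 1.0],
                   [1.0, 1.0],
                   [3.0, 1.0]])
augmented = add_delta(frames)
print(augmented.shape)
```

In a pipeline of this kind, the augmented source and target vectors would typically be time-aligned and stacked before fitting a joint mixture model, so that the model sees how features change between frames as well as their static values.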
