Comprehensive Voice Conversion Analysis Based on DGMM and Feature Combination


Abstract

A voice conversion system modifies a speaker's voice so that it is perceived as having been uttered by another speaker, and such systems are now widely used in real applications. However, most research focuses on only one aspect of voice conversion performance; theoretical analysis and experimental comparison of the whole source-target speaker voice conversion process are rare. Therefore, this paper presents a comprehensive analysis of source-target speaker voice conversion based on three key steps: acoustic feature selection and extraction, voice conversion model construction, and target speech synthesis. On this basis, a complete and optimal source-target speaker voice conversion process is proposed. First, a comprehensive feature combination consisting of prosodic features, spectrum parameters, and spectral envelope characteristics is proposed. Then, to avoid discontinuity and spectral distortion in the converted speech, a DGMM (Dynamic Gaussian Mixture Model) that considers dynamic information between frames is presented. Subsequently, for speech synthesis, the STRAIGHT synthesizer is modified to work with the feature combination. Finally, objective comparison experiments show that the new source-target voice conversion process performs better than conventional methods. In addition, a speaker recognition system is used to evaluate the quality of the converted speech, and the experimental results show that the converted speech has higher target-speaker individuality and speech quality.
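
The abstract does not spell out how the DGMM incorporates dynamic information between frames. One common way to expose inter-frame dynamics to a GMM-based mapping is to append first-order delta features to each static frame. The sketch below illustrates that general idea only and is not taken from the paper: the function name `append_delta`, the central-difference formula, and the edge padding are all assumptions chosen for illustration.

```python
import numpy as np


def append_delta(static):
    """Stack static features with first-order delta features.

    Hypothetical helper, not the paper's method.
    static: array of shape (T, D), one row per frame.
    Returns an array of shape (T, 2 * D).
    """
    static = np.asarray(static, dtype=float)
    # Repeat the first and last frames so every frame has two neighbours.
    padded = np.pad(static, ((1, 1), (0, 0)), mode="edge")
    # Central difference: delta[t] = 0.5 * (x[t+1] - x[t-1]).
    delta = 0.5 * (padded[2:] - padded[:-2])
    return np.hstack([static, delta])


# Four frames of a toy 2-dimensional feature (made-up values).
frames = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 3.0], [3.0, 3.0]])
augmented = append_delta(frames)
```

Each row of `augmented` holds the original static values followed by their deltas, so a model trained on these vectors sees how each frame relates to its neighbours as well as the frame itself.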


Bibliographic record

Source
Asia Modelling Symposium; Asia International Conference on Mathematical Modelling and Computer Simulation | 2014 | pp. 159-164 | 6 pages
Authors
He Pan; Yangjie Wei; Nan Guan; Yi Wang
Original format
PDF
Keywords
DGMM; STRAIGHT synthesis; feature combination; speaker recognition; voice conversion

