Research on Voice Conversion Methods Based on Latent Variable Models

Sun Xinjian (孙新建); Zhang Xiongwei (张雄伟); Yang Jibin (杨吉斌); Cao Tieyong (曹铁勇); Sun Jian (孙健)

Abstract

Conventional voice conversion learns a mapping from the acoustic features of a source speaker to those of a target speaker, an approach that is prone to over-smoothing and over-fitting. This paper proposes a new strategy for voice conversion based on separating style from content, formulated as a two-factor Latent Variable Model (LVM). First, a generative model of style and content is built using an LVM with two low-dimensional latent factors; the interaction between the two factors is captured by a set of basis mapping functions that relate the low-dimensional latent spaces to the high-dimensional observation space. Second, by fitting the model with maximum likelihood, the observed speech spectra are decomposed into a style factor that represents speaker identity and a content factor that represents phonetic information, and the model parameters are estimated at the same time. Finally, the converted speech is reconstructed from the target speaker's style and the source speaker's phonetic content, using the learned model as a prior. Objective and subjective tests show that, on the same limited-size training set, the proposed system outperforms the traditional GMM mapping method. Further experiments show that an LVM with nonlinear basis mapping functions is preferable to the Bilinear Model for voice conversion: describing the interaction between content and style nonlinearly gives better separation and higher conversion quality.
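The style substitution idea can be illustrated with the Bilinear Model that the abstract uses as a baseline. The sketch below is not the paper's nonlinear LVM. It is a minimal asymmetric bilinear model fitted by a truncated SVD on synthetic data, and it assumes parallel data (every speaker observed on every content class). All names (`fit_asymmetric_bilinear`, `convert`, the dimensions `K`, `J`, `C`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def fit_asymmetric_bilinear(Y, n_styles, rank):
    """Fit y[s, c] ~= A[s] @ b[c] by truncated SVD.

    Y stacks the K-dimensional observations of each style (speaker)
    vertically, so Y has shape (n_styles * K, n_contents).
    Returns A with shape (n_styles, K, rank) and B with shape (rank, n_contents).
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    A = (U[:, :rank] * s[:rank]).reshape(n_styles, -1, rank)
    B = Vt[:rank]
    return A, B


def convert(y_src, A_src, A_tgt):
    """Style substitution: recover the content vector under the source
    style by least squares, then render it with the target style."""
    b, *_ = np.linalg.lstsq(A_src, y_src, rcond=None)
    return A_tgt @ b


# Synthetic stand-in for spectral features (assumption: noise-free,
# exactly bilinear data, which real speech is not).
rng = np.random.default_rng(0)
K, J, C = 8, 3, 20                      # feature dim, latent dim, content classes
A_true = rng.normal(size=(2, K, J))     # two speakers (styles)
B_true = rng.normal(size=(J, C))        # shared phonetic content
Y = np.concatenate([A_true[0] @ B_true, A_true[1] @ B_true], axis=0)

A_hat, B_hat = fit_asymmetric_bilinear(Y, n_styles=2, rank=J)

# Convert an unseen content frame from speaker 0 to speaker 1.
b_new = rng.normal(size=J)
y_src = A_true[0] @ b_new
y_conv = convert(y_src, A_hat[0], A_hat[1])
err = np.max(np.abs(y_conv - A_true[1] @ b_new))
print(f"max abs conversion error: {err:.2e}")
```

On this idealized data the converted frame matches the target speaker's rendering of the same content up to numerical precision. According to the abstract, the proposed LVM replaces the linear interaction with nonlinear basis mapping functions, which this sketch does not attempt to reproduce.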

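Bibliographic record

Source
《信号处理》 (Signal Processing) | 2012, No. 3 | pp. 344-351 | 8 pages
Authors
Sun Xinjian (孙新建); Zhang Xiongwei (张雄伟); Yang Jibin (杨吉斌); Cao Tieyong (曹铁勇); Sun Jian (孙健)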
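Affiliations

Institute of Communications Engineering, PLA University of Science and Technology, Nanjing, Jiangsu 210007

Institute of Command Automation, PLA University of Science and Technology, Nanjing, Jiangsu 210016

Institute of Command Automation, PLA University of Science and Technology, Nanjing, Jiangsu 210016

Institute of Command Automation, PLA University of Science and Technology, Nanjing, Jiangsu 210016

Institute of Communications Engineering, PLA University of Science and Technology, Nanjing, Jiangsu 210007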

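Original format: PDF
Text language: Chinese
CLC classification: Speech signal processing
Keywords
Voice conversion; latent variable model; content and style; separation; style substitution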
