Spectral Mapping Using Prior Re-Estimation of i-Vectors and System Fusion for Voice Conversion

Monisankha Pal; Goutam Saha

Abstract

In this paper, we propose a new voice conversion (VC) method using i-vectors, which provide a low-dimensional representation of speech utterances. We restrict i-vector variability in the intermediate computation of the total variability (T) matrix through a novel approach that applies a modified prior distribution to the intermediate i-vectors. This modification of T improves the conversion of speaker individuality. To further improve the conversion score and to better balance similarity and quality, we employ band-wise fusion of the spectrograms converted by a conventional joint density Gaussian mixture model (JDGMM) and by the i-vector based system. The fused spectrogram retains more spectral detail and leverages the complementary merits of each subsystem. Extensive objective and subjective evaluations are conducted on the CMU ARCTIC database. The results show that the proposed technique produces a better trade-off between similarity and quality scores than other state-of-the-art baseline VC methods. Furthermore, it works better than JDGMM when VC training data are limited. The proposed VC performs moderately better, in both objective and subjective terms, than the baseline VC based on a mixture of factor analyzers. In addition, it provides better-quality converted speech than maximum likelihood GMM VC with a dynamic feature constraint.
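
The abstract does not say how the band-wise fusion is weighted, so the following is only a minimal sketch of one plausible reading: each frequency band is a convex combination of the two converted spectrograms. The function name `fuse_bandwise`, the `band_edges` layout, and the per-band weights `ivec_weights` are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Spectrogram = List[List[float]]  # frames x frequency bins


def fuse_bandwise(
    spec_jdgmm: Spectrogram,
    spec_ivec: Spectrogram,
    band_edges: Sequence[int],
    ivec_weights: Sequence[float],
) -> Spectrogram:
    """Blend two converted spectrograms band by band (illustrative assumption).

    band_edges lists bin boundaries: [0, 2, 4] defines bands [0, 2) and [2, 4).
    ivec_weights[b] is the hypothetical weight of the i-vector system in band b;
    the JDGMM system receives 1 - ivec_weights[b].
    """
    if len(band_edges) != len(ivec_weights) + 1:
        raise ValueError("need exactly one weight per band")
    if band_edges[0] != 0:
        raise ValueError("band_edges must start at bin 0")
    if len(spec_jdgmm) != len(spec_ivec):
        raise ValueError("spectrograms must have the same number of frames")

    fused: Spectrogram = []
    for frame_j, frame_i in zip(spec_jdgmm, spec_ivec):
        if len(frame_j) != len(frame_i) or len(frame_j) != band_edges[-1]:
            raise ValueError("frame length must match the last band edge")
        out = list(frame_j)
        for b, w in enumerate(ivec_weights):
            # Convex combination of the two systems inside band b.
            for k in range(band_edges[b], band_edges[b + 1]):
                out[k] = (1.0 - w) * frame_j[k] + w * frame_i[k]
        fused.append(out)
    return fused
```

For example, with a single four-bin frame, `fuse_bandwise([[1.0] * 4], [[3.0] * 4], [0, 2, 4], [0.0, 0.5])` keeps the JDGMM values in the lower band and averages the two systems in the upper band, giving `[[1.0, 1.0, 2.0, 2.0]]`. How the actual method chooses bands and weights would need to be taken from the full paper.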


Bibliographic details

Source
IEEE/ACM Transactions on Audio, Speech, and Language Processing | 2017, Issue 11 | pp. 2071-2084 | 14 pages
Authors
Monisankha Pal; Goutam Saha
Affiliations

Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology, Kharagpur, Kharagpur, India (listed for both authors)

Original format: PDF
Language: English
Keywords
Speech; Estimation; Spectrogram; Covariance matrices; Training data; Speech processing; Training

Similar literature

1. Noise-Robust Voice Conversion Based on Sparse Spectral Mapping Using Non-negative Matrix Factorization [J]. Ryo AIHARA, Ryoichi TAKASHIMA, Tetsuya TAKIGUCHI. IEICE Transactions on Information and Systems, 2014, Issue 6.
2. Spectral Mapping Using Kernel Principal Components Regression for Voice Conversion [J]. Peng SONG, Li ZHAO, Yongqiang BAO. Archives of Acoustics, 2013, Issue 1.
3. Spectral Mapping Using Artificial Neural Networks for Voice Conversion [J]. Desai S., Black A. W., Yegnanarayana B. IEEE Transactions on Audio, Speech, and Language Processing, 2010, Issue 5.
4. Voice Conversion Using Spectral Mapping and TD-PSOLA [C]. Srinivasan Kannan, Pooja. R. Raju, R. Sai Surya Madhav. International Conference on Computing and Network Communications, 2020.
5. Benthic mapping of coastal waters using data fusion of hyperspectral imagery and airborne laser bathymetry [D]. Lee, Mark Patrick. 2003.
6. Data Fusion of Two Hyperspectral Imaging Systems with Complementary Spectral Sensing Ranges for Blueberry Bruising Detection [O]. Shuxiang Fan, Changying Li, Wenqian Huang. 2018.
7. Design And Evaluation Of A Voice Conversion Algorithm Based On Spectral Envelope Mapping And Residual Prediction [O]. Alexander Kain, Michael W. Macon. 2001.
