Novel approach of MFCC based alignment and WD-residual modification for voice conversion using RBF

Nirmal Jagannath; Zaveri Mukesh; Patnaik Suprava; Kachare Pramod

Abstract

A voice conversion system modifies the speaker-specific characteristics of a source speaker so that the converted speech is perceived as that of a target speaker. Speaker-specific characteristics of the speech signal are reflected at different levels, such as the shape of the vocal tract, the shape of the glottal excitation, and long-term prosody. The shape of the vocal tract is represented by Line Spectral Frequencies (LSF) and the shape of the glottal excitation by Linear Predictive (LP) residuals. In this paper, a fourth-level wavelet packet transform is applied to the LP residual to generate sixteen sub-bands. This approach not only reduces computational complexity but also presents a genuine transformation model, in contrast to state-of-the-art statistical prediction methods. In voice conversion, alignment is an essential step that aligns the features of the source and target speakers. In this paper, a warping path based on Mel Frequency Cepstrum Coefficients (MFCC) is proposed to align the LSF and the LP-residual sub-bands, using two proposed schemes: constant source alignment and constant target alignment. The conventional alignment technique is compared with both proposed schemes. Analysis shows that constant source alignment using the MFCC warping path performs slightly better than both constant target alignment and the state-of-the-art alignment approach. Generalized mapping models are developed for each sub-band using a Radial Basis Function (RBF) neural network and are compared with the Gaussian Mixture Model (GMM) mapping and the residual selection approach. Various subjective and objective evaluation measures indicate a significant performance improvement of the RBF-based residual mapping approach over the state-of-the-art approaches. (C) 2016 Elsevier B.V. All rights reserved.
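The idea of computing a warping path on one feature stream and reusing it to align another can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: it uses plain dynamic time warping (listed among the keywords) with a Euclidean frame distance, toy one-dimensional vectors stand in for MFCC frames, and string labels stand in for LSF frames. The function names `dtw_path` and `apply_path` are hypothetical, and the paper's constant source and constant target schemes are not reproduced here because the abstract does not define them.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_path(src, tgt):
    """Return a DTW warping path as (source_frame, target_frame) index pairs.

    Assumption: standard DTW with steps (1,0), (0,1), (1,1) and no slope
    constraints; the paper may use a different variant.
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    # cost[i][j] = accumulated cost of aligning src[:i] with tgt[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(src[i - 1], tgt[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end of both sequences to the start.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal
            (cost[i - 1][j], i - 1, j),          # advance source only
            (cost[i][j - 1], i, j - 1),          # advance target only
        ]
        _, i, j = min(steps)
    path.reverse()
    return path


def apply_path(path, src_frames, tgt_frames):
    """Reuse a warping path to pair frames of a second feature stream."""
    return [(src_frames[i], tgt_frames[j]) for i, j in path]


# Toy stand-ins for MFCC frames (one coefficient per frame).
src_mfcc = [[0.0], [1.0], [2.0], [3.0]]
tgt_mfcc = [[0.0], [0.0], [1.0], [2.0], [3.0]]
path = dtw_path(src_mfcc, tgt_mfcc)

# Toy stand-ins for another per-frame stream (e.g. LSF vectors).
src_lsf = ["s0", "s1", "s2", "s3"]
tgt_lsf = ["t0", "t1", "t2", "t3", "t4"]
aligned = apply_path(path, src_lsf, tgt_lsf)
print(path)
print(aligned)
```

In this toy case the target repeats its first frame, so the path maps source frame 0 to target frames 0 and 1. The same path could in principle be applied to each of the sixteen residual sub-bands, provided they are framed consistently with the MFCC stream.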

Bibliographic record

Source
Neurocomputing | 2017, May 10 issue | pp. 39-49 | 11 pages
Authors
Nirmal Jagannath; Zaveri Mukesh; Patnaik Suprava; Kachare Pramod
Author affiliations

KJ Somaiya Coll Engn, Dept Elect Engn, Bombay 400077, Maharashtra, India;

SV Natl Inst Technol, Dept Comp Engn, Surat 395007, India;

SV Natl Inst Technol, Dept Elect Engn, Surat 395007, India;

Veermata Jeejabai Inst Technol, Dept Elect Engn, Bombay 400031, Maharashtra, India;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
Dynamic time warping; Gaussian mixture model; LP-residual; Line spectral frequencies; Mel frequency cepstrum coefficient; Radial basis function; Residual selection method; Wavelet packet transform

Date added to database: 2022-08-18 02:05:58

Similar literature

1. High-quality voice conversion system based on GMM statistical parameters and RBF neural network [J]. CHEN Xian-tong, ZHANG Ling-hua. The Journal of China Universities of Posts and Telecommunications (English Edition). 2014, Issue 5
2. Non-parallel training for voice conversion using background-based alignment of GMMs and INCA algorithm [J]. Mostafa Ghorbandoost, Valiallah Saba. Signal Processing, IET. 2017, Issue 8
3. HMM-Based Maximum Likelihood Frame Alignment for Voice Conversion from a Nonparallel Corpus [J]. Ki-Seung LEE. IEICE Transactions on Information and Systems. 2017, Issue 12
4. Voice Conversion System using SVM for Vocal Tract Modification and Codebook based Model for Pitch Contour Modification [C]. R. H. Laskar, F. A. Talukdar, Rajib Bhattacharjee. IEEE Region 10 Conference. 2008
5. Hybrid organic/inorganic silicon-based sol-gel materials: A modification for scale-up conversion in anti-corrosion applications, and, A modification for in-situ synthesis of cadmium sulfide nanoparticles in optical applications [D]. Tran, Tuan Thanh. 2011
6. Modification of an RBF ANN-Based Temperature Compensation Model of Interferometric Fiber Optical Gyroscopes [O]. Jianhua Cheng, Bing Qi, Daidai Chen. 2015
7. Eigenvoice-Based Approach to Voice Conversion and Voice Quality Control [O]. Tomoki Toda. 2009
