A Voice Conversion Mapping Function based on a Stacked Joint-Autoencoder

Abstract

In this study, we propose a novel method for training a regression function and apply it to a voice conversion task. The regression function is constructed using a Stacked Joint-Autoencoder (SJAE). Previously, we have used a more primitive version of this architecture for pre-training a Deep Neural Network (DNN). Using objective evaluation criteria, we show that the lower levels of the SJAE perform best with a low degree of jointness, and higher levels with a higher degree of jointness. We demonstrate that our proposed approach generates features that do not suffer from the averaging effect inherent in back-propagation training. We also carried out subjective listening experiments to evaluate speech quality and speaker similarity. Our results show that the SJAE approach has both higher quality and higher similarity than an SJAE+DNN approach, where the SJAE is used for pre-training a DNN, and the fine-tuned DNN is then used for mapping. We also present the system description and results of our submission to Voice Conversion Challenge 2016.
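
The abstract does not spell out the training objective, so the following is only a minimal sketch of what a single joint-autoencoder layer with a tunable "degree of jointness" could look like. Everything in it is an assumption made for illustration: the tanh encoder, the linear decoder, the squared-error terms, the weight name alpha, and the helper names joint_loss, convert and random_autoencoder. It is not the authors' implementation. The idea shown is that a source and a target autoencoder each reconstruct their own time-aligned frame, while a penalty weighted by alpha pulls their hidden codes together. Conversion then encodes with the source side and decodes with the target side.

```python
import math
import random


def affine(W, b, v):
    """Compute W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def encode(W, b, v):
    """Assumed encoder: affine map followed by tanh."""
    return [math.tanh(a) for a in affine(W, b, v)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def joint_loss(x, y, src, tgt, alpha):
    """Hypothetical joint objective for one aligned frame pair (x, y).

    Two reconstruction errors plus alpha times the distance between the
    hidden codes; alpha plays the role of the degree of jointness.
    """
    h_x = encode(src["We"], src["be"], x)
    h_y = encode(tgt["We"], tgt["be"], y)
    x_hat = affine(src["Wd"], src["bd"], h_x)
    y_hat = affine(tgt["Wd"], tgt["bd"], h_y)
    return sq_dist(x, x_hat) + sq_dist(y, y_hat) + alpha * sq_dist(h_x, h_y)


def convert(x, src, tgt):
    """Map a source frame: source encoder, then target decoder."""
    return affine(tgt["Wd"], tgt["bd"], encode(src["We"], src["be"], x))


def random_autoencoder(n_in, n_hidden, rng):
    """Untrained autoencoder parameters, for demonstration only."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {"We": mat(n_hidden, n_in), "be": [0.0] * n_hidden,
            "Wd": mat(n_in, n_hidden), "bd": [0.0] * n_in}


if __name__ == "__main__":
    rng = random.Random(0)
    src = random_autoencoder(4, 3, rng)
    tgt = random_autoencoder(4, 3, rng)
    x = [0.1, -0.2, 0.3, 0.4]   # toy source feature frame
    y = [0.0, 0.5, -0.1, 0.2]   # toy time-aligned target frame
    print("loss, alpha=0:", joint_loss(x, y, src, tgt, 0.0))
    print("loss, alpha=1:", joint_loss(x, y, src, tgt, 1.0))
    print("converted frame:", convert(x, src, tgt))
```

Under these assumptions, a stacked version would train further layers of the same kind on the hidden codes of the layer below, with a separate alpha per layer. That is one way to read the finding that lower levels work best with low jointness and higher levels with higher jointness.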

Bibliographic details

Source
Annual Conference of the International Speech Communication Association | 2016 | p1532-2317 | 5 pages
Authors
Seyed Hamidreza Mohammadi; Alexander Kain
Original format: PDF
Chinese Library Classification: TB95-53
Date indexed: 2022-08-21 11:41:05

Similar literature

1. Noise-Robust Voice Conversion Based on Sparse Spectral Mapping Using Non-negative Matrix Factorization [J]. Ryo AIHARA, Ryoichi TAKASHIMA, Tetsuya TAKIGUCHI. IEICE transactions on information and systems. 2014, No. 6
2. HMM-based Speech Synthesis with Multiple Individual Voices using Exemplar-based Voice Conversion [J]. Trung-Nghia Phung. International journal of computer science and network security. 2017, No. 5
3. Interpretable parametric voice conversion functions based on Gaussian mixture models and constrained transformations [J]. Daniel Erro, Agustin Alonso, Luis Serrano. Computer speech and language. 2015, No. 1
4. A Voice Conversion Mapping Function based on a Stacked Joint-Autoencoder [C]. Seyed Hamidreza Mohammadi, Alexander Kain. Annual Conference of the International Speech Communication Association. 2016
5. Methods for rapid EEG-based mapping of human brain functions [D]. Barkan, Helen Irene. 2002
6. Arctic corridors and northern voices project: Methods for community-based participatory mapping for low impact shipping corridors in Arctic Canada [O]. Jackie Dawson, Natalie Ann Carter, Nicolien van Luijk. 2020
7. HMM-BASED SEQUENCE-TO-FRAME MAPPING FOR VOICE CONVERSION [O]. Yu Qiao, Daisuke Saito, Nobuaki Minematsu. 2015
8. Non invasive Imaging based Detection and Mapping of Brain Oxidative Stress and its Correlation with Cognitive Functions [R]. Mandal, P. 2017
