Emotional Voice Conversion Using Multitask Learning with Text-To-Speech

Abstract

Voice conversion (VC) is a task that alters the voice of a person to suit different styles while conserving the linguistic content. Previous state-of-the-art technology used in VC was based on the sequence-to-sequence (seq2seq) model, which could lose linguistic information. There was an attempt to overcome this problem using textual supervision; however, this required explicit alignment, and therefore the benefit of using seq2seq model was lost. In this study, a voice converter that utilizes multitask learning with text-to-speech (TTS) is presented. By using multitask learning, VC is expected to capture linguistic information and preserve the training stability. This method does not require explicit alignment for capturing abundant text information. Experiments on VC were performed on a male-Korean-emotional-text-speech dataset to convert the neutral voice to emotional voice. It was shown that multitask learning helps to preserve the linguistic content.

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2020 | pp. 7774-7778 | 5 pages
Authors
Tae-Ho Kim; Sungjae Cho; Shinkook Choi; Sejik Park; Soo-Young Lee
Original format: PDF
Keywords
voice conversion; text-to-speech; emotional voice conversion; multitask learning;

Similar literature

1. Non-linear frequency scale mapping for voice conversion in text-to-speech system with cepstral description [J]. Anna Pribilova, Jiri Pribil. Speech Communication, 2006, No. 12
2. Lost voices part 1: A narrative case study of two young men with learning disabilities disclosing experiences of sexual, emotional and physical abuse [J]. Digman Carmel. British journal of learning disabilities, 2021, No. 2
3. Political skill and emotional cue learning via voices: A training study [J]. Momm T., Blickle G., Liu Y. Journal of applied social psychology, 2013, No. 11
4. Emotional Voice Conversion Using Multitask Learning with Text-To-Speech [C]. Tae-Ho Kim, Sungjae Cho, Shinkook Choi, IEEE International Conference on Acoustics, Speech and Signal Processing. 2020
5. Posteriorgram-to-Acoustic Modeling for Unconstrained Voice Conversion with Deep Learning [D]. Sun, Lifa. 2017
6. Adaptive and Longitudinal Pharmaceutical Care Instruction Using an Interactive Voice Response/Text-to-Speech System [O]. Gamal Hussein, Nancy Kawahara. 2006
7. Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining [O]. Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, 2020
8. Migrating Dari Clustergen Flite Text-to-Speech Voice from Desktop to Android [R]. Lee, M. H. 2014
