Emotional Voice Conversion Using Multitask Learning with Text-To-Speech

Abstract

Voice conversion (VC) is the task of altering a person's voice to suit different styles while preserving the linguistic content. The previous state-of-the-art approach to VC was based on the sequence-to-sequence (seq2seq) model, which can lose linguistic information. An earlier attempt to overcome this problem used textual supervision; however, it required explicit alignment, so the benefit of using a seq2seq model was lost. In this study, a voice converter that uses multitask learning with text-to-speech (TTS) is presented. With multitask learning, VC is expected to capture linguistic information and maintain training stability. The method does not require explicit alignment to capture abundant text information. VC experiments were performed on a male-Korean-emotional-text-speech dataset to convert neutral voice to emotional voice. The results show that multitask learning helps preserve the linguistic content.

Bibliographic record

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2020 | pp. 7444-8063 | 5 pages
Authors
Tae-Ho Kim; Sungjae Cho; Shinkook Choi; Sejik Park; Soo-Young Lee
Original format
PDF
Chinese Library Classification
TN912-53
Keywords
voice conversion; text-to-speech; emotional voice conversion; multitask learning

Similar literature

1. Non-linear frequency scale mapping for voice conversion in text-to-speech system with cepstral description [J]. Anna Pribilova, Jiri Pribil. Speech Communication. 2006, No. 12

2. Lost voices part 1: A narrative case study of two young men with learning disabilities disclosing experiences of sexual, emotional and physical abuse [J]. Digman Carmel. British journal of learning disabilities. 2021, No. 2

3. Political skill and emotional cue learning via voices: A training study [J]. Momm T., Blickle G., Liu Y. Journal of applied social psychology. 2013, No. 11

4. Emotional Voice Conversion Using Multitask Learning with Text-To-Speech [C]. Tae-Ho Kim, Sungjae Cho, Shinkook Choi, IEEE International Conference on Acoustics, Speech and Signal Processing. 2020

5. Posteriorgram-to-Acoustic Modeling for Unconstrained Voice Conversion with Deep Learning [D]. Sun, Lifa. 2017

6. Adaptive and Longitudinal Pharmaceutical Care Instruction Using an Interactive Voice Response/Text-to-Speech System [O]. Gamal Hussein, Nancy Kawahara. 2006

7. Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining [O]. Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, 2020

8. Migrating Dari Clustergen Flite Text-to-Speech Voice from Desktop to Android [R]. Lee, M. H. 2014
