Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Emotional Voice Conversion Using Multitask Learning with Text-To-Speech

Abstract

Voice conversion (VC) is the task of altering a person's voice to suit different styles while preserving the linguistic content. The previous state-of-the-art approach to VC was based on the sequence-to-sequence (seq2seq) model, which can lose linguistic information. An earlier attempt to overcome this problem used textual supervision; however, it required explicit alignment, which forfeited the benefit of using a seq2seq model. In this study, a voice converter that uses multitask learning with text-to-speech (TTS) is presented. With multitask learning, VC is expected to capture linguistic information while maintaining training stability. The method does not require explicit alignment to capture abundant text information. VC experiments were performed on a male Korean emotional text-speech dataset, converting neutral speech to emotional speech. The results show that multitask learning helps to preserve the linguistic content.
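
The abstract does not give the model architecture or the loss function, so the following is only a minimal sketch of the general multitask idea: a VC branch and a TTS branch are trained jointly against the same target speech, and their losses are summed. The L1 reconstruction loss, the `tts_weight` parameter, and all function names are assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence

# A "frame" stands in for one spectrogram frame (hypothetical representation).
Frame = Sequence[float]


def l1_loss(pred: Sequence[Frame], target: Sequence[Frame]) -> float:
    """Mean absolute error between two equally shaped frame sequences."""
    if len(pred) != len(target):
        raise ValueError("sequences must have the same number of frames")
    total, count = 0.0, 0
    for p_frame, t_frame in zip(pred, target):
        if len(p_frame) != len(t_frame):
            raise ValueError("frames must have the same dimension")
        for p, t in zip(p_frame, t_frame):
            total += abs(p - t)
            count += 1
    return total / count if count else 0.0


def multitask_loss(
    vc_pred: Sequence[Frame],
    tts_pred: Sequence[Frame],
    target: Sequence[Frame],
    tts_weight: float = 1.0,  # assumed hyperparameter, not from the paper
) -> float:
    """Weighted sum of the VC and TTS losses against the same target speech."""
    return l1_loss(vc_pred, target) + tts_weight * l1_loss(tts_pred, target)


# Toy example: two frames with two features each.
target = [[1.0, 2.0], [3.0, 4.0]]
vc_pred = [[1.5, 2.0], [3.0, 3.5]]   # output of the speech-input branch
tts_pred = [[1.0, 2.0], [3.0, 5.0]]  # output of the text-input branch
loss = multitask_loss(vc_pred, tts_pred, target, tts_weight=0.5)
print(loss)
```

In a real system the two predictions would come from networks that share parameters, so that gradients from the TTS term push the shared components toward representations that retain linguistic content. How the paper shares parameters is not stated in the abstract.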
