Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Emotional Voice Conversion Using Multitask Learning with Text-To-Speech

Abstract

Voice conversion (VC) alters a speaker's voice to suit a different style while preserving the linguistic content. Previous state-of-the-art VC systems were based on the sequence-to-sequence (seq2seq) model, which can lose linguistic information. An earlier attempt to overcome this problem used textual supervision; however, it required explicit alignment, so the benefit of using a seq2seq model was lost. This study presents a voice converter that utilizes multitask learning with text-to-speech (TTS). With multitask learning, VC is expected to capture linguistic information while preserving training stability. The method does not require explicit alignment to capture abundant text information. VC experiments were performed on a male Korean emotional text-speech dataset to convert a neutral voice to an emotional voice. The results show that multitask learning helps to preserve the linguistic content.
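
As a rough illustration of the multitask idea described in the abstract, the sketch below combines a VC loss and a TTS loss computed against the same target spectrogram. The abstract does not state the loss function, the task weighting, or the model architecture, so the L1 loss, the `tts_weight` parameter, and the function names here are all assumptions made for illustration, not the authors' method.

```python
from typing import Sequence

# A spectrogram represented as frames x bins (hypothetical toy representation).
Frames = Sequence[Sequence[float]]


def l1_loss(pred: Frames, target: Frames) -> float:
    """Mean absolute error over all frames and bins."""
    total, count = 0.0, 0
    for p_frame, t_frame in zip(pred, target):
        for p, t in zip(p_frame, t_frame):
            total += abs(p - t)
            count += 1
    return total / count


def multitask_loss(vc_pred: Frames, tts_pred: Frames, target: Frames,
                   tts_weight: float = 1.0) -> float:
    """Weighted sum of the VC loss and the TTS loss (assumed form).

    vc_pred  : output decoded from the source speech (VC path)
    tts_pred : output decoded from the text (TTS path)
    target   : the emotional target spectrogram both paths should match
    """
    vc = l1_loss(vc_pred, target)
    tts = l1_loss(tts_pred, target)
    return vc + tts_weight * tts


target = [[1.0, 2.0], [3.0, 4.0]]
vc_pred = [[1.0, 2.0], [3.0, 5.0]]
tts_pred = [[0.0, 2.0], [3.0, 4.0]]
loss = multitask_loss(vc_pred, tts_pred, target, tts_weight=0.5)
```

In a setup like this, the TTS term would push the shared parts of the model to encode text-related information, which is one plausible way the linguistic content could be preserved without explicit alignment.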