Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Emotional Voice Conversion Using Dual Supervised Adversarial Networks With Continuous Wavelet Transform F0 Features

Abstract

In emotional voice conversion (VC), the fundamental frequency (F0) is the most important feature for representing emotion, yet a simple representation of F0 is difficult to work with. To address this issue, we propose the adaptive scales continuous wavelet transform (ADS-CWT) method, which systematically captures F0 features at different temporal levels. These levels represent different prosodic aspects, ranging from micro-prosody to sentences. Moreover, in an emotional VC task, each dataset pairs labeled emotional speech with neutral speech, so the task can be regarded as a dual task. Two properties motivate our training framework. First, dual supervised learning improves training performance by leveraging the probabilistic connection between the dual tasks to enhance learning from labeled data. Second, generative adversarial networks (GANs) mitigate the over-smoothing problem that arises in the low-level data space when acoustic features are converted. We therefore present a novel training framework for emotional VC that combines GANs with dual supervised learning, which we call dual supervised adversarial networks. In emotional VC experiments with limited labeled data, we confirmed that our method achieves high similarity performance. It performs well and consistently in both objective and subjective evaluations.
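
To make the idea of F0 features at different temporal levels concrete, the following is a minimal sketch of a plain continuous wavelet transform applied to a log-F0 contour. It is not the ADS-CWT method itself, whose adaptive scale selection is not described in the abstract. The Mexican hat wavelet, the ten fixed dyadic scales, the 5 ms frame shift, the synthetic contour, and the names `mexican_hat` and `cwt_f0` are all assumptions chosen for illustration.

```python
import numpy as np


def mexican_hat(t):
    """L2-normalized Mexican hat (Ricker) mother wavelet evaluated at t."""
    norm = 2.0 / (np.sqrt(3.0) * np.pi ** 0.25)
    return norm * (1.0 - t ** 2) * np.exp(-0.5 * t ** 2)


def cwt_f0(log_f0, scales, dt=1.0):
    """Continuous wavelet transform of a log-F0 contour.

    log_f0 : 1-D array, one value per frame (interpolated over unvoiced frames).
    scales : wavelet scales, in the same time unit as dt.
    dt     : frame shift.
    Returns an array of shape (len(scales), len(log_f0)); row i is the
    component of the contour at scale scales[i].
    """
    log_f0 = np.asarray(log_f0, dtype=float)
    n = len(log_f0)
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Sample the dilated wavelet on a support of +/- 5 scales.
        half = int(np.ceil(5.0 * s / dt))
        t = np.arange(-half, half + 1) * dt / s
        kernel = mexican_hat(t) * dt / np.sqrt(s)
        # Full convolution, then keep the part aligned with the input frames.
        full = np.convolve(log_f0, kernel, mode="full")
        out[i] = full[half:half + n]
    return out


# Hypothetical example: a synthetic contour with a slow phrase-level decline
# plus a faster syllable-level oscillation, sampled every 5 ms.
frame_shift = 0.005
time = np.arange(400) * frame_shift
log_f0 = np.log(120.0) - 0.1 * time + 0.05 * np.sin(2.0 * np.pi * 4.0 * time)

# Normalize to zero mean and unit variance before the transform (an assumption).
log_f0 = (log_f0 - log_f0.mean()) / log_f0.std()

# Ten scales, one octave apart: small scales follow fast, micro-prosodic
# movement, large scales follow phrase- and sentence-level trends.
scales = [frame_shift * 2 ** (i + 1) for i in range(10)]
components = cwt_f0(log_f0, scales, dt=frame_shift)
```

Each row of `components` can then be treated as a separate prosodic feature stream. In this sketch the scales are fixed in advance, whereas the abstract describes the proposed method as using adaptive scales.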
