Venue: Conference on Multimedia Information Processing and Retrieval

Singing Voice Conversion with Non-parallel Data

Abstract

Singing voice conversion is the task of converting a song sung by a source singer into the voice of a target singer. In this paper, we propose applying a parallel-data-free, many-to-one voice conversion technique to singing voices. A phonetic posterior feature is first generated by decoding the singing voice with a robust automatic speech recognition (ASR) engine. Then, a trained recurrent neural network (RNN) with a deep bidirectional long short-term memory (DBLSTM) structure models the mapping from person-independent content to the acoustic features of the target person. F0 and aperiodicity are extracted from the original singing voice and used together with the acoustic features to reconstruct the target singing voice through a vocoder. In the resulting singing voice, the source singer sounds similar to the target singer. To our knowledge, this is the first study that uses non-parallel data to train a singing voice conversion system. Subjective evaluations demonstrate that the proposed method effectively converts singing voices.
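
The conversion stage described in the abstract can be sketched roughly as follows. This is only an illustration of the data flow, not the authors' implementation: the weights are random rather than trained, the ASR front end and the vocoder are omitted, and all dimensions and names (`D_PPG`, `HIDDEN`, `D_SPEC`, `convert`, `dblstm_forward`) are hypothetical choices made for the example. Phonetic posteriors go through a stacked bidirectional LSTM to produce target-side spectral features, while F0 and aperiodicity are carried over from the source recording unchanged.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_pass(x, W, U, b, reverse=False):
    """Run one LSTM direction over x of shape (T, D_in); return states (T, H)."""
    T, H = x.shape[0], U.shape[0]
    h, c = np.zeros(H), np.zeros(H)
    out = np.zeros((T, H))
    steps = range(T - 1, -1, -1) if reverse else range(T)
    for t in steps:
        z = x[t] @ W + h @ U + b            # pre-activations, shape (4H,)
        i, f, g, o = np.split(z, 4)         # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        out[t] = h
    return out

def init_blstm(rng, d_in, hidden):
    """Random (untrained) weights for one bidirectional layer: (forward, backward)."""
    def one_direction():
        return (rng.normal(0.0, 0.1, (d_in, 4 * hidden)),
                rng.normal(0.0, 0.1, (hidden, 4 * hidden)),
                np.zeros(4 * hidden))
    return one_direction(), one_direction()

def dblstm_forward(ppg, layers, W_out, b_out):
    """Stacked bidirectional LSTM followed by a linear output layer."""
    h = ppg
    for fwd, bwd in layers:
        h = np.concatenate(
            [lstm_pass(h, *fwd), lstm_pass(h, *bwd, reverse=True)], axis=1)
    return h @ W_out + b_out

def convert(ppg, f0, aperiodicity, model):
    """Map phonetic posteriors to spectral features; pass F0 and aperiodicity through."""
    spectral = dblstm_forward(ppg, *model)
    # F0 and aperiodicity are taken from the source singing voice as-is.
    return {"spectral": spectral, "f0": f0, "aperiodicity": aperiodicity}

# Hypothetical sizes: 50 frames, 40 phonetic classes, 25 spectral coefficients.
rng = np.random.default_rng(0)
T, D_PPG, HIDDEN, D_SPEC = 50, 40, 16, 25

logits = rng.normal(size=(T, D_PPG))
ppg = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # stand-in for ASR output
f0 = np.full(T, 220.0)                                            # stand-in F0 contour (Hz)
aperiodicity = rng.uniform(0.0, 1.0, size=(T, 5))                 # stand-in band aperiodicity

layers = [init_blstm(rng, D_PPG, HIDDEN), init_blstm(rng, 2 * HIDDEN, HIDDEN)]
model = (layers, rng.normal(0.0, 0.1, (2 * HIDDEN, D_SPEC)), np.zeros(D_SPEC))

features = convert(ppg, f0, aperiodicity, model)
print(features["spectral"].shape)  # (50, 25)
```

In a real system the three returned streams would be handed to a vocoder to synthesize the waveform, and the network weights would come from training on the target singer's recordings only, which is what makes parallel source-target data unnecessary.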
