Conference: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference

MPop600: A Mandarin Popular Song Database with Aligned Audio, Lyrics, and Musical Scores for Singing Voice Synthesis

Abstract

The purpose of singing voice synthesis (SVS) is to generate a human-like singing voice from lyrics and the corresponding musical score. Mainstream SVS approaches now rely on neural networks (NNs) that map linguistic and musical contextual factors to acoustic features, from which the audio output is produced. For SVS in Mandarin or other Chinese languages in particular, a sufficiently large and adequately labeled database has not been publicly available. To advance Mandarin SVS research, we built a singing voice database from scratch, with 600 pop songs sung by 2 male and 2 female vocalists. Each recording contains a single vocal only, without any background music. This paper describes the recording of the dataset and the data preprocessing steps needed to train NNs to perform SVS. Several simple neural network architectures were adopted so that preliminary SVS performance could be compared. Both subjective and objective evaluations show that these networks could learn from the MPop600 database to generate singing voice from unseen musical scores. MPop600 is available in both MIDI and MusicXML formats. We believe that more advanced and recently developed networks can be applied in the future to model the singing behaviors in this database and help advance research in Mandarin SVS.
