MPop600: A Mandarin Popular Song Database with Aligned Audio, Lyrics, and Musical Scores for Singing Voice Synthesis

Abstract

The purpose of singing voice synthesis (SVS) is to generate human-like singing voice from lyrics and the corresponding musical score. Nowadays, mainstream SVS approaches rely on neural networks (NNs), which map linguistic and musical contextual factors to acoustic features for producing audio output. For SVS in Mandarin or other Chinese languages in particular, a sufficiently large and adequately labeled database has not been publicly available. To proceed with Mandarin SVS research, we built a singing voice database from scratch, with 600 pop songs sung by 2 male and 2 female vocalists. Each recording contains a single vocal only, without any background music. This paper describes the recording of the dataset and the preprocessing steps necessary for training NNs to perform SVS. Several simple neural network architectures were adopted so that preliminary SVS performance could be compared. Both subjective and objective evaluations show that these networks could learn from the MPop600 database to generate singing voice from unseen musical scores. MPop600 is available in both MIDI and MusicXML formats. We believe that more advanced and recently developed networks can in the future be applied to model the singing behaviors in this database and help advance research in Mandarin SVS.

Bibliographic details

Source
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2020 | pp. 1647-1652 | 6 pages
Authors
Chan-Chuan Chu; Fu-Rong Yang; Yi-Jhe Lee; Yi-Wen Liu; Shan-Hung Wu
Original format: PDF
Keywords
Music; Databases; Feature extraction; Neurons; Hidden Markov models; Artificial neural networks; Vocoders

Date indexed: 2022-08-26 13:55:15

Similar literature

1. Can Genre Be "Heard" in Scale as Well as Song Tasks? An Exploratory Study of Female Singing in Western Lyric and Musical Theater Styles [J]. Kayes Gillyanne, Welch Graham F. Journal of Voice: Official Journal of the Voice Foundation. 2017, No. 3

2. Integration of a music generator and a song lyrics generator to create Spanish popular songs [J]. Navarro-Caceres Maria, Oliveira Hugo Goncalo, Martins Pedro. Journal of Ambient Intelligence and Humanized Computing. 2020, No. 11

3. A HMM-based Mandarin Chinese singing voice synthesis system [J]. X. Li, Z. Wang. IEEE/CAA Journal of Automatica Sinica. 2016, No. 2

4. Transcribing Lyrics from Commercial Song Audio: the First Step Towards Singing Content Processing [C]. Che-Ping Tsai, Yi-Lin Tuan, Lin-Shan Lee. IEEE International Conference on Acoustics, Speech and Signal Processing. 2018

5. "Heroic verse and sweet lyric song": George Frederic Handel's treatment of English literature in his musical drama. [D]. Eastman, Holly Ann. 1992

6. Tohoku Kiritan singing database: A singing database for statistical parametric singing synthesis using Japanese pop songs [O]. Itsuki Ogawa, Masanori Morise. 2021
