Automatic Lyrics-to-audio Alignment on Polyphonic Music Using Singing-adapted Acoustic Models

Abstract

Lyrics-to-audio alignment is the task of automatically aligning lyrical words with mixed singing audio (singing voice plus musical accompaniment). Such alignment can be achieved with an automatic speech recognition (ASR) system. We propose to adapt the acoustic model of a speech recognizer towards solo singing voice. This avoids the hurdle of annotating a large polyphonic music training dataset. Moreover, lexicon-modification-based duration modelling is incorporated to account for the long-duration vowels in singing. As practical applications demand alignment on polyphonic music, we study the effect of different singing vocal separation methods on lyrics-to-audio alignment in polyphonic music. The extracted vocals are force-aligned with the singing-adapted models. Through experiments, we demonstrate that the use of an audio source separation method and effective end-pointing of the songs have a high impact on alignment performance. We report a mean average absolute error of 3.87 seconds, which is comparable with the state-of-the-art lyrics-to-audio alignment system that is trained on a large polyphonic music database.
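
The abstract does not spell out how the lexicon is modified for duration modelling. As a rough illustration only, one possible reading is that each word's pronunciation gains extra variants in which vowel phones are repeated, so that a forced aligner can absorb sustained vowels. The sketch below assumes ARPAbet phone symbols without stress markers; the function name `duration_variants`, the vowel set `VOWELS`, and the `max_repeat` parameter are hypothetical and not taken from the paper.

```python
from itertools import product

# Assumed phone inventory: ARPAbet vowels with stress markers stripped.
VOWELS = {
    "AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
    "EY", "IH", "IY", "OW", "OY", "UH", "UW",
}


def duration_variants(phones, max_repeat=3):
    """Return pronunciation variants in which each vowel is repeated
    between 1 and max_repeat times; consonants are left unchanged.

    This is an illustrative guess at lexicon-based duration modelling,
    not the authors' documented procedure.
    """
    # For every phone, list the alternative chunks it may expand to.
    options = [
        [[p] * n for n in range(1, max_repeat + 1)] if p in VOWELS else [[p]]
        for p in phones
    ]
    # Take one chunk per phone and flatten each combination into a variant.
    return [
        [p for chunk in combo for p in chunk]
        for combo in product(*options)
    ]


if __name__ == "__main__":
    # Print lexicon-style lines for the word "love" (L AH V).
    for variant in duration_variants(["L", "AH", "V"], max_repeat=3):
        print("love", " ".join(variant))
```

Under these assumptions a word with one vowel yields `max_repeat` variants, and a word with k vowels yields `max_repeat ** k`, so a real lexicon would need a small repeat limit to stay manageable.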

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2019 | pp. 396-400 | 5 pages
Authors
Bidisha Sharma; Chitralekha Gupta; Haizhou Li; Ye Wang
Original format: PDF
Keywords
Source separation; Hidden Markov models; Adaptation models; Data models; Music; Training

Indexed: 2022-08-26 14:45:59

Similar literature

1. Automatic Transcription of Polyphonic Piano Music Using Genetic Algorithms, Adaptive Spectral Envelope Modeling, and Dynamic Noise Level Estimation [J]. Reis G., Fernandez de Vega F., Ferreira A. IEEE Transactions on Audio, Speech, and Language Processing. 2012, No. 8
2. Automatic Encoding of Polyphonic Melodies in Musicians and Nonmusicians [J]. Takako Fujioka, Laurel J. Trainor, Bernhard Ross. Journal of Cognitive Neuroscience. 2005, No. 10
3. Towards Automatic Music Transcription: Extraction of MIDI-Data out of Polyphonic Piano Music [J]. Jens Wellhausen. Journal of Systemics, Cybernetics and Informatics. 2005, No. 3
4. Automatic Lyrics-to-audio Alignment on Polyphonic Music Using Singing-adapted Acoustic Models [C]. Bidisha Sharma, Chitralekha Gupta, Haizhou Li. IEEE International Conference on Acoustics, Speech and Signal Processing. 2019
5. Neural Networks for Automatic Polyphonic Piano Music Transcription [D]. Ender, Johnathon Michael. 2018
6. Using forced alignment for automatic acoustic-phonetic segmentation of aphasic discourse [O]. Alice Lee, Anthony Pak Hin Kong, Sam-Po Law
7. Acoustic Modeling for Automatic Lyrics-to-Audio Alignment [O]. Chitralekha Gupta, Emre Yılmaz, Haizhou Li. 2019
