An End-to-End Neural Network for Polyphonic Piano Music Transcription

S. Sigtia; E. Benetos; S. Dixon

Abstract

We present a supervised neural network model for polyphonic piano music transcription. The architecture of the proposed model is analogous to speech recognition systems and comprises an acoustic model and a music language model. The acoustic model is a neural network used for estimating the probabilities of pitches in a frame of audio. The language model is a recurrent neural network that models the correlations between pitch combinations over time. The proposed model is general and can be used to transcribe polyphonic music without imposing any constraints on the polyphony. The acoustic and language model predictions are combined using a probabilistic graphical model. Inference over the output variables is performed using the beam search algorithm. We perform two sets of experiments. We investigate various neural network architectures for the acoustic models and also investigate the effect of combining acoustic and music language model predictions using the proposed architecture. We compare performance of the neural network-based acoustic models with two popular unsupervised acoustic models. Results show that convolutional neural network acoustic models yield the best performance across all evaluation metrics. We also observe improved performance with the application of the music language models. Finally, we present an efficient variant of beam search that improves performance and reduces run-times by an order of magnitude, making the model suitable for real-time applications.
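
The decoding idea in the abstract (per-frame pitch probabilities from an acoustic model, rescored by a language model over pitch combinations, with beam search over the output sequence) can be illustrated with a small sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the function names, the toy `smoothness_lm` that stands in for the recurrent language model, the plain sum of log-scores used to combine the two models, and the exhaustive enumeration of candidate frames, which is only feasible for a handful of pitches.

```python
import itertools
import math
from typing import Callable, List, Sequence, Tuple

Frame = Tuple[int, ...]  # one binary pitch-activation vector (1 = pitch sounding)
LanguageModel = Callable[[Sequence[Frame], Frame], float]


def acoustic_log_prob(probs: Sequence[float], frame: Frame) -> float:
    """Log-probability of a binary frame, treating pitches as independent."""
    eps = 1e-9  # guards against log(0)
    return sum(
        math.log(max(p, eps)) if on else math.log(max(1.0 - p, eps))
        for p, on in zip(probs, frame)
    )


def smoothness_lm(history: Sequence[Frame], frame: Frame) -> float:
    """Hypothetical stand-in for an RNN language model.

    Assumes each pitch keeps its previous on/off state with probability 0.8.
    """
    if not history:
        return len(frame) * math.log(0.5)  # uniform prior for the first frame
    prev = history[-1]
    return sum(math.log(0.8 if a == b else 0.2) for a, b in zip(prev, frame))


def beam_search(
    acoustic_probs: Sequence[Sequence[float]],
    lm: LanguageModel,
    beam_width: int = 4,
) -> List[Frame]:
    """Return the highest-scoring frame sequence found by beam search."""
    n_pitches = len(acoustic_probs[0])
    # Toy assumption: enumerate all 2**n_pitches frames as candidates.
    candidates = list(itertools.product((0, 1), repeat=n_pitches))
    beam: List[Tuple[float, List[Frame]]] = [(0.0, [])]
    for probs in acoustic_probs:
        expanded = []
        for score, seq in beam:
            for frame in candidates:
                # Assumed combination rule: add acoustic and language log-scores.
                new_score = score + acoustic_log_prob(probs, frame) + lm(seq, frame)
                expanded.append((new_score, seq + [frame]))
        # Keep only the beam_width best partial transcriptions.
        expanded.sort(key=lambda item: item[0], reverse=True)
        beam = expanded[:beam_width]
    return beam[0][1]


if __name__ == "__main__":
    # Two pitches, three frames; pitch 0 is uncertain in the middle frame.
    probs = [[0.9, 0.1], [0.45, 0.1], [0.9, 0.1]]
    print(beam_search(probs, smoothness_lm))
    print(beam_search(probs, lambda history, frame: 0.0))
```

In this toy input the acoustic scores alone switch pitch 0 off in the middle frame, while the smoothness prior keeps it on across all three frames, which is the kind of correction a language model over pitch combinations is meant to provide.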

Bibliographic details

Source: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2016, no. 5, pp. 927-939 (13 pages)
Authors: S. Sigtia; E. Benetos; S. Dixon
Original format: PDF
Language: English
Keywords: automatic music transcription; deep learning; music language models; recurrent neural networks
