Conference: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference

Music chord recognition from audio data using bidirectional encoder-decoder LSTMs


Abstract

In this paper, we discuss methods for chord recognition based on long short-term memory recurrent neural networks (LSTM, LSTM-RNN). Chord progressions play an important role in the generation process of music. In fact, music processing systems that contain a model for chord progressions achieve high accuracy in tasks such as music structure analysis, multi-pitch analysis, and automatic composition or accompaniment. In previous research, chord progressions were obtained with rule-based methods or modeled using stochastic methods such as hidden Markov models or probabilistic context-free grammars. Pitch patterns were then regarded as the observations resulting from the hidden states of the chord progression model. Recently, convolutional neural networks have been used for chord recognition with considerable success. On the other hand, LSTM networks have been shown to be suitable for generating chord progressions, since these neural networks can process time series data very well. The purpose of this study is to evaluate and compare three types of LSTM networks based on the bidirectional and encoder-decoder structures with regard to their chord recognition performance. To extract more effective data for chord recognition, we use a constant-Q transform and specmurt analysis to suppress overtone components, and chroma vectorization to reduce the feature dimensionality. The evaluation results show that the encoder-decoder-based LSTM can learn the relationship between the observed chroma vectors and the associated chord progression more effectively than simpler LSTM networks.
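
The chroma vectorization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the constant-Q bins are ordered from low to high pitch, that bin 0 corresponds to a C, and that each frame is normalized to sum to 1. None of these details are given in the abstract, and the function name `fold_to_chroma` is hypothetical. The specmurt overtone suppression step is not shown.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F",
                 "F#", "G", "G#", "A", "A#", "B"]


def fold_to_chroma(cqt_frame, bins_per_octave=12):
    """Fold one frame of constant-Q magnitudes into a 12-bin chroma vector.

    Assumptions of this sketch (not taken from the paper): bins are ordered
    from lowest to highest pitch, bin 0 is a C, and consecutive groups of
    bins_per_octave // 12 bins belong to the same semitone.
    """
    if bins_per_octave % 12 != 0:
        raise ValueError("bins_per_octave must be a multiple of 12")
    bins_per_semitone = bins_per_octave // 12

    chroma = [0.0] * 12
    for k, magnitude in enumerate(cqt_frame):
        # Map the bin to a semitone index, then wrap it into one octave.
        pitch_class = (k // bins_per_semitone) % 12
        chroma[pitch_class] += magnitude

    # Normalize so that frames with different loudness are comparable.
    total = sum(chroma)
    if total > 0:
        chroma = [c / total for c in chroma]
    return chroma


# Toy frame: 3 octaves x 12 bins, with energy on C, E and G
# placed in three different octaves (a spread C major triad).
frame = [0.0] * 36
frame[0] = 1.0    # C, lowest octave
frame[16] = 1.0   # E, middle octave (12 + 4)
frame[31] = 1.0   # G, highest octave (24 + 7)

chroma = fold_to_chroma(frame)
active = [PITCH_CLASSES[i] for i, c in enumerate(chroma) if c > 0]
print(active)
```

In this toy case the 36-dimensional frame is reduced to 12 dimensions, and the three notes fall onto the pitch classes C, E and G regardless of octave. A sequence of such vectors is the kind of input that a sequence model such as an LSTM would consume.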
