This paper develops a dictionary-based model for summarizing local feature descriptors of music computed over time. Compared with a holistic representation, this text-like, bag-of-frames representation better captures the rich, time-varying information in music. However, the dictionary used in the classical bag-of-frames model captures only frame-level elements of the music, so a semantic gap remains between the dictionary elements and common music descriptions. To reduce this gap, we propose a new feature representation called dual-layer bag-of-frames. It models music with a two-layer structure, in which the first-layer dictionary captures frame-level characteristics and the second-layer dictionary captures segment-level semantics. This hierarchical structure resembles the alphabet-word-document structure of text. Our results demonstrate that the proposed dual-layer bag-of-frames feature achieves state-of-the-art accuracy in music genre classification: on the GTZAN benchmark, classification accuracy reaches 86.7% with a dictionary trained on GTZAN, and 83.6% with a dictionary trained on a different data set, USPOP.
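The two-layer encoding described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the codebooks are assumed to be pre-trained (e.g., by k-means), vector quantization stands in for whatever assignment scheme the paper uses, and segment length and histogram normalization are arbitrary choices.

```python
# Hedged sketch of a dual-layer bag-of-frames encoding.
# Assumptions: codebooks cb1/cb2 are given (learned elsewhere), frames are
# fixed-length descriptor vectors, and segments are non-overlapping windows.
import math

def nearest(vec, codebook):
    """Index of the codebook entry closest to vec (Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(vec, codebook[i]))

def histogram(indices, size):
    """L1-normalized histogram of codeword indices."""
    h = [0.0] * size
    for i in indices:
        h[i] += 1.0
    total = sum(h) or 1.0
    return [v / total for v in h]

def dual_layer_bof(frames, cb1, cb2, seg_len):
    # Layer 1: quantize each frame-level descriptor ("alphabet").
    codes = [nearest(f, cb1) for f in frames]
    # Group frame codes into segments; each segment becomes a
    # first-layer histogram ("word").
    seg_hists = [histogram(codes[s:s + seg_len], len(cb1))
                 for s in range(0, len(codes), seg_len)]
    # Layer 2: quantize the segment histograms against the second-layer
    # dictionary, then pool into one song-level histogram ("document").
    seg_codes = [nearest(h, cb2) for h in seg_hists]
    return histogram(seg_codes, len(cb2))

# Toy usage: 2-D frame descriptors, two frame-level codewords,
# three segment-level codewords, segments of two frames.
song = dual_layer_bof(
    frames=[[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]],
    cb1=[[0.0, 0.0], [1.0, 1.0]],
    cb2=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    seg_len=2,
)
```

The resulting song-level histogram is what a classifier (e.g., an SVM) would consume for genre classification.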