Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Sample-Level CNN Architectures for Music Auto-Tagging Using Raw Waveforms

Abstract

Recent work has shown that the end-to-end approach using convolutional neural networks (CNNs) is effective in various types of machine learning tasks. For audio signals, the approach takes raw waveforms as input using a 1-D convolution layer. In this paper, we improve the 1-D CNN architecture for music auto-tagging by adopting building blocks from state-of-the-art image classification models, ResNets and SENets, and adding multi-level feature aggregation to it. We compare different combinations of these modules in building CNN architectures. The results show that the resulting architectures achieve significant improvements over previous state-of-the-art models on the MagnaTagATune dataset and comparable results on the Million Song Dataset. Furthermore, we analyze and visualize our model to show how the 1-D CNN operates.
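
The ingredients named in the abstract (1-D convolution on raw samples, a ResNet-style skip connection, SENet-style channel gating, and multi-level feature aggregation) can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: the kernel size of 3, the pooling size of 3, the ordering of operations inside a block, the reduction ratio, the use of global max pooling for aggregation, the random weights, and every function name below are assumptions made for the example.

```python
import math
import random

def conv1d_same(x, weights, bias):
    """Zero-padded 1-D convolution; x is [in_ch][time], weights is [out_ch][in_ch][k]."""
    pad = len(weights[0][0]) // 2
    length = len(x[0])
    out = []
    for w_out, b in zip(weights, bias):
        row = []
        for t in range(length):
            acc = b
            for ch, w_in in enumerate(w_out):
                for j, w in enumerate(w_in):
                    idx = t + j - pad
                    if 0 <= idx < length:
                        acc += w * x[ch][idx]
            row.append(acc)
        out.append(row)
    return out

def relu(x):
    return [[max(0.0, v) for v in row] for row in x]

def squeeze_excite(x, w_down, w_up):
    """SE gating: per-channel mean -> bottleneck layer -> sigmoid gate -> rescale channels."""
    squeezed = [sum(row) / len(row) for row in x]
    hidden = [max(0.0, sum(w * s for w, s in zip(ws, squeezed))) for ws in w_down]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(ws, hidden))))
             for ws in w_up]
    return [[g * v for v in row] for g, row in zip(gates, x)]

def max_pool1d(x, size):
    """Non-overlapping max pooling along time."""
    return [[max(row[i:i + size]) for i in range(0, len(row) - size + 1, size)]
            for row in x]

def res_se_block(x, params, pool=3):
    """Assumed block layout: conv -> ReLU -> SE gating -> add skip input -> max pool."""
    y = relu(conv1d_same(x, params["conv_w"], params["conv_b"]))
    y = squeeze_excite(y, params["se_down"], params["se_up"])
    y = [[a + b for a, b in zip(ry, rx)] for ry, rx in zip(y, x)]
    return max_pool1d(y, pool)

def init_params(channels, kernel=3, reduction=2, seed=0):
    """Random weights for one block (hypothetical helper; a real model would be trained)."""
    rng = random.Random(seed)
    hidden = max(1, channels // reduction)
    def rand():
        return rng.uniform(-0.5, 0.5)
    return {
        "conv_w": [[[rand() for _ in range(kernel)] for _ in range(channels)]
                   for _ in range(channels)],
        "conv_b": [0.0] * channels,
        "se_down": [[rand() for _ in range(channels)] for _ in range(hidden)],
        "se_up": [[rand() for _ in range(hidden)] for _ in range(channels)],
    }

def extract_features(waveform, stem_w, stem_b, blocks):
    """Multi-level aggregation: global max pool after every block, then concatenate."""
    x = relu(conv1d_same(waveform, stem_w, stem_b))  # lift 1 channel to C channels
    features = []
    for params in blocks:
        x = res_se_block(x, params)
        features.extend(max(row) for row in x)
    return features

rng = random.Random(1)
channels = 4
waveform = [[rng.uniform(-1.0, 1.0) for _ in range(81)]]  # one channel of raw samples
stem_w = [[[rng.uniform(-0.5, 0.5) for _ in range(3)]] for _ in range(channels)]
stem_b = [0.0] * channels
blocks = [init_params(channels, seed=s) for s in range(3)]
features = extract_features(waveform, stem_w, stem_b, blocks)
print(len(features))
```

With 81 input samples and three blocks that each pool by 3, the time axis shrinks from 81 to 27, 9, and 3 samples, and the concatenated feature vector holds one value per channel per block. A tagging model would pass such a vector to a classifier with one sigmoid output per tag; that final layer is omitted here.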
