Conference: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

A Two-Stage Approach to Device-Robust Acoustic Scene Classification

Abstract

To improve device robustness, a highly desirable feature of a competitive data-driven acoustic scene classification (ASC) system, a novel two-stage system based on fully convolutional neural networks (CNNs) is proposed. Our two-stage system leverages an ad-hoc score combination of two CNN classifiers: (i) the first CNN classifies acoustic inputs into one of three broad classes, and (ii) the second CNN classifies the same inputs into one of ten finer-grained classes. Three different CNN architectures are explored to implement the two-stage classifiers, and a frequency sub-sampling scheme is investigated. Novel data augmentation schemes for ASC are also investigated. Evaluated on DCASE 2020 Task 1a, the proposed ASC system attains state-of-the-art accuracy on the development set: our best system, a two-stage fusion of CNN ensembles, delivers an 81.9% average accuracy on multi-device test data and obtains a significant improvement on unseen devices. Finally, neural saliency analysis with class activation mapping (CAM) gives new insights into the patterns learnt by our models.
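The abstract does not spell out the ad-hoc score combination, so the following is only a minimal sketch of one plausible form. It assumes a product rule, in which each fine-grained class score is weighted by the score of its parent broad class, and it assumes a grouping of the ten scene labels into indoor, outdoor and transportation. The names `FINE_TO_BROAD`, `fuse_scores` and `predict` are hypothetical and do not come from the paper.

```python
# Assumed mapping from ten fine-grained scene classes to three broad classes.
# The grouping is an illustrative assumption, not taken from the abstract.
FINE_TO_BROAD = {
    "airport": "indoor",
    "shopping_mall": "indoor",
    "metro_station": "indoor",
    "street_pedestrian": "outdoor",
    "public_square": "outdoor",
    "street_traffic": "outdoor",
    "park": "outdoor",
    "tram": "transportation",
    "bus": "transportation",
    "metro": "transportation",
}


def fuse_scores(broad_probs, fine_probs):
    """Weight each fine-class probability by its parent broad-class probability.

    The product rule used here is an assumption; the abstract only says
    that the two classifiers' scores are combined in an ad-hoc way.
    """
    fused = {
        fine: p * broad_probs[FINE_TO_BROAD[fine]]
        for fine, p in fine_probs.items()
    }
    total = sum(fused.values())
    # Renormalise so the fused scores sum to one.
    return {fine: score / total for fine, score in fused.items()}


def predict(broad_probs, fine_probs):
    """Return the fine-grained class with the highest fused score."""
    fused = fuse_scores(broad_probs, fine_probs)
    return max(fused, key=fused.get)


# Example: the ten-class model slightly prefers "metro_station", but the
# three-class model is confident the clip was recorded in a vehicle.
broad = {"indoor": 0.2, "outdoor": 0.1, "transportation": 0.7}
fine = {name: 0.05 for name in FINE_TO_BROAD}
fine["metro_station"] = 0.32
fine["metro"] = 0.28

print(predict(broad, fine))
```

In this example the broad classifier overrides the fine classifier's marginal preference, which illustrates how a coarse decision could stabilise a fine-grained one when recordings come from different devices.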
