Journal: Pattern Recognition Letters

SVD-based redundancy removal in 1-D CNNs for acoustic scene classification


Abstract

In this letter, we propose a concise feature representation framework for acoustic scene classification that prunes embeddings obtained from SoundNet, a deep convolutional neural network. We demonstrate that the feature maps generated at various layers of SoundNet are redundant. The proposed singular value decomposition (SVD) based method reduces this redundancy, relying on the assumption that useful feature maps produced by different classes lie along different directions. Feature maps that produce similar embeddings for different classes are therefore ignored. When an ensemble of classifiers is used on the various layers of SoundNet, pruning the redundant feature maps reduces both dimensionality and computational complexity. Our experiments on acoustic scene classification demonstrate that ignoring 73% of the feature maps reduces performance by less than 1%, with a 12.67% reduction in computational complexity. In addition, we show that the proposed pruning framework can be used to remove filters from the SoundNet architecture, giving a 13x smaller model storage requirement. The number of parameters is reduced from 28 million to 2 million with marginal degradation in performance. The small model obtained after applying the proposed pruning procedure is evaluated on different acoustic scene classification datasets and shows excellent generalization ability. (c) 2020 Elsevier B.V. All rights reserved.
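
The abstract does not spell out the pruning rule, so the following is only a minimal sketch of one possible reading of it, not the authors' algorithm. It assumes that each feature map yields one mean embedding per class, and it scores a feature map by how much of the singular-value energy of its class-embedding matrix lies outside the leading direction. Under that assumption, a score near 0 means the classes produce nearly collinear (similar) embeddings, so the map is a candidate for removal. The function names, the array layout, and the `keep_fraction` parameter are all hypothetical.

```python
import numpy as np


def direction_spread(class_embeddings):
    """Score one feature map (hypothetical criterion, not from the paper).

    class_embeddings: array of shape (n_classes, dim), one row per class.
    Returns a value in [0, 1); 0 means all class embeddings are collinear.
    """
    # Normalise rows so the score depends on direction, not magnitude.
    norms = np.linalg.norm(class_embeddings, axis=1, keepdims=True)
    unit = class_embeddings / np.maximum(norms, 1e-12)
    # Singular values only; the singular vectors are not needed here.
    s = np.linalg.svd(unit, compute_uv=False)
    energy = s ** 2
    total = energy.sum()
    if total == 0.0:
        return 0.0  # an all-zero map carries no class information
    # Fraction of energy outside the dominant direction.
    return float(1.0 - energy[0] / total)


def select_feature_maps(embeddings, keep_fraction):
    """Keep the feature maps whose class embeddings are most spread out.

    embeddings: array of shape (n_maps, n_classes, dim).
    keep_fraction: assumed hyperparameter, fraction of maps to retain.
    Returns (sorted indices of kept maps, per-map scores).
    """
    scores = np.array([direction_spread(m) for m in embeddings])
    n_keep = max(1, int(round(keep_fraction * len(scores))))
    keep = np.argsort(scores)[::-1][:n_keep]
    return np.sort(keep), scores


if __name__ == "__main__":
    # Toy data: map 0 is collinear across classes, map 1 is orthogonal.
    redundant = np.outer([1.0, 2.0, 3.0], [1.0, 0.0, 0.0, 0.0])
    diverse = np.eye(3, 4)
    kept, scores = select_feature_maps(np.stack([redundant, diverse]), 0.5)
    print("kept maps:", kept, "scores:", scores)
```

In this sketch, a `keep_fraction` of 0.27 would correspond to ignoring 73% of the feature maps as reported in the abstract, although how the letter actually chooses which maps to drop is not stated here.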
