...

Attention based convolutional recurrent neural network for environmental sound classification

Abstract

Environmental sound classification (ESC) is a challenging problem due to the complexity of sounds. Classification performance depends heavily on the effectiveness of the representative features extracted from the environmental sounds. However, ESC often suffers from semantically irrelevant frames and silent frames. To deal with this, we employ a frame-level attention model that focuses on the semantically relevant frames and salient frames. Specifically, we first propose a convolutional recurrent neural network to learn spectro-temporal features and temporal correlations. Then, we extend our convolutional RNN model with a frame-level attention mechanism to learn discriminative feature representations for ESC. We investigated the classification performance when using different attention scaling functions and when applying attention at different layers. Experiments were conducted on the ESC-50 and ESC-10 datasets. The experimental results demonstrated the effectiveness of the proposed method, which achieved state-of-the-art or competitive classification accuracy with lower computational complexity. We also visualized our attention results and observed that the proposed attention mechanism was able to lead the network to focus on the semantically relevant parts of environmental sounds. (c) 2020 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
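
The abstract does not spell out how the frame-level attention is computed, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes that each frame feature vector (for example, an RNN output) is scored against a hypothetical learned vector `w`, that the scores are scaled with either a softmax or a sigmoid (two plausible "attention scaling functions"), and that the weighted frames are summed over time. All names here (`frame_attention_pool`, `w`, `scaling`) are illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Scale scores so they are positive and sum to 1 across frames."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(scores: Sequence[float]) -> List[float]:
    """Squash each score independently into (0, 1)."""
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]


def frame_attention_pool(
    frames: Sequence[Sequence[float]],
    w: Sequence[float],
    scaling: str = "softmax",
) -> Tuple[List[float], List[float]]:
    """Weight each frame by a scaled relevance score, then sum over time.

    frames: T frame-level feature vectors, each of length D.
    w: hypothetical learned scoring vector of length D (an assumption).
    """
    # One scalar relevance score per frame: dot product with w.
    scores = [sum(x * y for x, y in zip(frame, w)) for frame in frames]
    weights = softmax(scores) if scaling == "softmax" else sigmoid(scores)
    dim = len(frames[0])
    # Attention-weighted sum over the time axis gives one clip-level vector.
    pooled = [sum(a * frame[d] for a, frame in zip(weights, frames)) for d in range(dim)]
    return pooled, weights


# Toy input: three frames, of which only the middle one carries energy.
frames = [[0.0, 0.0], [2.0, 1.0], [0.1, 0.0]]
pooled, weights = frame_attention_pool(frames, w=[1.0, 1.0])
print(weights)  # the largest weight falls on the middle frame
```

With softmax scaling the weights compete across frames, so near-silent frames are suppressed relative to salient ones; with sigmoid scaling each frame is gated independently. Which choice the paper found better is not stated in the abstract.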

Bibliographic details

  • Source
    Neurocomputing | 2021, Issue 17 | pp. 896-903 | 8 pages
  • Author affiliations

    Shanghai Univ Shanghai Inst Adv Commun & Data Sci Shanghai 200444 Peoples R China;

    Shanghai Univ Shanghai Inst Adv Commun & Data Sci Shanghai 200444 Peoples R China;

    Shanghai Univ Shanghai Inst Adv Commun & Data Sci Shanghai 200444 Peoples R China;

    Shanghai Univ Shanghai Inst Adv Commun & Data Sci Shanghai 200444 Peoples R China;

    Shanghai Univ Shanghai Inst Adv Commun & Data Sci Shanghai 200444 Peoples R China;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Environmental sound classification; Convolutional recurrent neural network; Attention mechanism;
