Journal: Applied Acoustics

Robust acoustic event recognition using AVMD-PWVD time-frequency image

Abstract

Environmental sound feature extraction and classification are important signal analysis tools in many applications, such as audio surveillance, multimedia retrieval, and auditory source identification. However, the non-stationarity and discontinuity of environmental signals make quantification and classification a formidable challenge. Researchers have therefore proposed time-frequency image representations to quantify this non-stationarity, which results in higher classification accuracy. In this paper, a time-frequency representation method is proposed for environmental sound signals. Our approach consists of three stages. First, we propose an adaptive variational modal decomposition (AVMD) based on the central angular frequency difference to decompose environmental sounds into a series of modes. Second, we use the pseudo Wigner-Ville distribution (PWVD) to accurately obtain the instantaneous frequency characteristics of the mode signals. Third, time-frequency images of the sound signals are obtained by combining the mode signals with the PWVD. Finally, we feed the time-frequency image into a convolutional neural network (CNN) for classification. The method is tested on 50 classes of the Real World Computing Partnership (RWCP) Sound Scene Database in mismatched conditions. Results show that our method is robust to noise and achieves the best average recognition accuracy compared with several state-of-the-art methods under clean and various noisy conditions. (C) 2021 Elsevier Ltd. All rights reserved.
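To make the second stage more concrete, the following is a minimal sketch of a textbook discrete pseudo Wigner-Ville distribution applied to a single signal. It is not taken from the paper: the function names `analytic_signal` and `pseudo_wigner_ville`, the Hamming lag window, and the parameters `n_freq` and `win_len` are all illustrative assumptions, and the AVMD decomposition and CNN stages are not shown.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal via the FFT (zeroes the negative-frequency half)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)


def pseudo_wigner_ville(x, n_freq=256, win_len=63):
    """Sketch of a discrete PWVD; returns a real (n_freq, len(x)) array.

    Assumptions (not from the paper): Hamming lag window, odd win_len.
    Row k corresponds to frequency k * fs / (2 * n_freq).
    """
    if win_len % 2 == 0:
        raise ValueError("win_len must be odd")
    z = analytic_signal(x)
    n = len(z)
    half = win_len // 2
    h = np.hamming(win_len)  # symmetric lag window, centre at index `half`
    kernel = np.zeros((n_freq, n), dtype=complex)
    for t in range(n):
        # Largest lag that stays inside the signal, window, and FFT length.
        max_lag = min(t, n - 1 - t, half, n_freq // 2 - 1)
        lags = np.arange(-max_lag, max_lag + 1)
        # Windowed instantaneous autocorrelation z[t+m] * conj(z[t-m]).
        kernel[lags % n_freq, t] = h[half + lags] * z[t + lags] * np.conj(z[t - lags])
    # The kernel is Hermitian in the lag variable, so its FFT is real.
    return np.real(np.fft.fft(kernel, axis=0))


if __name__ == "__main__":
    fs = 8000                                  # hypothetical sampling rate
    t = np.arange(512) / fs
    tone = np.cos(2 * np.pi * 1000 * t)        # 1 kHz test tone
    tfr = pseudo_wigner_ville(tone)
    peak_bin = int(np.argmax(tfr[:, 256]))     # peak at the middle time sample
    print("peak frequency (Hz):", peak_bin * fs / (2 * tfr.shape[0]))
```

In a pipeline like the one described, a function of this kind would presumably be applied to each decomposed mode and the per-mode results combined into one image; how the authors combine them is not specified in the abstract.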
