Environmental Sound Classification Based on Multi-temporal Resolution Convolutional Neural Network Combining with Multi-level Features

Abstract

Motivated by the fact that the characteristics of different sound classes vary widely across temporal scales and hierarchical levels, a novel deep convolutional neural network (CNN) architecture is proposed for the environmental sound classification task. The network takes raw waveforms as input and uses a set of separate parallel CNNs with different convolutional filter sizes and strides to learn feature representations at multiple temporal resolutions. The architecture also aggregates hierarchical features from multiple CNN layers for classification, using direct connections between convolutional layers; this goes beyond the single-level CNN features employed by the majority of previous studies. These connections also improve the flow of information and help avoid the vanishing gradient problem. The combination of multi-level features boosts classification performance significantly. Comparative experiments are conducted on two datasets: the environmental sound classification dataset (ESC-50) and the DCASE 2017 audio scene classification dataset. The results demonstrate that the proposed method, by employing multi-temporal resolution and multi-level features, is highly effective in the classification tasks and outperforms previous methods that account only for single-level features.
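
The two ideas in the abstract, parallel branches with different filter sizes and strides, and features collected from several layers rather than only the last one, can be illustrated with a toy sketch. This is not the authors' network: the branch settings `(11, 1)`, `(51, 5)`, `(101, 10)`, the three-level depth, the random single-channel filters, and the use of global max pooling to summarize each level are all assumptions chosen for illustration, and nothing here is trained.

```python
import math
import random


def conv1d(signal, kernel, stride):
    """Valid 1-D cross-correlation of a list with a kernel at the given stride."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def branch_features(signal, kernel_size, stride, rng, levels=3):
    """Run one branch and return one pooled feature per layer (multi-level).

    The first layer's kernel size and stride set the branch's temporal
    resolution; deeper layers use small filters (an assumption).
    """
    features = []
    x = signal
    k, s = kernel_size, stride
    for _ in range(levels):
        kernel = [rng.uniform(-1.0, 1.0) for _ in range(k)]  # untrained filter
        x = relu(conv1d(x, kernel, s))
        features.append(max(x))  # global max pooling over time at this level
        k, s = 3, 1
    return features


def multi_resolution_features(signal, branch_configs, seed=0):
    """Concatenate the multi-level features of every parallel branch."""
    rng = random.Random(seed)
    combined = []
    for kernel_size, stride in branch_configs:
        combined.extend(branch_features(signal, kernel_size, stride, rng))
    return combined


# A synthetic 1000-sample "waveform" standing in for raw audio.
waveform = [math.sin(0.05 * n) + 0.5 * math.sin(0.9 * n) for n in range(1000)]
configs = [(11, 1), (51, 5), (101, 10)]  # hypothetical (filter size, stride) pairs
features = multi_resolution_features(waveform, configs)
print(len(features))  # 9: three branches, three levels each
```

In a real model each level would have many learned channels and the concatenated vector would feed a classifier; the sketch only shows how the first-layer filter size and stride control temporal resolution and how per-level features are gathered.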

Bibliographic details

Source
Pacific-Rim Conference on Multimedia | 2018 | pp. 528-537 | 10 pages
Authors
Boqing Zhu; Kele Xu; Dezhi Wang; Lilun Zhang; Bo Li; Yuxing Peng
Original format
PDF
Keywords
Audio scene classification; Multi-temporal resolution; Multi-level; Convolutional neural network

Similar literature

1. An attention-driven convolutional neural network-based multi-level spectral-spatial feature learning for hyperspectral image classification [J]. Pu Chunyu, Huang Hong, Yang Liping. Expert Systems with Applications, 2021 (Dec.)
2. Heart sound classification based on log Mel-frequency spectral coefficients features and convolutional neural networks [J]. Kui Haoran, Pan Jiahua, Zong Rong. Biomedical Signal Processing and Control, 2021 (Aug.)
3. Attention based convolutional recurrent neural network for environmental sound classification [J]. Zhang Zhichao, Xu Shugong, Zhang Shunqing. Neurocomputing, 2021 (Sep. 17)
4. Environmental Sound Classification Based on Multi-temporal Resolution Convolutional Neural Network Combining with Multi-level Features [C]. Boqing Zhu, Kele Xu, Dezhi Wang, et al. Pacific-Rim Conference on Multimedia, 2018
5. Investigation of Convolutional Neural Network Architectures for Image-based Feature Learning and Classification [D]. Ren, Johnny. 2016
6. An Adoptive Threshold-Based Multi-Level Deep Convolutional Neural Network for Glaucoma Eye Disease Detection and Classification [O]. Muhammad Aamir, Muhammad Irfan, Tariq Ali. 2020
7. Rethinking environmental sound classification using convolutional neural networks: optimized parameter tuning of single feature extraction [O]. Yousef Abd Al-Hattab, Hasan Firdaus Zaki, Amir Akramin Shafie. 2021
8. Keypoint Density-Based Region Proposal for Fine-Grained Object Detection and Classification Using Regions with Convolutional Neural Network Features [R]. Turner, J. T., Gupta, K., Morris, B. 2015
