Spectrogram based multi-task audio classification

Zeng Yuni; Mao Hua; Peng Dezhong; Yi Zhang

Abstract

Audio classification is regarded as a great challenge in pattern recognition. Although audio classification tasks are typically treated as independent, they are often related to each other, for example speaker accent recognition and speaker identification. In this paper, we propose a Deep Neural Network (DNN)-based multi-task model that exploits such relationships and handles multiple audio classification tasks simultaneously. We call our model the gated Residual Networks (GResNets) model because it integrates Deep Residual Networks (ResNets) with a gate mechanism, which extracts better representations across tasks than Convolutional Neural Networks (CNNs). Specifically, two multiplied convolutional layers replace two feed-forward convolutional layers in the ResNets. We tested our model on multiple audio classification tasks and found that the multi-task model achieves higher accuracy than task-specific models trained separately.
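The abstract only states that two multiplied convolutional layers replace two feed-forward convolutional layers inside a residual block. A minimal sketch of one way to read that is given below, using only the Python standard library. Everything specific in it is an assumption for illustration and not taken from the paper: the sigmoid on the gate branch, the 1-D signal standing in for a 2-D spectrogram, the kernel values, and the names `gated_residual_block`, `linear_head`, and `multi_task_forward`. Two task heads (hypothetical "accent" and "speaker" tasks) read the same shared representation, which is the multi-task part.

```python
import math
import random

def conv1d_same(x, kernel):
    """1-D cross-correlation with zero padding; output length equals input length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(x))]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gated_residual_block(x, feature_kernel, gate_kernel):
    """Assumed form: y = x + conv_f(x) * sigmoid(conv_g(x)).

    The two convolution outputs are multiplied element-wise, and the
    identity shortcut of a residual block is added afterwards.
    """
    features = conv1d_same(x, feature_kernel)
    gates = [sigmoid(g) for g in conv1d_same(x, gate_kernel)]
    return [xi + f * g for xi, f, g in zip(x, features, gates)]

def linear_head(h, weights):
    """Task-specific output layer: one score per class."""
    return [sum(w * hi for w, hi in zip(row, h)) for row in weights]

def multi_task_forward(x, params):
    """Shared gated residual block followed by one head per task."""
    shared = gated_residual_block(
        x, params["feature_kernel"], params["gate_kernel"])
    return {task: linear_head(shared, w)
            for task, w in params["heads"].items()}

random.seed(0)
n = 8  # length of the toy input
params = {
    "feature_kernel": [0.2, 0.5, 0.2],   # illustrative values
    "gate_kernel": [0.1, -0.3, 0.1],     # illustrative values
    "heads": {
        # hypothetical tasks: 3 accent classes, 5 speaker classes
        "accent": [[random.uniform(-1, 1) for _ in range(n)] for _ in range(3)],
        "speaker": [[random.uniform(-1, 1) for _ in range(n)] for _ in range(5)],
    },
}
x = [0.0, 1.0, 0.5, -0.5, 2.0, 0.0, -1.0, 0.3]
scores = multi_task_forward(x, params)
```

In this sketch, a feature kernel of all zeros reduces the block to the identity shortcut, and a gate kernel of all zeros fixes every gate at 0.5, which makes the block easy to sanity-check by hand. A real model would stack many such blocks over 2-D spectrograms and train all heads jointly.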

Bibliographic record

Source
Multimedia Tools and Applications | 2019, Issue 3 | pp. 3705-3722 | 18 pages
Authors
Zeng Yuni; Mao Hua; Peng Dezhong; Yi Zhang
Author affiliations

Sichuan Univ, Coll Comp Sci, Machine Intelligence Lab, Chengdu 610065, Sichuan, Peoples R China;

Sichuan Univ, Coll Comp Sci, Machine Intelligence Lab, Chengdu 610065, Sichuan, Peoples R China;

Sichuan Univ, Coll Comp Sci, Machine Intelligence Lab, Chengdu 610065, Sichuan, Peoples R China;

Sichuan Univ, Coll Comp Sci, Machine Intelligence Lab, Chengdu 610065, Sichuan, Peoples R China;

Original format: PDF
Language: English
Keywords
Multi-task learning; Convolutional neural networks; Deep residual networks; Audio classification;

