Journal: Multimedia Tools and Applications

Spectrogram based multi-task audio classification



Abstract

Audio classification is regarded as a major challenge in pattern recognition. Although audio classification tasks are typically treated as independent, many of them are closely related, for example recognizing a speaker's accent and identifying the speaker. In this paper, we propose a Deep Neural Network (DNN)-based multi-task model that exploits such relationships and handles multiple audio classification tasks simultaneously. We call our model the gated Residual Networks (GResNets) model because it integrates Deep Residual Networks (ResNets) with a gate mechanism, which extracts better representations across tasks than Convolutional Neural Networks (CNNs). Specifically, two multiplied convolutional layers replace two feed-forward convolutional layers in the ResNets. We tested our model on multiple audio classification tasks and found that the multi-task model achieves higher accuracy than task-specific models trained separately.
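The abstract describes the gate only as "two multiplied convolutional layers" inside a residual block. The following is a minimal sketch of one plausible reading, not the authors' implementation: one convolution produces features, a second produces a sigmoid gate, the two are multiplied elementwise, and the result is added to the skip connection. The sigmoid activation, the 1-D signal (a real spectrogram model would use 2-D convolutions), and all function names are assumptions made for illustration.

```python
import math


def conv1d_same(x, kernel):
    """Cross-correlate x with an odd-length kernel, zero-padded to keep the length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_residual_block(x, feature_kernel, gate_kernel):
    """Hypothetical gated residual block: x + conv_f(x) * sigmoid(conv_g(x)).

    Assumption: the gate is a sigmoid; the abstract does not specify this.
    """
    features = conv1d_same(x, feature_kernel)                   # feature branch
    gates = [sigmoid(g) for g in conv1d_same(x, gate_kernel)]   # gate branch in (0, 1)
    gated = [f * g for f, g in zip(features, gates)]            # elementwise product
    return [xi + gi for xi, gi in zip(x, gated)]                # residual (skip) connection


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    # Identity feature kernel and an all-zero gate kernel (gate = 0.5 everywhere).
    print(gated_residual_block(x, [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]))
```

With an all-zero feature kernel the block reduces to the identity mapping, which is the property residual connections are designed to preserve; the gate only scales how much of the convolved features is added back.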
