Conference: International Joint Conference on Neural Networks

Deep Feature Embedding and Hierarchical Classification for Audio Scene Classification


Abstract

In this work, we propose an approach to Acoustic Scene Classification (ASC) that combines deep feature embedding learning with hierarchical classification using a triplet loss function. On the one hand, a deep convolutional neural network is first trained to learn a feature embedding from scene audio signals. The trained network maps each input into the embedding space, transforming it into a high-level feature vector that serves as its representation. On the other hand, to exploit the structure of the scene categories, the original scene classification problem is organized into a hierarchy in which similar categories are grouped into meta-categories. Hierarchical classification is then carried out by deep neural network classifiers coupled with a triplet loss function. Our experiments show that the proposed system performs well on both the DCASE 2018 Task 1A and 1B datasets, with absolute accuracy gains of 15.6% and 16.6% over the DCASE 2018 baseline on Task 1A and 1B, respectively.
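The two ideas named in the abstract, a triplet loss over embedding vectors and a two-level (meta-category, then scene) decision, can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean distance, the margin value, the `HIERARCHY` grouping, and the function names `triplet_loss` and `hierarchical_predict` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin` (an assumed value).
    """
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)

# Hypothetical grouping of scene labels into meta-categories.
# The abstract does not list the actual hierarchy.
HIERARCHY = {
    "indoor": ["airport", "shopping_mall"],
    "vehicle": ["bus", "tram"],
}

def hierarchical_predict(meta_scores, scene_scores):
    """Pick the best meta-category, then the best scene inside it."""
    meta = max(meta_scores, key=meta_scores.get)
    scene = max(HIERARCHY[meta], key=lambda s: scene_scores[s])
    return meta, scene

if __name__ == "__main__":
    # Toy embeddings: the positive is close to the anchor, the negative is far.
    print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
    # Toy classifier scores for the two levels.
    print(hierarchical_predict(
        {"indoor": 0.2, "vehicle": 0.8},
        {"airport": 0.9, "shopping_mall": 0.1, "bus": 0.3, "tram": 0.6},
    ))
```

In this sketch the scene with the highest overall score ("airport") is not selected, because the first-level decision restricts the candidates to the chosen meta-category. That is the basic trade-off of a hierarchical decision, shown here under the assumed grouping.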
