Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Weakly Labelled AudioSet Tagging With Attention Neural Networks

Abstract

Audio tagging is the task of predicting the presence or absence of sound classes within an audio clip. Previous work in audio tagging focused on relatively small datasets and was limited to recognizing a small number of sound classes. We investigate audio tagging on AudioSet, a dataset consisting of over 2 million audio clips and 527 classes. AudioSet is weakly labelled, in that only the presence or absence of sound classes is known for each clip, whereas the onset and offset times are unknown. To address the weakly labelled audio tagging problem, we propose attention neural networks as a way to attend to the most salient parts of an audio clip. We draw a connection between attention neural networks and multiple instance learning (MIL) methods, and propose decision-level and feature-level attention neural networks for audio tagging. We investigate attention neural networks modeled by different functions, depths, and widths. Experiments on AudioSet show that the feature-level attention neural network achieves a state-of-the-art mean average precision (mAP) of 0.369, outperforming the best MIL method (0.317) and Google's deep neural network baseline (0.314). In addition, we discover that the audio tagging performance on AudioSet-embedding features has a weak correlation with the number of training samples and the quality of labels of each sound class.
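
The abstract does not spell out the attention formulation, so the following is only a minimal sketch of what decision-level attention pooling over a weakly labelled clip might look like. It assumes that each segment gets a sigmoid class probability, that the attention weights are softmax-normalized across segments for each class, and that the clip-level prediction is the attention-weighted sum of the segment predictions. The function name `decision_level_attention`, the single linear layers `w_cla` and `w_att`, the random inputs, and the shapes (10 segments of 128-dimensional embeddings) are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z, axis):
    # Subtract the max for numerical stability before exponentiating.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def decision_level_attention(segments, w_cla, w_att):
    """Pool segment-level predictions into one clip-level prediction.

    segments: (T, D) array, one D-dimensional embedding per segment.
    w_cla, w_att: (D, K) weights of the hypothetical linear classifier
        and attention layers (assumed here; a real model would be deeper).
    Returns a (K,) array of clip-level class probabilities.
    """
    seg_probs = sigmoid(segments @ w_cla)        # (T, K) per-segment predictions
    attention = softmax(segments @ w_att, axis=0)  # (T, K), sums to 1 over segments
    return (attention * seg_probs).sum(axis=0)   # weighted sum over segments


# Hypothetical sizes: 10 segments, 128-dim embeddings, 527 classes.
rng = np.random.default_rng(0)
T, D, K = 10, 128, 527
segments = rng.normal(size=(T, D))
w_cla = rng.normal(scale=0.1, size=(D, K))
w_att = rng.normal(scale=0.1, size=(D, K))

clip_probs = decision_level_attention(segments, w_cla, w_att)
print(clip_probs.shape)
```

Because the attention weights for each class are non-negative and sum to one over segments, the clip-level output is a convex combination of segment-level probabilities. With all-zero attention weights the sketch reduces to plain average pooling, which is one way to see the link to MIL-style pooling that the abstract mentions.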