Conference: International Conference on Communications, Signal Processing, and Systems

Audio Tagging With Connectionist Temporal Classification Model Using Sequentially Labelled Data

Abstract

Audio tagging aims to predict one or several labels in an audio clip. Many previous works use weakly labelled data (WLD) for audio tagging, where only the presence or absence of sound events is known and their order is not. To use the order information of sound events, we propose sequentially labelled data (SLD), where both the presence or absence and the order of sound events are known. To utilize SLD in audio tagging, we propose a convolutional recurrent neural network trained with a connectionist temporal classification objective function (CRNN-CTC) to map from an audio clip spectrogram to SLD. Experiments show that CRNN-CTC obtains an area under the curve (AUC) score of 0.986 in audio tagging, outperforming the baseline CRNN, which scores 0.908 with max pooling and 0.815 with average pooling. In addition, we show that CRNN-CTC can predict the order of sound events in an audio clip.
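To make the link between frame-level CTC outputs, the ordered event sequence, and the clip-level tags concrete, here is a minimal sketch of standard CTC best-path (greedy) decoding. The abstract does not say how the authors decode, so this is an illustration and not their procedure. The blank index, the function names, the four-class label set, and the per-frame scores are all hypothetical.

```python
from itertools import groupby

BLANK = 0  # assumed convention: index 0 is the CTC blank symbol


def ctc_greedy_decode(frame_scores):
    """Collapse per-frame class scores into an ordered event sequence."""
    # Pick the best-scoring class index for each frame.
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    # CTC collapse rule: merge consecutive repeats, then drop blanks.
    collapsed = [label for label, _ in groupby(best_path)]
    return [label for label in collapsed if label != BLANK]


def tags_from_sequence(sequence, num_classes):
    """Presence/absence vector over the non-blank classes 1..num_classes-1."""
    present = set(sequence)
    return [1 if c in present else 0 for c in range(1, num_classes)]


# Hypothetical per-frame scores over [blank, event 1, event 2, event 3].
frame_scores = [
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.70, 0.10],  # event 2
    [0.10, 0.10, 0.70, 0.10],  # event 2 (repeat, merged)
    [0.90, 0.03, 0.03, 0.04],  # blank
    [0.20, 0.60, 0.10, 0.10],  # event 1
    [0.70, 0.10, 0.10, 0.10],  # blank
    [0.10, 0.10, 0.70, 0.10],  # event 2 again, after a blank
]

sequence = ctc_greedy_decode(frame_scores)        # ordered events: [2, 1, 2]
tags = tags_from_sequence(sequence, num_classes=4)  # presence only: [1, 1, 0]
print(sequence, tags)
```

The decoded sequence keeps the order and the repeated occurrence of event 2, which is the information SLD adds. The tag vector discards both and keeps only presence or absence, which is all that WLD provides.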
