Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Weakly-supervised audio event detection using event-specific Gaussian filters and fully convolutional networks

Abstract

Audio event detection aims at identifying the sound events in an audio clip. In addition to labeling clips with their audio events, we want to find the temporal locations of these events. However, creating precisely annotated training data is time-consuming. We therefore propose a model based on convolutional neural networks that relies only on weakly supervised data for training. Such data can be obtained directly from online platforms such as Freesound, with clip-level labels assigned by the uploaders. We extend the model to a fully convolutional network and design an event-specific Gaussian filter layer to improve its learning ability. Moreover, the model can detect frame-level information, e.g., the temporal position of sounds, even when it is trained only with clip-level labels.
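The abstract does not give the layer's equations, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes that frame-level scores for each event are smoothed along time with a Gaussian kernel whose width is specific to that event, and that a clip-level score is then obtained by max-pooling over time so that only clip-level labels are needed for a loss. The function names, the fixed sigma values, the kernel radius, and the choice of max-pooling are all hypothetical; in a trained model the filter parameters would presumably be learned.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    t = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-0.5 * (t / sigma) ** 2)
    return k / k.sum()

def smooth_per_event(frame_scores, sigmas, radius=8):
    """Smooth each event's frame-level scores with its own Gaussian kernel.

    frame_scores: array of shape (n_events, n_frames)
    sigmas: one (hypothetical, fixed) kernel width per event
    Assumes n_frames >= 2 * radius + 1 so that mode="same" keeps n_frames.
    """
    out = np.empty_like(frame_scores, dtype=float)
    for e, sigma in enumerate(sigmas):
        kernel = gaussian_kernel(sigma, radius)
        out[e] = np.convolve(frame_scores[e], kernel, mode="same")
    return out

def clip_level(frame_scores):
    """Max-pool over time (an assumed pooling choice) to get clip-level scores."""
    return frame_scores.max(axis=1)

# Toy input: 2 events, 50 frames; event 0 is active in frames 20..29.
scores = np.zeros((2, 50))
scores[0, 20:30] = 1.0

smoothed = smooth_per_event(scores, sigmas=[2.0, 6.0])
clip = clip_level(smoothed)          # clip-level scores, one per event
peak_frame = int(smoothed[0].argmax())  # frame-level localization for event 0
```

In this toy setup the clip-level score supports training against clip labels, while the smoothed per-frame scores are what would be inspected to locate an event in time.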
