Recurrent neural networks for polyphonic sound event detection in real life recordings

IEEE International Conference on Acoustics, Speech and Signal Processing

Abstract

In this paper we present an approach to polyphonic sound event detection in real life recordings based on bi-directional long short term memory (BLSTM) recurrent neural networks (RNNs). A single multilabel BLSTM RNN is trained to map acoustic features of a mixture signal, consisting of sounds from multiple classes, to binary activity indicators of each event class. Our method is tested on a large database of real-life recordings, with 61 classes (e.g. music, car, speech) from 10 different everyday contexts. The proposed method outperforms previous approaches by a large margin, and the results are further improved using data augmentation techniques. Overall, our system reports an average F1-score of 65.5% on 1-second blocks and 64.7% on single frames, a relative improvement over the previous state-of-the-art approach of 6.8% and 15.1%, respectively.
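The abstract describes a single multilabel BLSTM RNN that maps frame-level acoustic features of a mixture signal to per-class binary activity indicators. The following is a minimal sketch of such a model in PyTorch; the feature dimensionality, hidden size, layer count, and training loop are illustrative assumptions and not the authors' original configuration.

```python
# Hypothetical sketch of a multilabel BLSTM tagger for polyphonic sound event
# detection. Feature type, layer sizes, and training details are assumptions
# chosen for illustration only.
import torch
import torch.nn as nn

NUM_FEATURES = 40   # e.g. mel-band energies per frame (assumed)
NUM_CLASSES = 61    # event classes in the database described in the abstract
HIDDEN_SIZE = 128   # illustrative choice

class BLSTMEventDetector(nn.Module):
    def __init__(self, num_features=NUM_FEATURES, num_classes=NUM_CLASSES,
                 hidden_size=HIDDEN_SIZE, num_layers=2):
        super().__init__()
        # Bi-directional LSTM over the frame sequence of the mixture signal.
        self.blstm = nn.LSTM(input_size=num_features, hidden_size=hidden_size,
                             num_layers=num_layers, batch_first=True,
                             bidirectional=True)
        # One sigmoid output per class and per frame: binary activity indicators.
        self.output = nn.Linear(2 * hidden_size, num_classes)

    def forward(self, features):
        # features: (batch, frames, num_features)
        hidden, _ = self.blstm(features)
        # returns (batch, frames, num_classes) of independent class probabilities
        return torch.sigmoid(self.output(hidden))

model = BLSTMEventDetector()
# Multilabel targets => binary cross-entropy, one term per class and frame.
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy batch: 8 clips, 100 frames each, with random multilabel targets.
x = torch.randn(8, 100, NUM_FEATURES)
y = torch.randint(0, 2, (8, 100, NUM_CLASSES)).float()

optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```

Because the classes are not mutually exclusive in polyphonic audio, each class gets its own sigmoid output and binary cross-entropy term rather than a softmax over classes.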
