IEEE International Conference on Acoustics, Speech and Signal Processing

BLSTM-HMM hybrid system combined with sound activity detection network for polyphonic Sound Event Detection


Abstract

This paper presents a new hybrid approach for polyphonic Sound Event Detection (SED) which combines a temporal structure modeling technique based on a hidden Markov model (HMM) with a frame-by-frame detection method based on a bidirectional long short-term memory (BLSTM) recurrent neural network (RNN). The proposed BLSTM-HMM hybrid system makes it possible to model sound event-dependent temporal structures and to perform sequence-by-sequence detection without resorting to the thresholding used in conventional frame-by-frame methods. Furthermore, to effectively reduce insertion errors of sound events, which often occur under noisy conditions, we additionally apply binary-mask post-processing using a sound activity detection (SAD) network that identifies segments containing any sound event activity. We conduct an experiment on the DCASE 2016 task 2 dataset to compare the proposed method with typical conventional methods, such as non-negative matrix factorization (NMF) and a standard BLSTM-RNN. The proposed method outperforms the conventional methods, achieving an F1-score of 74.9% (error rate of 44.7%) on the event-based evaluation and an F1-score of 80.5% (error rate of 33.8%) on the segment-based evaluation, results that for the most part also outperform the best reported result in the DCASE 2016 task 2 challenge.
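The binary-mask post-processing described in the abstract can be sketched as follows: frame-wise event activity estimates are zeroed wherever a separate SAD network reports no sound activity, suppressing insertion errors in silence-only or noise-only regions. All names, shapes, and the threshold value here are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def apply_sad_mask(event_activations: np.ndarray,
                   sad_posteriors: np.ndarray,
                   sad_threshold: float = 0.5) -> np.ndarray:
    """Mask per-event activations (frames x events) with a binary SAD mask.

    Hypothetical sketch: `sad_posteriors` (one value per frame) is binarized
    at `sad_threshold`, and the resulting 0/1 mask is broadcast over every
    event-class column to zero out detections in inactive frames.
    """
    sad_mask = (sad_posteriors >= sad_threshold).astype(event_activations.dtype)
    return event_activations * sad_mask[:, None]

# Toy example: 4 frames, 2 event classes; SAD marks frames 1 and 2 as active.
acts = np.array([[0.9, 0.1],
                 [0.8, 0.7],
                 [0.2, 0.6],
                 [0.4, 0.3]])
sad = np.array([0.1, 0.9, 0.8, 0.2])
masked = apply_sad_mask(acts, sad)
```

In the toy example, the strong (0.9) activation in frame 0 is suppressed because the SAD network judges that frame inactive, which is exactly the class of insertion error the post-processing targets.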
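The segment-based F1-score and error rate quoted in the abstract follow the DCASE evaluation conventions; a minimal sketch of how such scores are computed is shown below (this is an illustrative reimplementation, not the official DCASE evaluation toolkit, and `ref`/`est` are assumed binary segments-by-classes activity matrices).

```python
import numpy as np

def segment_based_scores(ref: np.ndarray, est: np.ndarray):
    """Compute segment-based F1-score and error rate.

    ref, est: binary (segments x classes) activity matrices, where
    ref[s, c] == 1 means class c is active in segment s of the reference.
    """
    tp = np.logical_and(ref == 1, est == 1).sum()
    fp = np.logical_and(ref == 0, est == 1).sum()
    fn = np.logical_and(ref == 1, est == 0).sum()
    f1 = 2 * tp / (2 * tp + fp + fn)
    # Per-segment miss/false-alarm counts yield substitutions, deletions,
    # and insertions for the error rate ER = (S + D + I) / N.
    fn_seg = np.logical_and(ref == 1, est == 0).sum(axis=1)
    fp_seg = np.logical_and(ref == 0, est == 1).sum(axis=1)
    subs = np.minimum(fn_seg, fp_seg).sum()
    dels = np.maximum(0, fn_seg - fp_seg).sum()
    ins = np.maximum(0, fp_seg - fn_seg).sum()
    er = (subs + dels + ins) / ref.sum()
    return f1, er

# Toy example: 3 segments, 2 classes.
ref = np.array([[1, 0], [1, 1], [0, 1]])
est = np.array([[1, 1], [1, 0], [0, 1]])
f1, er = segment_based_scores(ref, est)
```

Note that F1 and ER can disagree: a system with a high F1-score may still incur a nontrivial error rate because insertions and deletions are counted per segment, which is why the abstract reports both.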
