IEEE International Conference on Acoustics, Speech and Signal Processing

BLSTM-HMM hybrid system combined with sound activity detection network for polyphonic Sound Event Detection


Abstract

This paper presents a new hybrid approach for polyphonic Sound Event Detection (SED) which combines a temporal structure modeling technique based on a hidden Markov model (HMM) with a frame-by-frame detection method based on a bidirectional long short-term memory (BLSTM) recurrent neural network (RNN). The proposed BLSTM-HMM hybrid system makes it possible to model sound event-dependent temporal structures and to perform sequence-by-sequence detection without resorting to the thresholding used in conventional frame-by-frame methods. Furthermore, to effectively reduce insertion errors of sound events, which often occur under noisy conditions, we additionally apply binary-mask post-processing using a sound activity detection (SAD) network that identifies segments containing any sound event activity. We conduct an experiment on the DCASE 2016 task 2 dataset to compare the proposed method with typical conventional methods, such as non-negative matrix factorization (NMF) and a standard BLSTM-RNN. The proposed method outperforms the conventional methods, achieving an F1-score of 74.9% (error rate of 44.7%) on the event-based evaluation and an F1-score of 80.5% (error rate of 33.8%) on the segment-based evaluation, results that for the most part also outperform the best reported result in the DCASE 2016 task 2 challenge.
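The binary-mask post-processing described in the abstract can be sketched as follows: frame-wise event activity estimates are zeroed wherever a separate SAD network reports no sound activity, suppressing insertion errors in silence-only or noise-only regions. All names, shapes, and the threshold value here are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def apply_sad_mask(event_activations: np.ndarray,
                   sad_posteriors: np.ndarray,
                   sad_threshold: float = 0.5) -> np.ndarray:
    """Mask per-event activations (frames x events) with a binary SAD mask.

    Hypothetical sketch: `sad_posteriors` (one value per frame) is binarized
    at `sad_threshold`, and the resulting 0/1 mask is broadcast over every
    event-class column to zero out detections in inactive frames.
    """
    sad_mask = (sad_posteriors >= sad_threshold).astype(event_activations.dtype)
    return event_activations * sad_mask[:, None]

# Toy example: 4 frames, 2 event classes; SAD marks frames 1 and 2 as active.
acts = np.array([[0.9, 0.1],
                 [0.8, 0.7],
                 [0.2, 0.6],
                 [0.4, 0.3]])
sad = np.array([0.1, 0.9, 0.8, 0.2])
masked = apply_sad_mask(acts, sad)
```

In the toy example, the strong (0.9) activation in frame 0 is suppressed because the SAD network judges that frame inactive, which is exactly the class of insertion error the post-processing targets.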
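The segment-based F1-score and error rate quoted in the abstract follow the DCASE evaluation conventions; a minimal sketch of how such scores are computed is shown below (this is an illustrative reimplementation, not the official DCASE evaluation toolkit, and `ref`/`est` are assumed binary segments-by-classes activity matrices).

```python
import numpy as np

def segment_based_scores(ref: np.ndarray, est: np.ndarray):
    """Compute segment-based F1-score and error rate.

    ref, est: binary (segments x classes) activity matrices, where
    ref[s, c] == 1 means class c is active in segment s of the reference.
    """
    tp = np.logical_and(ref == 1, est == 1).sum()
    fp = np.logical_and(ref == 0, est == 1).sum()
    fn = np.logical_and(ref == 1, est == 0).sum()
    f1 = 2 * tp / (2 * tp + fp + fn)
    # Per-segment miss/false-alarm counts yield substitutions, deletions,
    # and insertions for the error rate ER = (S + D + I) / N.
    fn_seg = np.logical_and(ref == 1, est == 0).sum(axis=1)
    fp_seg = np.logical_and(ref == 0, est == 1).sum(axis=1)
    subs = np.minimum(fn_seg, fp_seg).sum()
    dels = np.maximum(0, fn_seg - fp_seg).sum()
    ins = np.maximum(0, fp_seg - fn_seg).sum()
    er = (subs + dels + ins) / ref.sum()
    return f1, er

# Toy example: 3 segments, 2 classes.
ref = np.array([[1, 0], [1, 1], [0, 1]])
est = np.array([[1, 1], [1, 0], [0, 1]])
f1, er = segment_based_scores(ref, est)
```

Note that F1 and ER can disagree: a system with a high F1-score may still incur a nontrivial error rate because insertions and deletions are counted per segment, which is why the abstract reports both.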
