Conference: European Signal Processing Conference

A low-latency, real-time-capable singing voice detection method with LSTM recurrent neural networks


Abstract

Singing voice detection aims at identifying the regions in a music recording where at least one person sings. This is a challenging problem that cannot be solved without analysing the temporal evolution of the signal. Current state-of-the-art methods combine timbral with temporal characteristics, by summarising various feature values over time, e.g. by computing their variance. This leads to more contextual information, but also to increased latency, which is problematic if our goal is on-line, real-time singing voice detection. To overcome this problem and reduce the necessity to include context in the features themselves, we introduce a method that uses Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN). In experiments on several data sets, the resulting singing voice detector outperforms the state-of-the-art baselines in terms of accuracy, while at the same time drastically reducing latency and increasing the time resolution of the detector.
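The low-latency idea in the abstract can be illustrated with a minimal sketch, assuming a single unidirectional LSTM layer followed by one sigmoid output unit. This is not the authors' architecture: the class name `StreamingLSTMDetector`, the feature dimension, the hidden size, and the random weights are all hypothetical placeholders. The point is only that the recurrent state carries the temporal context, so each frame can be classified as soon as it arrives, without summarising features over a window of future frames.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class StreamingLSTMDetector:
    """Unidirectional LSTM layer plus a sigmoid output unit, run one frame at a time."""

    def __init__(self, n_features, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # Random placeholder weights (assumption); a real detector would learn
        # these from audio labelled as singing / non-singing.
        self.W = rng.normal(0.0, 0.1, size=(4 * n_hidden, n_features + n_hidden))
        self.b = np.zeros(4 * n_hidden)
        self.w_out = rng.normal(0.0, 0.1, size=n_hidden)
        self.b_out = 0.0
        self.n_hidden = n_hidden
        self.reset()

    def reset(self):
        # Hidden state h and cell state c hold all context seen so far.
        self.h = np.zeros(self.n_hidden)
        self.c = np.zeros(self.n_hidden)

    def step(self, frame):
        """Consume one feature frame and return P(singing) for that frame."""
        z = self.W @ np.concatenate([frame, self.h]) + self.b
        i, f, g, o = np.split(z, 4)  # input, forget, candidate, output pre-activations
        self.c = sigmoid(f) * self.c + sigmoid(i) * np.tanh(g)
        self.h = sigmoid(o) * np.tanh(self.c)
        return float(sigmoid(self.w_out @ self.h + self.b_out))


detector = StreamingLSTMDetector(n_features=20, n_hidden=8)
# Stand-in for per-frame audio features; real input would come from a feature extractor.
frames = np.random.default_rng(1).normal(size=(5, 20))
probs = [detector.step(x) for x in frames]
```

Because `step` only reads the current frame and the stored state, the decision for a frame never depends on later frames, which is what keeps the latency at roughly one frame in this sketch.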
