Journal: Journal of Supercomputing

Speech and music pitch trajectory classification using recurrent neural networks for monaural speech segregation

Abstract

In this paper, we propose speech/music pitch classification based on recurrent neural networks (RNNs) for monaural speech segregation from music interference. The speech segregation methods in this paper use sub-band masking to construct segregation masks modulated by the estimated speech pitch. For speech mixed with music, however, speech pitch estimation becomes unreliable because speech and music have similar harmonic structures. To remove the music interference effectively, we propose an RNN-based speech/music pitch classifier. The proposed method models the temporal trajectories of speech and music pitch values and classifies an unknown continuous pitch sequence as belonging to either speech or music. Among the various types of RNNs, we chose the simple recurrent network, long short-term memory (LSTM), and bidirectional LSTM for pitch classification. The experimental results show that the proposed method significantly outperforms the baseline methods on speech-music mixtures without loss of segregation performance on speech-noise mixtures.
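
To make the idea of classifying a pitch trajectory concrete, the following is a minimal sketch of the forward pass of a simple recurrent (Elman-style) network that reads a continuous pitch sequence frame by frame and outputs a speech-versus-music score. It is not the authors' implementation: the function names (`init_params`, `normalize_pitch`, `speech_probability`, `classify`), the hidden size, the log2 pitch normalization, and the 0.5 threshold are all assumptions made for illustration, and the weights are random rather than trained, so the predicted label carries no meaning until the network is fitted to labeled pitch trajectories.

```python
import math
import random


def init_params(hidden_size, seed=0):
    """Create random (untrained) weights for a one-input simple recurrent network."""
    rng = random.Random(seed)

    def vec(n):
        return [rng.uniform(-0.5, 0.5) for _ in range(n)]

    return {
        "w_in": vec(hidden_size),                              # pitch input -> hidden
        "w_rec": [vec(hidden_size) for _ in range(hidden_size)],  # hidden -> hidden
        "b_h": [0.0] * hidden_size,                            # hidden bias
        "w_out": vec(hidden_size),                             # hidden -> output logit
        "b_out": 0.0,                                          # output bias
    }


def normalize_pitch(pitch_hz, ref_hz=100.0):
    """Map pitch in Hz to octaves relative to ref_hz (an assumed normalization)."""
    return [math.log2(p / ref_hz) for p in pitch_hz]


def speech_probability(pitch_hz, params):
    """Run the recurrent network over the whole trajectory and score the final state."""
    size = len(params["b_h"])
    h = [0.0] * size
    for x in normalize_pitch(pitch_hz):
        # Each new hidden state depends on the current pitch value and the
        # previous hidden state, so the network sees the temporal trajectory.
        h = [
            math.tanh(
                params["w_in"][i] * x
                + sum(params["w_rec"][i][j] * h[j] for j in range(size))
                + params["b_h"][i]
            )
            for i in range(size)
        ]
    logit = sum(w * hi for w, hi in zip(params["w_out"], h)) + params["b_out"]
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid: score for the "speech" class


def classify(pitch_hz, params, threshold=0.5):
    """Label a continuous pitch sequence as 'speech' or 'music'."""
    return "speech" if speech_probability(pitch_hz, params) >= threshold else "music"


if __name__ == "__main__":
    params = init_params(hidden_size=8, seed=1)
    gliding = [120.0 + 2.0 * t for t in range(30)]   # smoothly varying contour
    stepped = [220.0] * 15 + [246.9] * 15            # held notes with a step
    for name, seq in (("gliding", gliding), ("stepped", stepped)):
        print(name, round(speech_probability(seq, params), 3), classify(seq, params))
```

An LSTM or bidirectional LSTM, the other two variants named in the abstract, would replace the `tanh` update with gated cells (and, in the bidirectional case, a second pass over the reversed sequence), while the sequence-in, label-out interface sketched here would stay the same.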