Conference: International Conference on Latent Variable Analysis and Signal Separation

Improving Reverberant Speech Separation with Binaural Cues Using Temporal Context and Convolutional Neural Networks

Abstract

Given binaural features such as the interaural level difference and interaural phase difference as input, Deep Neural Networks (DNNs) have recently been used to localize sound sources in a mixture of speech signals and/or noise, and to create time-frequency masks for estimating the sound sources in reverberant rooms. Here, we explore a more advanced system in which the feed-forward DNNs are replaced by Convolutional Neural Networks (CNNs). In addition, the frames adjacent to each time frame (those occurring before and after it) are used to exploit contextual information, thus improving the localization and separation of each source. The quality of the separation results is evaluated in terms of the Signal-to-Distortion Ratio (SDR).
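
As a rough illustration of the inputs described in the abstract, the sketch below computes the interaural level difference (ILD) and interaural phase difference (IPD) per time-frequency unit and then stacks each frame with its neighbouring frames to form a context window. It is not the authors' implementation: the function names (`stft`, `binaural_features`, `add_context`), the FFT size, the hop size, and the number of context frames are all assumptions chosen for the example.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Hann-windowed STFT; returns an array of shape (frames, n_fft // 2 + 1).
    n_fft and hop are illustrative values, not taken from the paper."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def binaural_features(left, right, eps=1e-8):
    """Return ILD (in dB) and IPD (in radians) for each time-frequency unit."""
    L, R = stft(left), stft(right)
    # Level difference between the two ears; eps avoids division by zero.
    ild = 20.0 * np.log10((np.abs(L) + eps) / (np.abs(R) + eps))
    # Phase difference, wrapped to (-pi, pi].
    ipd = np.angle(L * np.conj(R))
    return ild, ipd

def add_context(features, n_context=2):
    """Stack each frame with n_context frames before and after it.
    Edge frames are repeated at the boundaries (an assumption).
    Output shape: (frames, 2 * n_context + 1, frequency bins)."""
    padded = np.pad(features, ((n_context, n_context), (0, 0)), mode="edge")
    n_frames = features.shape[0]
    return np.stack(
        [padded[t : t + 2 * n_context + 1] for t in range(n_frames)]
    )
```

Each entry of the array returned by `add_context` is a small time-frequency patch centred on one frame, which is one plausible way to present temporal context to a convolutional network.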
