Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Monaural Speech Enhancement Using Intra-Spectral Recurrent Layers in the Magnitude and Phase Responses

Abstract

Speech enhancement has greatly benefited from deep learning. Currently, the best-performing deep architectures use long short-term memory (LSTM) recurrent neural networks (RNNs) to model short- and long-term temporal dependencies. These approaches, however, underutilize or ignore spectral-level dependencies within the magnitude and phase responses, respectively. In this paper, we propose a deep learning architecture that leverages both temporal and spectral dependencies within the magnitude and phase responses. More specifically, we first train an LSTM network to predict both the spectral-magnitude response and the group delay; this model captures temporal correlations. We then introduce Markovian recurrent connections in the output layers to capture spectral dependencies within the magnitude and phase responses. We compare our approach with traditional enhancement approaches and with approaches that consider spectral dependencies within a single time frame. The results show that considering the within-frame spectral dependencies leads to improvements.
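The two quantities the abstract mentions can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the network equations, so the function names, the tanh nonlinearity, and the single scalar recurrent weight are all assumptions made for illustration. The first function computes group delay as the negative derivative of unwrapped phase with respect to frequency (the standard definition, approximated here by a finite difference). The second shows one possible first-order (Markovian) recurrence across frequency bins within a single frame, where each bin's output depends on the output of the bin below it.

```python
import numpy as np


def group_delay(frame):
    """Group delay (in samples) between adjacent frequency bins of one frame.

    Uses the standard definition: negative derivative of the unwrapped phase
    with respect to angular frequency, approximated by a finite difference.
    """
    spectrum = np.fft.rfft(frame)
    phase = np.unwrap(np.angle(spectrum))
    n = len(frame)
    # Bin spacing is 2*pi/n radians per sample, so rescale the phase difference.
    return -np.diff(phase) * n / (2.0 * np.pi)


def intra_spectral_recurrence(pre_activations, weight):
    """Hypothetical first-order recurrence over frequency bins in one frame.

    Assumed form (not taken from the paper):
        out[k] = tanh(pre_activations[k] + weight * out[k - 1])
    with out[-1] taken as 0.
    """
    out = np.zeros_like(pre_activations, dtype=float)
    prev = 0.0
    for k, a in enumerate(pre_activations):
        prev = np.tanh(a + weight * prev)
        out[k] = prev
    return out


# Usage: an impulse delayed by 3 samples has a constant group delay of 3.
frame = np.zeros(16)
frame[3] = 1.0
delays = group_delay(frame)

# Usage: with a nonzero weight, each bin's output is pulled by its neighbor.
coupled = intra_spectral_recurrence(np.array([0.5, 0.5, 0.5]), weight=1.0)
```

In a full model, `pre_activations` would be the per-bin values produced by the LSTM's output projection for one time frame, and the recurrent weight would be learned; both are left as plain inputs here.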
