Monaural Speech Enhancement Using Intra-Spectral Recurrent Layers in the Magnitude and Phase Responses

Abstract

Speech enhancement has greatly benefited from deep learning. Currently, the best performing deep architectures use long short-term memory (LSTM) recurrent neural networks (RNNs) to model short and long temporal dependencies. These approaches, however, underutilize or ignore spectral-level dependencies within the magnitude and phase responses, respectively. In this paper, we propose a deep learning architecture that leverages both temporal and spectral dependencies within the magnitude and phase responses. More specifically, we first train an LSTM network to predict both the spectral-magnitude response and group delay, where this model captures temporal correlations. We then introduce Markovian recurrent connections in the output layers to capture spectral dependencies within the magnitude and phase responses. We compare our approach with traditional enhancement approaches and approaches that consider spectral dependencies within a single time frame. The results show that considering the within-frame spectral dependencies leads to improvements.
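
The two ideas named in the abstract, group delay as a phase target and a Markovian recurrence across frequency bins in the output layer, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the equations, so the sigmoid activation, the per-bin scalar recurrent weights, and the function names `group_delay` and `markov_output_layer` are all assumptions made for illustration. Group delay is taken in its standard form, the negative phase difference between adjacent frequency bins.

```python
import math


def group_delay(phase):
    """Discrete group delay for one frame: negative phase difference
    between adjacent frequency bins.

    `phase` is the frame's phase spectrum in radians. Each difference is
    wrapped to [-pi, pi) so that 2*pi jumps do not show up as large delays.
    """
    delays = []
    for k in range(len(phase) - 1):
        diff = phase[k + 1] - phase[k]
        wrapped = (diff + math.pi) % (2.0 * math.pi) - math.pi
        delays.append(-wrapped)
    return delays


def markov_output_layer(pre_activations, recurrent_weights):
    """First-order (Markovian) recurrence across frequency bins in one frame.

    y[0] = sigmoid(a[0])
    y[k] = sigmoid(a[k] + u[k] * y[k - 1])   for k >= 1

    `pre_activations` (a) stands in for the affine projection of the LSTM
    output at one time step. `recurrent_weights` (u) are hypothetical
    per-bin scalars; the sigmoid is an assumed activation.
    """
    outputs = []
    previous = 0.0  # no lower neighbour for the first bin
    for a_k, u_k in zip(pre_activations, recurrent_weights):
        y_k = 1.0 / (1.0 + math.exp(-(a_k + u_k * previous)))
        outputs.append(y_k)
        previous = y_k
    return outputs
```

In this sketch each output bin depends only on the bin directly below it within the same frame, which is what makes the recurrence first-order; setting every recurrent weight to zero reduces it to an ordinary feed-forward output layer.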

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2020 | pp. 6224-6228 | 5 pages
Authors
Khandokar Md. Nayem; Donald S. Williamson
Original format: PDF
Keywords
speech enhancement; intra-spectral correlations; recurrent neural networks; long short-term memory
