Combining adaptive sparse NMF feature extraction and soft mask to optimize DNN for speech enhancement

Jia Hairong; Wang Weimei; Mei Shulin


Abstract

In masking-based deep neural network (DNN) speech enhancement, the time-frequency mask cannot be estimated accurately because the underlying structural information of speech is ignored. This paper proposes a speech enhancement method that combines adaptive sparse non-negative matrix factorization (NMF) feature extraction with a soft mask to optimize the DNN, exploiting the ability of sparse matrices to capture the salient structure of speech and pairing it with an optimized masking-based prediction. First, to account for whether speech or noise dominates in different noisy speech signals, a new method for estimating the soft mask is proposed, in which the initial soft mask value is estimated from the speech cochleagram and the noise cochleagram. Then, the speech cochleagram and the noise cochleagram are learned separately by sparse NMF (SNMF) to obtain a joint dictionary. The noisy speech is sparsely represented on the joint dictionary, and an adaptive adjustment factor related to the changes in the speech and noise dictionaries is added to obtain the sparse coefficient. The sparse coefficient is used as the input of the DNN model, and the initial soft mask value is used as the learning label to estimate the final soft mask value. Finally, the estimated soft mask value is combined with the noisy speech cochleagram to obtain the enhanced speech. Compared with other methods, the proposed method increases the average signal-to-noise ratio (SNR) by 1.6039 dB, the average perceptual evaluation of speech quality (PESQ) score by 0.1994, and the average short-time objective intelligibility (STOI) by 0.0271, which illustrates the superiority of the proposed algorithm. (C) 2020 Elsevier Ltd. All rights reserved.
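The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the soft mask formula, the adaptive adjustment factor, or the DNN architecture, so the sketch substitutes a plain ratio mask for the soft mask, uses a common multiplicative-update heuristic for L1-penalized NMF, omits the adjustment factor and the DNN entirely, and replaces real cochleagrams with random non-negative matrices. All function names and parameter values (`ratio_soft_mask`, `sparse_nmf`, `sparse_code`, `sparsity=0.1`, the ranks) are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def ratio_soft_mask(speech_tf, noise_tf):
    """Stand-in soft mask (an assumption): speech share of each T-F unit, in [0, 1]."""
    return speech_tf / (speech_tf + noise_tf + EPS)


def sparse_nmf(V, rank, sparsity=0.1, n_iter=200, seed=0):
    """Approximate V ~ W @ H with an L1 penalty on H (multiplicative updates)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
        # Keep dictionary atoms at unit norm so the penalty acts on H only.
        W /= np.linalg.norm(W, axis=0, keepdims=True) + EPS
    return W, H


def sparse_code(V, W, sparsity=0.1, n_iter=200, seed=0):
    """Sparse activations of V on a fixed dictionary W (W is not updated)."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + EPS)
    return H


# Synthetic stand-ins for cochleagrams: 16 channels x 40 frames.
rng = np.random.default_rng(1)
speech = rng.random((16, 40))
noise = 0.5 * rng.random((16, 40))
noisy = speech + noise  # assumes speech and noise add in this domain

# Learn the two dictionaries separately, then concatenate them.
W_speech, _ = sparse_nmf(speech, rank=8)
W_noise, _ = sparse_nmf(noise, rank=4)
W_joint = np.hstack([W_speech, W_noise])

features = sparse_code(noisy, W_joint)   # would be the DNN input
labels = ratio_soft_mask(speech, noise)  # would be the DNN training target

# With the oracle mask in place of a DNN estimate, masking recovers the speech.
enhanced = labels * noisy
```

In a full system, a trained network would map `features` to an estimated mask, and that estimate would take the place of `labels` in the last line.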


Bibliographic details

Source
Applied Acoustics | 2021, Issue 1 | 107666.1-107666.7 | 7 pages
Authors
Jia Hairong; Wang Weimei; Mei Shulin
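Author affiliations

Taiyuan University of Technology, College of Information and Computer, Jinzhong 030600, People's Republic of China

Taiyuan University of Technology, College of Information and Computer, Jinzhong 030600, People's Republic of China

Taiyuan University of Technology, College of Information and Computer, Jinzhong 030600, People's Republic of China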

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords
Deep Neural Network (DNN); Sparse Non-negative Matrix Factorization (SNMF); Feature extraction; Soft mask; Cochleagram; Sparse coefficient;


Similar literature

1. EEG signal denoising algorithm based on adaptive sparse NMF feature extraction and soft mask optimization DNN [J]. Basic & clinical pharmacology & toxicology. 2019, Issue S10
2. EEG signal denoising algorithm based on adaptive sparse NMF feature extraction and soft mask optimization DNN [J]. Jia Hairong, Wang Weimei, Mei Shulin. Basic & clinical pharmacology & toxicology. 2019, Issue S1
3. DNN-based speech enhancement using soft audible noise masking for wind noise reduction [J]. Haichuan Bai, Fengpei Ge, Yonghong Yan. Communications, China. 2018, Issue 9
4. Feature enhancement using sparse reference and estimated soft-mask exemplar-pairs for noisy speech recognition [C]. Tan Lee Ngee, Alwan Abeer. IEEE International Conference on Acoustics, Speech and Signal Processing. 2014
5. The extraction of features from a speech signal corrupted by additive noise and their use for speech enhancement [D]. Walicki, Jacek Stanislaw. 1981
6. Resonance-based sparse adaptive variational mode decomposition and its application to the feature extraction of planetary gearboxes [O]. Jing Zhu, Aidong Deng, Jing Li. 2020
7. Feature enhancement using sparse reference and estimated soft-mask exemplar-pairs for noisy speech recognition [O]. Lee Ngee, Tan Abeer Alwan. 2015
