Journal: International Journal of Speech Technology

Low rank sparse decomposition model based speech enhancement using gammatone filterbank and Kullback-Leibler divergence



Abstract

In speech enhancement systems, the key stage is noise estimation, which generally requires prior speech or noise models. Such prior models, however, are sometimes difficult to obtain. This paper presents a speech enhancement algorithm that requires no prior knowledge of speech or noise. It is based on a low-rank and sparse matrix decomposition model that uses a gammatone filterbank and the Kullback–Leibler divergence, and it estimates noise and speech by decomposing the magnitude spectra of the noisy input speech into a low-rank noise part and a sparse speech part, respectively. Noise is modeled as the low-rank component because noise spectra in different time frames are usually highly correlated with each other, while speech is modeled as the sparse component because it is relatively sparse in the time–frequency domain. Based on these assumptions, we have developed an alternative speech enhancement algorithm that separates the speech and noise magnitude spectra by imposing rank and sparsity constraints; the enhanced time-domain speech can then be reconstructed from the sparse matrix. The proposed technique differs significantly from existing speech enhancement techniques in that it enhances noisy speech in an uncomplicated manner, without needing a noise estimation algorithm to find noise-only excerpts. Moreover, it achieves improved performance in low-SNR conditions and does not need to know the exact distribution of the noise signals. Experimental results show that the proposed technique can outperform conventional techniques in many types of strong noise conditions, yielding less residual noise, lower speech distortion and better overall speech quality. A notable improvement over the baseline algorithms is observed in terms of PESQ, SNRSeg, SIG and BAK.
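The low-rank plus sparse split described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it omits the gammatone filterbank and swaps the Kullback–Leibler divergence for the standard principal component pursuit formulation (nuclear norm plus L1 norm, solved by ADMM). The function names, the default values of `lam` and `mu`, and the synthetic input are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage: proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Singular-value shrinkage: proximal operator of the nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * np.maximum(s - tau, 0.0)) @ vt


def low_rank_sparse_split(y, lam=None, mu=None, n_iter=200, tol=1e-6):
    """Split y into low_rank + sparse by principal component pursuit (ADMM).

    Assumption: y is a magnitude spectrogram (frequency x frames); the
    low-rank part stands in for noise and the sparse part for speech.
    """
    y = np.asarray(y, dtype=float)
    if lam is None:
        # Common default weight for the sparse term (an assumption here).
        lam = 1.0 / np.sqrt(max(y.shape))
    if mu is None:
        # Common heuristic for the ADMM penalty parameter.
        mu = y.size / (4.0 * np.abs(y).sum() + 1e-12)
    low_rank = np.zeros_like(y)
    sparse = np.zeros_like(y)
    dual = np.zeros_like(y)
    norm_y = np.linalg.norm(y) + 1e-12
    for _ in range(n_iter):
        low_rank = svd_threshold(y - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(y - low_rank + dual / mu, lam / mu)
        residual = y - low_rank - sparse
        dual = dual + mu * residual
        if np.linalg.norm(residual) / norm_y < tol:
            break
    return low_rank, sparse


if __name__ == "__main__":
    # Synthetic stand-in for a noisy magnitude spectrogram:
    # a rank-1 "noise" floor plus a few isolated "speech" peaks.
    rng = np.random.default_rng(0)
    noise = np.outer(rng.random(30) + 0.5, rng.random(40) + 0.5)
    speech = np.zeros((30, 40))
    speech[rng.random((30, 40)) < 0.05] = 5.0
    noisy = noise + speech
    noise_hat, speech_hat = low_rank_sparse_split(noisy)
    rel = np.linalg.norm(noisy - noise_hat - speech_hat) / np.linalg.norm(noisy)
    print("relative reconstruction residual:", rel)
```

In a full pipeline along the lines the abstract describes, the sparse estimate would be combined with the phase of the noisy input and inverted back to the time domain; that step is left out here because the abstract does not specify it.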
