Journal: Computer Speech and Language
Speech enhancement using sparse dictionary learning in wavelet packet transform domain

Abstract

Sparse coding, a successful representation method for many signals, has recently been employed in speech enhancement. This paper presents a new learning-based speech enhancement algorithm via sparse representation in the wavelet packet transform domain. We propose sparse dictionary learning procedures, based on a coherence criterion, for training data of speech and noise signals in each subband of the decomposition level. These learning algorithms minimize the self-coherence between atoms of each dictionary and the mutual coherence between speech and noise dictionary atoms, along with the approximation error. The speech enhancement algorithm is introduced in two scenarios, supervised and semi-supervised. In each scenario, a voice activity detector scheme is employed based on the energy of the sparse coefficient matrices obtained when the observation data is coded over the corresponding dictionaries. In the proposed supervised scenario, we use domain adaptation techniques to transform a learned noise dictionary into one adapted to the noise conditions captured in the test environment. With this step, observation data is sparsely coded according to the current state of the noisy space, with low sparse approximation error. This technique plays a prominent role in obtaining better enhancement results, particularly when the noise is non-stationary. In the proposed semi-supervised scenario, adaptive thresholding of wavelet coefficients is carried out based on the variance of the estimated noise in each frame of the different subbands. The proposed approaches lead to significantly better speech enhancement results than earlier methods in this context and traditional procedures, according to different objective and subjective measures as well as a statistical test.
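
The two coherence quantities named in the abstract can be illustrated with a minimal sketch. It assumes the common textbook definition of coherence (the largest absolute inner product between unit-norm atoms), which may differ from the exact criterion used in the paper. The function names, the list-of-atoms representation, and the toy dictionaries are hypothetical and chosen only for illustration.

```python
import math


def normalize(atom):
    """Scale an atom (a list of floats) to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in atom))
    if norm == 0.0:
        raise ValueError("a zero atom cannot be normalized")
    return [x / norm for x in atom]


def inner(a, b):
    """Inner product of two equal-length atoms."""
    return sum(x * y for x, y in zip(a, b))


def self_coherence(dictionary):
    """Largest |<d_i, d_j>| over distinct unit-norm atoms of one dictionary."""
    atoms = [normalize(a) for a in dictionary]
    best = 0.0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            best = max(best, abs(inner(atoms[i], atoms[j])))
    return best


def mutual_coherence(dict_a, dict_b):
    """Largest |<a, b>| over unit-norm atoms a of dict_a and b of dict_b."""
    atoms_a = [normalize(a) for a in dict_a]
    atoms_b = [normalize(b) for b in dict_b]
    return max(abs(inner(a, b)) for a in atoms_a for b in atoms_b)


# Toy dictionaries (assumed data): each inner list is one atom.
speech_dict = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
noise_dict = [[0.0, 0.0, 2.0], [1.0, 1.0, 0.0]]

speech_self = self_coherence(speech_dict)
cross = mutual_coherence(speech_dict, noise_dict)
```

Under this definition, a low self-coherence means the atoms within one dictionary are close to orthogonal, and a low mutual coherence means speech atoms are poorly suited to representing noise and vice versa, which is the separation property the abstract's learning procedures aim for.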