...
Journal: Electronics and Electrical Engineering

A Novel Magnitude-Squared Spectrum Cost Function for Speech Enhancement


Abstract

The problem of improving the quality and intelligibility of speech in noisy environments has attracted a great deal of interest for a long time. Noise is inevitable in real-world applications of speech processing. In particular, speech coders and speech recognition systems may be rendered useless in the presence of background noise. Numerous techniques have been developed, and conventional speech enhancement algorithms fall into four broad classes: spectral subtraction [1], subspace [2], statistical-model-based [3], and Wiener-filter-based algorithms [4]. The well-known Ephraim-Malah algorithm, which is based on a statistical model, is an MMSE estimator [3] of the speech DFT amplitude. In this study, we also take the Bayes risk as the basis, since it is the most fundamental statistical-model approach and many algorithms are closely connected to it [5]. Minimizing the Bayes risk for a given cost function yields a variety of estimators. In fact, the maximum a posteriori (MAP) [6], minimum mean square error (MMSE), and maximum likelihood (ML) [7] estimators can all be derived from different Bayes risk cost functions. Bayesian estimators that use perceptually motivated cost functions in place of the traditional cost function are likewise closely tied to the Bayes risk [8-10]. In summary, different Bayesian estimators can be derived depending on the choice of cost function. In recent years, Yang Lu and Loizou et al. [11] proposed a speech enhancement algorithm which assumes that the magnitude-squared spectrum of the noisy speech signal can be computed as the sum of the clean-signal and noise magnitude-squared spectra, and from this they derived an MMSE-MSS estimator that uses the squared-error cost function. Motivated by this assumption, we derive in this paper a novel speech enhancement estimator using a different distortion measure.
Results show that the proposed estimator yields lower residual noise and lower speech distortion than the conventional MMSE-MSS estimator, and therefore better speech quality. This paper is organized as follows. Section 2 gives background on the Bayes risk. Section 3 presents the proposed algorithm. Section 4 presents experimental results comparing the proposed algorithm with other algorithms. Finally, the last section summarizes the work.
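
The abstract's point that the choice of cost function determines the Bayesian estimator can be illustrated with a small numerical sketch. This is not the estimator proposed in the paper, whose cost function the abstract does not specify. It is a toy example under stated assumptions: the posterior is an illustrative Rayleigh-shaped density on a grid, and the function names `bayes_estimates` and `posterior_expected_cost` are hypothetical. Minimizing the posterior expected cost for every observation minimizes the Bayes risk, so the sketch works with the posterior directly.

```python
import numpy as np

def bayes_estimates(grid, posterior):
    """Bayes estimates on a 1-D grid for three classical cost functions.

    grid      : equally spaced candidate values of the unknown quantity
    posterior : (possibly unnormalized) posterior density evaluated on grid
    """
    p = posterior / posterior.sum()        # normalize to a probability mass
    cdf = np.cumsum(p)
    return {
        # squared-error cost -> posterior mean (the MMSE estimate)
        "mmse": float(np.sum(grid * p)),
        # absolute-error cost -> posterior median
        "median": float(grid[np.searchsorted(cdf, 0.5)]),
        # hit-or-miss (uniform) cost -> posterior mode (the MAP estimate)
        "map": float(grid[np.argmax(p)]),
    }

def posterior_expected_cost(grid, posterior, estimate, cost):
    """Expected value of cost(x, estimate) under the gridded posterior."""
    p = posterior / posterior.sum()
    return float(np.sum(cost(grid, estimate) * p))

if __name__ == "__main__":
    # Assumption: a skewed, Rayleigh-shaped toy posterior (scale 1),
    # chosen only so that mean, median, and mode differ visibly.
    grid = np.linspace(0.0, 10.0, 10001)
    posterior = grid * np.exp(-grid**2 / 2.0)

    est = bayes_estimates(grid, posterior)
    squared = lambda x, x_hat: (x - x_hat) ** 2
    for name, value in est.items():
        risk = posterior_expected_cost(grid, posterior, value, squared)
        print(f"{name:>6}: estimate={value:.4f}  squared-error cost={risk:.4f}")
```

For a skewed posterior the three estimates differ, and each one is the minimizer only for its own cost function. This is why replacing the squared-error cost in an MMSE-type estimator with another distortion measure leads to a different estimator.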
