A novel expectation-maximization framework for speech enhancement in non-stationary noise environments

Abstract

Voiced speech is quasi-periodic, which allows it to be represented compactly in the cepstral domain, a feature that distinguishes it from noise. The recently proposed temporal cepstrum smoothing (TCS) algorithm has been shown to be effective for speech enhancement in non-stationary noise environments. However, the lack of an automatic parameter-updating mechanism limits its adaptability to noisy speech whose SNR changes abruptly across time frames or frequency components. This paper proposes an improved speech enhancement algorithm based on a novel expectation-maximization (EM) framework. The new algorithm starts with the traditional TCS method, which gives an initial guess of the periodogram of the clean speech. This guess is passed to an L1-norm regularizer in the M-step of the EM framework to estimate the true power spectrum of the original speech. That estimate in turn yields the a priori SNR, which is used in the E-step, in effect a log-MMSE gain function, to refine the estimate of the clean speech periodogram. The M-step and E-step alternate until convergence. A notable improvement of the proposed algorithm over the traditional TCS method is its adaptability to changes, even abrupt ones, in the SNR of the noisy speech. The performance of the proposed algorithm is evaluated using standard measures on a large set of speech and noise signals. The results show a significant improvement over conventional approaches, especially in non-stationary noise environments, where most conventional algorithms fail to perform well.
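The alternation described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the M-step objective, so the sketch assumes the simplest L1-regularized form, minimizing 0.5*(s - p)^2 + lam*|s| per frequency bin, whose closed-form solution is soft thresholding. The E-step uses the standard log-MMSE gain, G = xi/(1+xi) * exp(0.5*E1(v)) with v = xi*gamma/(1+xi). The function names, the `lam` parameter, the stopping rule, and the use of a caller-supplied initial periodogram in place of TCS are all hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp1(x, eps=1e-12, max_iter=200):
    """Exponential integral E1(x) for x > 0."""
    if x <= 0.0:
        raise ValueError("exp1 is only defined here for x > 0")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total = -EULER_GAMMA - math.log(x)
        term = 1.0
        for k in range(1, max_iter):
            term *= -x / k  # term now equals (-x)^k / k!
            total -= term / k
            if abs(term) < eps:
                break
        return total
    # Continued fraction, evaluated with the modified Lentz method.
    b = x + 1.0
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(-x)


def logmmse_gain(xi, gamma):
    """Log-MMSE gain for a priori SNR xi and a posteriori SNR gamma."""
    ratio = xi / (1.0 + xi)
    v = max(ratio * gamma, 1e-10)  # keep the E1 argument strictly positive
    return ratio * math.exp(0.5 * exp1(v))


def soft_threshold(value, lam):
    """Closed-form minimizer of 0.5*(s - value)^2 + lam*|s|."""
    return math.copysign(max(abs(value) - lam, 0.0), value)


def em_enhance_frame(noisy_psd, noise_psd, initial_psd,
                     lam=0.01, max_iter=20, tol=1e-6):
    """Alternate an assumed L1 'M-step' and a log-MMSE 'E-step' on one frame.

    noisy_psd:   periodogram of the noisy frame, one value per bin
    noise_psd:   noise power estimate per bin (must be > 0)
    initial_psd: initial clean-speech periodogram (TCS output in the paper;
                 supplied by the caller in this sketch)
    """
    gamma = [y / n for y, n in zip(noisy_psd, noise_psd)]  # a posteriori SNR
    clean_psd = list(initial_psd)
    gains = [1.0] * len(noisy_psd)
    for _ in range(max_iter):
        # "M-step" (assumed form): L1-regularized, non-negative power estimate.
        speech_power = [max(soft_threshold(p, lam), 0.0) for p in clean_psd]
        xi = [s / n for s, n in zip(speech_power, noise_psd)]  # a priori SNR
        # "E-step": log-MMSE gain refines the clean-speech periodogram.
        gains = [logmmse_gain(x, g) for x, g in zip(xi, gamma)]
        refined = [g * g * y for g, y in zip(gains, noisy_psd)]
        change = max(abs(r - c) for r, c in zip(refined, clean_psd))
        clean_psd = refined
        if change < tol:
            break
    return clean_psd, gains


# Toy frame: two bins dominated by speech, two dominated by noise.
noisy = [10.0, 1.2, 8.0, 1.0]
noise = [1.0, 1.0, 1.0, 1.0]
# Stand-in for the TCS initial guess: plain power subtraction.
initial = [max(y - n, 0.0) for y, n in zip(noisy, noise)]
clean, gains = em_enhance_frame(noisy, noise, initial)
```

In this toy frame the high-SNR bins keep a gain close to xi/(1+xi), while the low-SNR bins are attenuated further on each pass, which is the per-bin adaptivity the abstract attributes to re-estimating the a priori SNR inside the loop. A real system would run this per STFT frame and would need a noise power tracker, which the abstract does not describe.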
Bibliographic record

  • Authors

    Lun DP; Shen TW; Ho KC;

  • Author affiliation
  • Year 2014
  • Total pages
  • Original format PDF
  • Text language eng
  • Chinese Library Classification
