Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Enhancement and Noise Statistics Estimation for Non-Stationary Voiced Speech


Abstract

This paper considers single-channel speech enhancement in the time domain. We address the problem of modelling non-stationary speech by describing the voiced speech parts with a harmonic linear chirp model instead of the traditional harmonic model. The speech signal is therefore not assumed to be stationary; instead, the fundamental frequency can vary linearly within each frame. The linearly constrained minimum variance (LCMV) filter and the amplitude and phase estimation (APES) filter are derived in this framework and compared to the harmonic versions of the same filters. Simulations on synthetic and speech signals show that the chirp versions of the filters perform better than their harmonic counterparts in terms of output signal-to-noise ratio (SNR) and signal reduction factor. For synthetic signals, the output SNR of the harmonic chirp APES-based filter is 3 dB higher than that of the harmonic APES-based filter at an input SNR of 10 dB, and at the same time the signal reduction factor is decreased. For speech signals, the increase is 1.5 dB, along with a decrease in the signal reduction factor of 0.7. As an implicit part of the APES filter, a noise covariance matrix estimate is obtained. We suggest using this estimate in combination with other filters such as the Wiener filter. The performance of the Wiener filter and the LCMV filter is compared using the APES noise covariance matrix estimate and a power spectral density (PSD) based noise covariance matrix estimate. It is shown that the APES covariance matrix works well in combination with the Wiener filter, and the PSD-based covariance matrix works well in combination with the LCMV filter.
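To make the harmonic linear chirp idea concrete, here is a minimal sketch (not the paper's implementation). It assumes the common real-valued form of the model, in which harmonic l has phase l(ω0·n + k·n²/2), so the instantaneous fundamental frequency ω0 + k·n varies linearly within the frame. It then builds a textbook LCMV filter, h = R⁻¹Z(ZᴴR⁻¹Z)⁻¹1, which passes every chirp harmonic undistorted. The function names, the white-noise covariance (identity matrix) and all parameter values are illustrative assumptions. The APES filter and the noise covariance estimates discussed in the abstract are not reproduced here.

```python
import numpy as np

def chirp_phase(n, omega0, chirp_rate):
    # Phase of the fundamental; its derivative omega0 + chirp_rate*n is linear in n.
    return omega0 * n + 0.5 * chirp_rate * n**2

def harmonic_chirp(n, omega0, chirp_rate, amps, phases):
    # Real harmonic chirp signal: sum_l A_l * cos(l * theta(n) + phi_l).
    theta = chirp_phase(np.asarray(n, dtype=float), omega0, chirp_rate)
    out = np.zeros_like(theta)
    for l, (a, p) in enumerate(zip(amps, phases), start=1):
        out += a * np.cos(l * theta + p)
    return out

def chirp_steering_matrix(n0, M, omega0, chirp_rate, L):
    # Columns are complex chirp harmonics (orders +-1..+-L) over samples
    # n0..n0+M-1, with phase measured relative to n0 so the first row is all ones.
    n = n0 + np.arange(M, dtype=float)
    rel = chirp_phase(n, omega0, chirp_rate) - chirp_phase(float(n0), omega0, chirp_rate)
    orders = np.concatenate([np.arange(1, L + 1), -np.arange(1, L + 1)])
    return np.exp(1j * np.outer(rel, orders))

def lcmv_filter(R, Z):
    # Textbook LCMV solution: minimise h^H R h subject to Z^H h = 1.
    Rinv_Z = np.linalg.solve(R, Z)
    gram = Z.conj().T @ Rinv_Z
    return Rinv_Z @ np.linalg.solve(gram, np.ones(Z.shape[1]))

# Illustrative values only (assumptions, not taken from the paper).
amps, phases = [1.0, 0.5, 0.25], [0.1, -0.4, 0.8]
omega0, chirp_rate, M, n0 = 0.3, 1e-3, 20, 5

x = harmonic_chirp(np.arange(64), omega0, chirp_rate, amps, phases)
Z = chirp_steering_matrix(n0, M, omega0, chirp_rate, len(amps))
h = lcmv_filter(np.eye(M), Z)             # identity R: white-noise assumption
frame = x[n0:n0 + M]
estimate = np.real(np.vdot(h, frame))     # h^H x, the filtered sample at n0
```

On this noise-free input the filter output matches the clean sample `x[n0]` up to numerical precision, because the constraints hold for every harmonic of the chirp. Setting `chirp_rate` to zero in `chirp_steering_matrix` gives the traditional harmonic version of the same filter, which is the baseline the abstract compares against.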
