Journal: Acta Electronica Sinica (《电子学报》)

A Gaussian Mixture Model Based Speech Enhancement Method in the Compressed Domain

Abstract

To make effective use of prior knowledge about the Immittance Spectral Frequencies (ISFs) of clean speech, this paper proposes a Gaussian Mixture Model (GMM) based speech enhancement method in the compressed domain for the ITU-T G.722.2 wideband speech codec. First, feature vectors are formed from the ISFs of the noisy speech, the ISFs of the clean speech, and the corresponding gain scaling factor, and a GMM is fitted to their joint probability density. Second, an optimal Bayesian estimate of the clean-speech feature parameters is obtained under the minimum mean square error (MMSE) criterion.

To remain compatible with the codec's Discontinuous Transmission (DTX) mode, when the signal being processed is non-speech, that is, when a Silence Insertion Descriptor (SID) frame is received, the algorithm leaves the spectral envelope parameters (ISFs) of the noise frame unchanged and scales the logarithmic frame energy by a fixed ratio. If an erased frame is received, the algorithm leaves the received bit stream unmodified and applies the normal-frame processing to the recovered parameters so that the related history (memory) is updated.

Performance was evaluated according to ITU-T G.160. The results indicate that, compared with the reference method, the proposed method achieves a larger noise level reduction and better objective quality of the enhanced speech, while preserving its ability to improve the SNR.
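The MMSE step can be illustrated with the standard conditional-mean estimator for a joint GMM. The abstract does not give the paper's exact formulation, so the following is only a minimal sketch under stated assumptions: the observed part `x` holds the noisy-speech features, the estimated part `y` holds the clean-speech features (for example clean ISFs plus the gain scaling factor), each component has a full covariance matrix, and the function name `gmm_mmse_estimate` and all toy parameter values are hypothetical.

```python
import numpy as np


def gmm_mmse_estimate(x, weights, means, covs, dx):
    """Conditional mean E[y | x] under a joint GMM over z = [x, y].

    Hypothetical helper, not the paper's implementation.
    weights: (K,) mixture weights; means: (K, D); covs: (K, D, D);
    dx: number of leading dimensions of z that are observed (the x part).
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)

    num_components = weights.shape[0]
    log_post = np.empty(num_components)
    cond_means = []
    for k in range(num_components):
        mu_x, mu_y = means[k, :dx], means[k, dx:]
        s_xx = covs[k, :dx, :dx]   # covariance of the observed part
        s_yx = covs[k, dx:, :dx]   # cross-covariance between y and x
        diff = x - mu_x
        sol = np.linalg.solve(s_xx, diff)   # s_xx^{-1} (x - mu_x)
        _, logdet = np.linalg.slogdet(s_xx)
        # log( weight_k * N(x; mu_x, s_xx) )
        log_post[k] = np.log(weights[k]) - 0.5 * (
            diff @ sol + logdet + dx * np.log(2.0 * np.pi)
        )
        # Per-component conditional mean of y given x
        cond_means.append(mu_y + s_yx @ sol)

    # Normalise the component posteriors in a numerically stable way
    log_post -= log_post.max()
    post = np.exp(log_post)
    post /= post.sum()
    # MMSE estimate: posterior-weighted sum of the conditional means
    return post @ np.array(cond_means)


# Toy usage: one noisy feature, one clean feature, two components (made-up values)
toy_weights = [0.5, 0.5]
toy_means = [[-1.0, -1.0], [1.0, 1.0]]
toy_covs = [np.eye(2), np.eye(2)]
print(gmm_mmse_estimate([0.3], toy_weights, toy_means, toy_covs, dx=1))
```

In a real system the mixture parameters would be trained offline on paired noisy and clean parameters extracted from the codec, and the estimated parameters would be re-quantised into the bit stream; neither step is shown here.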
