Journal: International Journal of Speech Technology

Performance analysis of neural network, NMF and statistical approaches for speech enhancement



Abstract

Bayesian estimators are widely used in speech enhancement and noise reduction. Traditional estimators, however, process only the spectral amplitudes and leave the phase unprocessed. Among Bayesian estimators, super-Gaussian estimators provide improved noise reduction, and super-Gaussian Bayesian estimators that use processed phase information to estimate the amplitudes improve the results further. In this work, Bayesian estimators based on Complex speech coefficients given Uncertain Phase (CUP), such as CUP-GG (a CUP estimator with speech spectral coefficients assumed to be Gamma distributed and noise spectral coefficients assumed to be Generalized Gamma distributed) and CUP-NG (speech assumed to be Nakagami distributed), are compared under white noise, pink noise, babble noise and non-stationary factory noise conditions. The statistical estimators are less effective under fully non-stationary conditions such as non-stationary factory noise and babble noise. Non-negative Matrix Factorization (NMF) based algorithms perform better for non-stationary noise. The drawback of NMF is that it requires a priori knowledge about the speech. This drawback can be overcome by combining the advantages of the statistical and NMF approaches. The NR-NMF and WR-NMF speech enhancement methods are developed by providing a posteriori regularization based on statistical assumptions about the distribution of the speech and noise DFT coefficients. A speech enhancement method that uses the CUP-GG estimator and NMF with online noise bases update is also considered for comparison. Progress in neural network based approaches to speech enhancement has further shown that, with large datasets and better training, speech enhancement algorithms give improved results. In this work, a neural network approach to speech enhancement is implemented and compared with the traditional estimators and the NMF approaches. To generalize to unseen noise types, the proposed neural network approach uses dropout. Features obtained from the a priori SNR and the a posteriori SNR are used to train the network. The objective of this paper is to analyze the performance of speech enhancement methods based on neural networks, NMF and statistical estimators. The objective performance measures Perceptual Evaluation of Speech Quality (PESQ), Short-Time Objective Intelligibility (STOI), Signal to Noise Ratio (SNR) and Segmental SNR (Seg SNR) are considered for comparison.
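
Of the four objective measures named in the abstract, SNR and Segmental SNR have simple closed-form definitions, so they can be illustrated with a short sketch. The abstract does not give the paper's frame length, overlap or clamping limits; the values below (256-sample non-overlapping frames, per-frame SNR clamped to the range -10 dB to 35 dB) are assumptions chosen for illustration, and the function names `snr_db` and `segmental_snr_db` are hypothetical.

```python
import numpy as np

EPS = np.finfo(float).eps  # guards against division by zero and log(0)


def snr_db(clean, enhanced):
    """Global SNR in dB: clean-signal energy over residual-error energy."""
    clean = np.asarray(clean, dtype=float)
    error = clean - np.asarray(enhanced, dtype=float)
    return float(10.0 * np.log10((np.sum(clean ** 2) + EPS)
                                 / (np.sum(error ** 2) + EPS)))


def segmental_snr_db(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi].

    Assumptions (not taken from the paper): non-overlapping frames of
    frame_len samples, trailing samples that do not fill a frame are
    dropped, and the clamping limits lo/hi are illustrative defaults.
    """
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    if clean.shape != enhanced.shape:
        raise ValueError("clean and enhanced must have the same shape")
    n_frames = clean.size // frame_len
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    used = n_frames * frame_len
    c = clean[:used].reshape(n_frames, frame_len)
    e = enhanced[:used].reshape(n_frames, frame_len)
    signal_energy = np.sum(c ** 2, axis=1)
    error_energy = np.sum((c - e) ** 2, axis=1)
    frame_snr = 10.0 * np.log10((signal_energy + EPS) / (error_energy + EPS))
    return float(np.mean(np.clip(frame_snr, lo, hi)))


# Usage: a synthetic tone and an "enhanced" copy with a 10% amplitude error.
t = np.arange(1024)
clean_signal = np.sin(0.05 * t)
enhanced_signal = 1.1 * clean_signal
global_snr = snr_db(clean_signal, enhanced_signal)
seg_snr = segmental_snr_db(clean_signal, enhanced_signal)
```

Clamping keeps silent or perfectly reconstructed frames from dominating the average, which is why Segmental SNR can differ noticeably from global SNR on real speech. PESQ and STOI are perceptually motivated measures with much more involved definitions and are not sketched here.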
