
Non-linear feature extraction for robust speech recognition in stationary and non-stationary noise


Abstract

An analysis-based non-linear feature extraction approach is proposed, inspired by a model of how speech amplitude spectra are affected by additive noise. Acoustic features are extracted based on the noise robust parts of speech spectra without losing discriminative information. Two non-linear processing methods, harmonic demodulation and spectral peak-to-valley ratio locking, are designed to minimize mismatch between clean and noisy speech features. A previously studied method, peak isolation [IEEE Transactions on Speech and Audio Processing 5 (1997) 451], is also discussed with this model. These methods do not require noise estimation and are effective in dealing with both stationary and non-stationary noise. In the presence of additive noise, ASR experiments show that using these techniques in the computation of MFCCs improves recognition performance greatly. For the TI46 isolated digits database, the average recognition rate across several SNRs is improved from 60% (using unmodified MFCCs) to 95% (using the proposed techniques) with additive speech-shaped noise. For the Aurora 2 connected digit-string database, the average recognition rate across different noise types, including non-stationary noise background, and SNRs improves from 58% to 80%.
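The abstract does not spell out the noise model, so the following is only an illustrative sketch of one common way to reason about additive noise in the spectral domain, not the authors' formulation. It assumes speech and noise are uncorrelated, so their powers add in each frequency bin, and it uses made-up power values. The helper name `log_spectral_shift_db` is hypothetical. Under these assumptions, a bin at a spectral peak barely moves on a log scale when noise is added, while a bin in a spectral valley moves a lot, which is one intuition for treating peaks as the more noise-robust part of the spectrum.

```python
import math


def log_spectral_shift_db(speech_power: float, noise_power: float) -> float:
    """Return how far (in dB) a spectral bin rises when noise is added.

    Assumption (not from the paper): speech and noise are uncorrelated,
    so the noisy bin power is simply speech_power + noise_power.
    """
    if speech_power <= 0:
        raise ValueError("speech_power must be positive")
    if noise_power < 0:
        raise ValueError("noise_power must be non-negative")
    return 10.0 * math.log10((speech_power + noise_power) / speech_power)


# Hypothetical toy values: a flat noise floor, one peak bin, one valley bin.
noise_power = 1.0
peak_power = 100.0    # peak bin, 20 dB above the noise floor
valley_power = 1.0    # valley bin, level with the noise floor

peak_shift = log_spectral_shift_db(peak_power, noise_power)
valley_shift = log_spectral_shift_db(valley_power, noise_power)

print(f"peak bin shift:   {peak_shift:.3f} dB")
print(f"valley bin shift: {valley_shift:.3f} dB")
```

In this toy case the valley bin shifts far more than the peak bin, so a feature built mainly from peak regions would show less mismatch between clean and noisy conditions.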

Bibliographic details

  • Source
    Computer Speech and Language, 2003, Issue 4, pp. 381-402 (22 pages)
  • Authors

    Qifeng Zhu; Abeer Alwan;

  • Affiliation (both authors)

    Department of Electrical Engineering, The Henry Samueli School of Engineering and Applied Science, 66-147E Engr. IV, UCLA, 405 Hilgard Avenue, Box 951594, Los Angeles, CA 90095-1594, USA

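  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Computing technology, computer technology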