Journal of ICT Research and Applications

Robust Automatic Speech Recognition Features using Complex Wavelet Packet Transform Coefficients

Abstract

To improve the performance of phoneme-based Automatic Speech Recognition (ASR) in noisy environments, we developed a new technique that adds robustness to clean phoneme features. These robust features are obtained from Complex Wavelet Packet Transform (CWPT) coefficients. Since the CWPT coefficients represent all the frequency bands of the input signal, decomposing the input signal into a complete CWPT tree also covers all frequencies involved in the recognition process. For time-overlapping signals with different frequency content, e.g. a phoneme signal with noise, the CWPT coefficients are the combination of the CWPT coefficients of the phoneme signal and the CWPT coefficients of the noise. The CWPT coefficients of the phoneme signal are therefore changed according to the frequency components contained in the noise. Since the number of phonemes in any language is relatively small (limited) and already well known, one can easily derive principal component vectors from a clean training dataset using Principal Component Analysis (PCA). These principal component vectors can then be used to add robustness and minimize the effects of noise in the testing phase. Simulation results, using Alpha Numeric 4 (AN4) from Carnegie Mellon University and NOISEX-92 examples from Rice University, showed that this new technique can be used as a feature extractor that improves the robustness of phoneme-based ASR systems in various adverse noisy conditions while still preserving performance in clean environments.
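
The two ideas in the abstract, that the coefficients of a noisy signal are the sum of the phoneme and noise coefficients, and that principal components learned from clean data can suppress the noise part, can be illustrated with a minimal sketch. It is not the authors' implementation. It makes several assumptions: a real-valued Haar wavelet packet stands in for the CWPT, two synthetic sinusoids stand in for phoneme prototypes (no AN4 or NOISEX-92 data is used), PCA is done by a small power iteration suitable only for toy sizes, and all function names (`haar_wpt`, `pca`, `project`) are hypothetical.

```python
import math
import random


def haar_wpt(signal, levels):
    """Flattened leaf coefficients of a full Haar wavelet packet tree.
    Real-valued stand-in for the CWPT; len(signal) must divide by 2**levels."""
    bands = [list(signal)]
    root2 = math.sqrt(2.0)
    for _ in range(levels):
        split = []
        for band in bands:
            pairs = list(zip(band[0::2], band[1::2]))
            split.append([(a + b) / root2 for a, b in pairs])  # low-pass half
            split.append([(a - b) / root2 for a, b in pairs])  # high-pass half
        bands = split
    return [c for band in bands for c in band]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pca(rows, k, iters=200):
    """Mean and top-k orthonormal principal directions (power iteration)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    rng = random.Random(0)
    comps = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            # Apply the (unnormalised) covariance matrix as X^T (X v).
            scores = [dot(c, v) for c in centered]
            v = [sum(s * c[j] for s, c in zip(scores, centered)) for j in range(d)]
            # Remove directions already found, then normalise.
            for u in comps:
                p = dot(v, u)
                v = [a - p * b for a, b in zip(v, u)]
            norm = math.sqrt(dot(v, v))
            v = [a / norm for a in v]
        comps.append(v)
    return mean, comps


def project(feature, mean, comps):
    """Orthogonal projection of a feature onto the clean principal subspace."""
    diff = [f - m for f, m in zip(feature, mean)]
    out = list(mean)
    for u in comps:
        s = dot(diff, u)
        out = [o + s * a for o, a in zip(out, u)]
    return out


def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


rng = random.Random(1)
N = 16
# Two hypothetical "phoneme" prototypes: a slow and a faster sinusoid.
proto = [[math.sin(2 * math.pi * f * t / N) for t in range(N)] for f in (1, 3)]


def utterance():
    """Random mixture of the two prototypes (toy clean signal)."""
    a, b = rng.uniform(0.5, 1.5), rng.uniform(0.5, 1.5)
    return [a * p + b * q for p, q in zip(*proto)]


# Training phase: principal components from clean features only.
train = [haar_wpt(utterance(), 2) for _ in range(50)]
mean, comps = pca(train, k=2)

# Testing phase: a clean signal corrupted by additive Gaussian noise.
clean = utterance()
noise = [rng.gauss(0, 0.3) for _ in range(N)]
noisy = [c + n for c, n in zip(clean, noise)]
f_clean, f_noise, f_noisy = (haar_wpt(s, 2) for s in (clean, noise, noisy))
f_proj = project(f_noisy, mean, comps)

err_before = dist(f_noisy, f_clean)
err_after = dist(f_proj, f_clean)
print("feature error before projection:", err_before)
print("feature error after projection: ", err_after)
```

In this toy setting the clean features lie exactly in a two-dimensional subspace, so the projection can only shrink the distance to the clean feature. Real phoneme features are only approximately low-dimensional, and the number of retained components would have to be chosen empirically.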