European Conference on Speech Communication and Technology, v.3; 2001-09-03 to 2001-09-07; Aalborg, DK

ROBUST AUTOMATIC SPEECH RECOGNITION IN LOW-SNR CAR ENVIRONMENTS BY THE APPLICATION OF A CONNECTIONIST SUBSPACE-BASED APPROACH TO THE MEL-BASED CEPSTRAL COEFFICIENTS



Abstract

This paper addresses robust large-vocabulary continuous-speech recognition (CSR) in the presence of highly interfering car noise. Our approach reduces the noise in the parameters used for recognition, namely the Mel-based cepstral coefficients. A Multilayer Perceptron (MLP) network first performs noise reduction in the cepstral domain to obtain less-variant parameters. The enhanced features are then refined via the Karhunen-Loeve Transform (KLT), implemented using Principal Component Analysis (PCA). Experiments show that the parameters enhanced in this way increase the recognition rate of the CSR process in highly interfering car noise environments. The HTK Hidden Markov Model Toolkit was used throughout our experiments, which were run on a noisy version of the TIMIT database. Results show that the proposed hybrid technique, when included in the front-end of an HTK-based CSR system, outperforms conventional recognition based on either KLT-only or MLP-only preprocessing in severe car noise environments, over SNRs ranging from 16 dB down to -4 dB.
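As a rough illustration of the KLT refinement stage only, the following minimal sketch estimates a KLT basis by PCA and maps feature vectors through the leading principal subspace. It is not the authors' implementation: the function names `klt_fit` and `klt_project`, the 12-dimensional feature size, the number of retained components, and the choice to reconstruct back into the original feature space are all assumptions made for illustration. The MLP noise-reduction stage is omitted, and random data stands in for MLP-enhanced cepstral vectors.

```python
import numpy as np


def klt_fit(features, n_components):
    """Estimate a KLT basis via PCA (hypothetical helper).

    features: array of shape (n_frames, n_coeffs), one cepstral vector per row.
    Returns the feature mean and the n_components leading eigenvectors
    of the sample covariance matrix, stored as columns.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    cov = centered.T @ centered / (len(features) - 1)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]


def klt_project(features, mean, basis):
    """Project onto the retained principal axes, then reconstruct.

    Reconstruction in the original space is an assumption of this sketch;
    the subspace coefficients could be used directly as features instead.
    """
    coeffs = (features - mean) @ basis
    return coeffs @ basis.T + mean


# Stand-in data: 200 frames of 12 coefficients (both sizes are assumed).
rng = np.random.default_rng(0)
enhanced = rng.normal(size=(200, 12))

mean, basis = klt_fit(enhanced, n_components=8)
refined = klt_project(enhanced, mean, basis)
print(refined.shape)
```

Keeping every component makes the projection an identity map, so any effect on the features comes from discarding the low-variance directions. How many components to keep is a tuning choice that the abstract does not specify.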
