Conference: AES international conference

Extraction of Spectro-Temporal Speech Cues for Robust Automatic Speech Recognition

Abstract

This work analyzes the use of spectro-temporal signal characteristics with the aim of improving the robustness of automatic speech recognition (ASR) systems. Experiments are presented that target robustness against extrinsic sources of variability (such as additive noise) as well as intrinsic variation of speech (changes in speaking rate, style, and effort). Results are compared to scores for the most common features in ASR (mel-frequency cepstral coefficients and perceptual linear prediction features), which account for the spectral properties of short-time segments of speech but mostly neglect temporal or spectro-temporal cues. Intrinsic variations were found to severely degrade the overall ASR performance. The performance of the two most common feature types was degraded in much the same way, whereas the proposed spectro-temporal features exhibit a different sensitivity to intrinsic variations, which suggests that classic and spectro-temporal feature types carry complementary information. Furthermore, spectro-temporal features were shown to be more robust than the baseline system in the presence of additive noise.