Front-End Compensation Methods for LVCSR Under Lombard Effect

Abstract

This study analyzes the impact of noisy background variations and Lombard effect (LE) on large vocabulary continuous speech recognition (LVCSR). The robustness of several front-end feature extraction strategies, combined with state-of-the-art feature distribution normalizations, is tested on neutral and Lombard speech from the UT-Scope database presented in two types of background noise at various SNR levels. An extension of a bottleneck (BN) front-end utilizing normalization of both critical band energies (CRBE) and BN outputs is proposed and shown to provide competitive performance compared to the best MFCC-based system. A novel MFCC-based BN front-end is introduced and shown to outperform all other systems in all conditions considered (average 4.1% absolute WER reduction over the second-best system). Additionally, two phenomena are observed: (i) the combination of cepstral mean subtraction and the recently established RASTALP filtering significantly reduces the transient effects of RASTA band-pass filtering and increases ASR robustness to noise and LE; (ii) histogram equalization may benefit from utilizing reference distributions derived from pre-normalized rather than raw training features, and also from adopting distributions from different front-ends.
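As a rough illustration of two of the normalizations named in the abstract, the sketch below applies per-utterance cepstral mean subtraction and a simple quantile-based histogram equalization to a feature matrix. It is not the authors' implementation: the function names, the (frames, dims) array layout, and the rank-based CDF estimate are all assumptions made for illustration. The `reference` argument stands in for pooled training features, which, following observation (ii), could themselves be mean-normalized before use.

```python
import numpy as np


def cepstral_mean_subtraction(feats):
    """Subtract the per-utterance mean from each feature dimension.

    feats: array of shape (frames, dims), e.g. MFCCs of one utterance
    (layout is an assumption of this sketch).
    """
    return feats - feats.mean(axis=0, keepdims=True)


def histogram_equalization(feats, reference):
    """Map each dimension of `feats` onto the empirical distribution of
    `reference` (shape (n_ref, dims)) by matching quantiles.

    Hypothetical helper: the CDF of the utterance is estimated from the
    ranks of its frames, and the inverse reference CDF is approximated by
    linear interpolation over the sorted reference samples.
    """
    n_frames, n_dims = feats.shape
    ref_sorted = np.sort(reference, axis=0)
    n_ref = ref_sorted.shape[0]
    ref_probs = (np.arange(n_ref) + 0.5) / n_ref

    out = np.empty((n_frames, n_dims), dtype=float)
    for d in range(n_dims):
        # Rank of each frame within the utterance, converted to a
        # probability strictly inside (0, 1).
        ranks = np.argsort(np.argsort(feats[:, d]))
        probs = (ranks + 0.5) / n_frames
        # Look up the reference value at the same quantile.
        out[:, d] = np.interp(probs, ref_probs, ref_sorted[:, d])
    return out
```

A possible usage, again under the assumptions above: call `cepstral_mean_subtraction` on each training utterance, pool the results into `reference`, then pass each test utterance through `cepstral_mean_subtraction` followed by `histogram_equalization(utterance, reference)`.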
