Journal: Multimedia Tools and Applications

Histogram equalization of contextual statistics of speech features for robust speech recognition

Abstract

In the recent past, we have witnessed a flurry of research activity aimed at the development of novel and ingenious robustness methods for automatic speech recognition (ASR). Among them, histogram equalization (HEQ) of speech features constitutes one of the most prominent and successful lines of research, owing to its neat formulation and remarkable performance. In this paper, we adopt an effective modeling framework for joint equalization of spatial-temporal contextual statistics of speech features. Within this framework, we explore various combinations of simple differencing and averaging operations to render the contextual relationships of feature vector components, not only between different dimensions but also between consecutive speech frames, in the HEQ process. Furthermore, several variants of HEQ are investigated and integrated into the proposed modeling framework to efficiently compensate for the effects of noise interference on the feature vector components. In addition, the utility of the methods derived from this framework and of several existing robustness methods is analyzed and compared extensively. All experiments were carried out on the Aurora-2 database and task, and were further verified on the Aurora-4 database and task. Experimental results suggest that our proposed methods offer substantial improvements over the baseline system and achieve performance competitive with or better than some of the existing noise robustness methods in speech recognition.
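
For readers unfamiliar with HEQ, the following is a minimal illustrative sketch, not the paper's implementation. It assumes the textbook formulation, in which each feature dimension of a test utterance is mapped through its own empirical CDF onto the quantiles of a reference (training) distribution. It also shows one possible reading of "simple differencing and averaging" between consecutive frames. The function names (`empirical_cdf`, `heq`, `temporal_context`), the synthetic data, and the choice to pad the first frame with itself are all assumptions made for illustration. The between-dimension (spatial) context mentioned in the abstract is not covered.

```python
import numpy as np

def empirical_cdf(x):
    """Rank-based CDF estimate in (0, 1) for each value of a 1-D array."""
    ranks = np.argsort(np.argsort(x))      # rank of each sample, 0..n-1
    return (ranks + 0.5) / len(x)

def heq(features, reference):
    """Map each dimension of `features` (frames x dims) onto the
    empirical distribution of the same dimension in `reference`."""
    out = np.empty(features.shape, dtype=float)
    for d in range(features.shape[1]):
        cdf = empirical_cdf(features[:, d])
        # Inverse reference CDF, evaluated at the test-side CDF values.
        out[:, d] = np.quantile(reference[:, d], cdf)
    return out

def temporal_context(features):
    """Difference and average of each frame with its predecessor.
    The first frame is paired with itself (an assumed padding choice)."""
    prev = np.vstack([features[:1], features[:-1]])
    return features - prev, 0.5 * (features + prev)

# Synthetic stand-ins: "clean" training features and shifted, compressed
# "noisy" test features. Real systems would use cepstral features.
rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=(500, 3))
noisy = 0.5 * rng.normal(0.0, 1.0, size=(200, 3)) + 2.0

equalized = heq(noisy, clean)
diff, avg = temporal_context(noisy)
```

In this sketch the mapping is monotonic, so the ordering of values within each dimension is preserved while the overall distribution is pulled toward the reference. Under the assumptions above, the difference and average streams could be equalized in the same way as the raw features.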