Histogram equalization of contextual statistics of speech features for robust speech recognition

Hsieh Hsin-Ju; Chen Berlin; Hung Jeih-weih

Abstract

In the recent past, we have witnessed a flurry of research activity aimed at developing novel robustness methods for automatic speech recognition (ASR). Among them, histogram equalization (HEQ) of speech features constitutes one of the most prominent and successful lines of research, owing to its neat formulation and remarkable performance. In this paper, we adopt an effective modeling framework for joint equalization of spatial-temporal contextual statistics of speech features. On top of that, we explore various combinations of simple differencing and averaging operations to render the contextual relationships of feature vector components, not only between different dimensions but also between consecutive speech frames, in the HEQ process. Furthermore, several variants of HEQ are investigated and integrated into the proposed modeling framework to efficiently compensate for the effects of noise interference on the feature vector components. In addition, the utilities of the methods deduced from this framework and of several existing robustness methods are analyzed and compared extensively. All experiments were carried out on the Aurora-2 database and task, and were further verified on the Aurora-4 database and task. Experimental results suggest that our proposed methods can offer substantial improvements over the baseline system and achieve performance competitive with or better than some of the existing noise robustness methods in speech recognition.
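The abstract gives no formulas, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes a rank-based (order-statistic) form of HEQ that maps each feature dimension onto a standard normal reference distribution, and it assumes one particular pair of averaging and differencing operations between consecutive frames. The function names `heq` and `temporal_context`, the 0.5 scaling, and the boundary handling are all hypothetical choices made for this example.

```python
import numpy as np
from statistics import NormalDist


def heq(features):
    """Rank-based HEQ (an assumed variant): map the empirical CDF of each
    feature dimension onto a standard normal reference distribution.

    features: array of shape (n_frames, n_dims).
    """
    x = np.asarray(features, dtype=float)
    n_frames = x.shape[0]
    # Rank of every frame within its own dimension (0 .. n_frames - 1).
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    # Empirical CDF estimate, kept strictly inside (0, 1).
    cdf = (ranks + 0.5) / n_frames
    # Inverse CDF of the reference distribution, applied element-wise.
    inverse_normal_cdf = np.vectorize(NormalDist().inv_cdf)
    return inverse_normal_cdf(cdf)


def temporal_context(features):
    """Averaging and differencing between consecutive frames (assumed form).

    Returns (avg, diff) such that avg + diff reproduces the input.
    """
    x = np.asarray(features, dtype=float)
    # Previous frame for each time step; the first frame is repeated
    # at the boundary (a hypothetical choice).
    prev = np.vstack([x[:1], x[:-1]])
    avg = 0.5 * (x + prev)
    diff = 0.5 * (x - prev)
    return avg, diff


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for 200 frames of 13-dimensional cepstral features.
    noisy = 3.0 * rng.standard_normal((200, 13)) + 5.0
    avg, diff = temporal_context(noisy)
    # Equalize the two contextual streams separately.
    avg_eq, diff_eq = heq(avg), heq(diff)
    print(avg_eq.shape, diff_eq.shape)
```

Equalizing the averaged and differenced streams separately is one plausible reading of "joint equalization of contextual statistics". How the paper combines the streams, and which HEQ variants it uses, cannot be recovered from the abstract alone.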

Bibliographic details

Source: Multimedia Tools and Applications, 2015, Issue 17, pp. 6769-6795 (27 pages)
Authors: Hsieh Hsin-Ju; Chen Berlin; Hung Jeih-weih
Author affiliations:

Natl Taiwan Normal Univ, Dept Comp Sci & Informat Engn, Taipei, Taiwan | Natl Chi Nan Univ, Dept Elect Engn, Nantou, Taiwan

Natl Taiwan Normal Univ, Dept Comp Sci & Informat Engn, Taipei, Taiwan

Natl Chi Nan Univ, Dept Elect Engn, Nantou, Taiwan

Original format: PDF
Language: English
Keywords: Automatic speech recognition; Noise robustness; Histogram equalization; Feature contextual statistics

Similar literature

1. Histogram equalization of speech representation for robust speech recognition [J]. de la Torre A., Peinado A.M., Segura J.C. IEEE Transactions on Speech and Audio Processing. 2005, Issue 3
2. Histogram equalization with Bayesian estimation for noise robust speech recognition [J]. Suh Youngjoo, Kim Hoirin. The Journal of the Acoustical Society of America. 2018, Issue 2
3. Stereo-based histogram equalization for robust speech recognition [J]. Randa Al-Wakeel, Mahmoud Shoman, Magdy Aboul-Ela. EURASIP Journal on Audio, Speech, and Music Processing. 2015, Issue 1
4. Exploring Joint Equalization of Spatial-Temporal Contextual Statistics of Speech Features for Robust Speech Recognition [C]. Hsin-Ju Hsieh, Jeih-weih Hung, Berlin Chen. Annual Conference of the International Speech Communication Association. 2012
5. Robust speech processing based on microphone array, audio-visual, and frame selection for in-vehicle speech recognition and in-set speaker recognition [D]. Zhang, Xianxian. 2005
6. New Features Using Robust MVDR Spectrum of Filtered Autocorrelation Sequence for Robust Speech Recognition [O]. Sanaz Seyedin, Seyed Mohammad Ahadi, Saeed Gazor. 2013
7. Histogram equalization of speech representation for robust speech recognition [O]. Ángel De La Torre, Antonio M. Peinado, José C. Segura. 2013
