Histogram-based subband power warping and spectral averaging for robust speech recognition under matched and multistyle training


Abstract

This paper describes a new algorithm that increases the robustness of speech recognition systems by matching the power histograms of the input in each frequency band to those obtained over clean training data, and then mixing together the processed and unprocessed spectra. Before calculating prototype histograms over the training data, the power signals in each channel are normalized by the local maximum and minimum of the channel. In contrast, histograms calculated over the testing data are normalized by the global maximum and minimum of the power spectrum. This mode of normalization leads to a significant reduction in noise. Following the histogram-based processing, it is shown that taking a weighted average between the processed and unprocessed power spectra contributes to further gains in recognition accuracy. Results are obtained for multiple speech recognition systems, noise types, and training conditions, illustrating the broad utility of this approach.
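The steps named in the abstract (per-band histogram matching, the two normalization modes, and the final weighted average) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the use of quantile mapping as the histogram-matching step, the choice to mix in the normalized domain, and the default `weight` of 0.5 are all assumptions that do not come from the abstract.

```python
import numpy as np


def match_channel(test_chan, proto_chan):
    """Map one channel of test values onto the prototype distribution.

    Assumption: histogram matching is done by quantile mapping.
    """
    # Rank of each test sample within its own channel (0 = smallest).
    ranks = np.argsort(np.argsort(test_chan))
    # Convert ranks to quantiles strictly inside (0, 1).
    quantiles = (ranks + 0.5) / test_chan.size
    # Read off the prototype value found at the same quantile.
    return np.quantile(proto_chan, quantiles)


def warp_and_average(test_power, clean_power, weight=0.5):
    """Both inputs are arrays of shape (channels, frames).

    `weight` is a hypothetical mixing factor for the processed spectrum.
    """
    eps = 1e-12
    # Training side: normalize each channel by its own (local) extrema.
    lo = clean_power.min(axis=1, keepdims=True)
    hi = clean_power.max(axis=1, keepdims=True)
    proto = (clean_power - lo) / np.maximum(hi - lo, eps)
    # Test side: normalize by the extrema of the whole power spectrum.
    g_lo, g_hi = test_power.min(), test_power.max()
    test_norm = (test_power - g_lo) / max(g_hi - g_lo, eps)
    # Warp each test channel toward its clean prototype histogram.
    warped = np.stack(
        [match_channel(t, p) for t, p in zip(test_norm, proto)]
    )
    # Weighted average of processed and unprocessed spectra
    # (assumption: mixing happens in the normalized domain).
    return weight * warped + (1.0 - weight) * test_norm
```

With `weight=1.0` the output is the fully warped spectrum, and with `weight=0.0` it is the globally normalized input; intermediate values blend the two, which is the mixing step the abstract credits with further accuracy gains.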


Bibliographic details

Source: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, pp. 4697-4700 (4 pages)
Conference location: Kyoto (JP)
Authors: Harvilla, Mark J.
Author affiliation: Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Original format: PDF

Similar literature

1. Robust speech recognition in noisy environments based on subband spectral centroid histograms [J]. Gajic B., Paliwal K.K. IEEE Transactions on Audio, Speech and Language Processing. 2006, No. 2
2. Speaker normalized spectral subband parameters for noise robust speech recognition [J]. Satoru Tsuge, Toshiaki Fukada, Harald Singer. The Journal of the Acoustical Society of Japan. 1999, No. 6
3. Robust speech recognition training via duration and spectral-based stress token generation [J]. Hansen J.H.L., Bou-Ghazale S.E. IEEE Transactions on Speech and Audio Processing. 1995, No. 5
4. Histogram-based subband power warping and spectral averaging for robust speech recognition under matched and multistyle training [C]. Harvilla M.J., Stern R.M. IEEE International Conference on Acoustics, Speech and Signal Processing. 2011
5. Compressive nonlinearity for representing speech spectral magnitude to improve noise robustness of automatic speech recognition [D]. Wong, Brian. 2011
6. The Effect of Training Rate on Recognition of Spectrally Shifted Speech [O]. Geraldine Nogaki, Qian-Jie Fu, John J Galvin III
7. Histogram-based subband power warping and spectral averaging for robust speech recognition under matched and multistyle training [O]. Mark J. Harvilla, Richard M. Stern. 2012
