Multistream Bandpass Modulation Features for Robust Speech Recognition

Abstract

Current understanding of speech processing in the brain suggests dual streams of processing of temporal and spectral information, whereby slow vs. fast modulations are analyzed along parallel paths that encode various scales of information in speech signals. This biological strategy of analyzing the multiplicity of information in speech signals along parallel paths can offer valuable lessons for feature extraction front-ends in speech processing systems, particularly for dealing with extrinsic degradations and unseen noise distortions. Here, we propose a multistream approach to feature analysis for robust speaker-independent phoneme recognition in the presence of nonstationary background noises. The scheme presented here centers around a multi-path bandpass modulation analysis of speech sounds, with each stream covering an entire range of temporal and spectral modulations. By performing bandpass operations of slow vs. fast information along the spectral and temporal dimensions, the proposed scheme avoids the classic feature explosion problem of previous multistream approaches while maintaining the advantage of parallelism and localized feature analysis. The proposed architecture results in substantial improvements over standard baseline features and two state-of-the-art noise robust feature schemes.

Bibliographic record

Source: Annual Conference of the International Speech Communication Association (INTERSPEECH 2011), 2011, pp. 1284-1287 (4 pages)
Authors: Sridhar Krishna Nemala; Kailash Patil; Mounya Elhilali
Original format: PDF
CLC classification: Communications
Keywords: multistream; spectro-temporal modulations; speech recognition; noise robustness

Similar literature

1. Multistream sparse representation features for noise robust audio-visual speech recognition [J]. Peng Shen, Satoru Hayamizu, Satoshi Tamura. Acoustical Science and Technology, 2014, No. 1.
2. Multistream Articulatory Feature-Based Models for Visual Speech Recognition [J]. Saenko Kate, Livescu Karen, Glass James. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 2009, No. 9.
3. Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition [J]. Marc René Schädler, Bernd T. Meyer, Birger Kollmeier. The Journal of the Acoustical Society of America, 2012, No. 5.
4. Enhancing the sub-band modulation spectra of speech features via nonnegative matrix factorization for robust speech recognition [C]. Fan Hao-teng, Tsai Yi-chang, Hung Jeih-weih. System Science and Engineering (ICSSE), 2012 International Conference on, 2012.
5. A Practical and Efficient Multistream Framework for Noise Robust Speech Recognition [D]. Mallidi, Sri Harish. 2018.
6. A Multistream Feature Framework Based on Bandpass Modulation Filtering for Robust Speech Recognition [O]. Sridhar Krishna Nemala, Kailash Patil, Mounya Elhilali.
7. Multistream sparse representation features for noise robust audio-visual speech recognition [O]. Peng Shen, Satoshi Tamura, Satoru Hayamizu. 2014.
8. Normalized Amplitude Modulation Features for Large Vocabulary Noise-Robust Speech Recognition [R]. Mitra, V., Franco, H., Graciarena, M. 2012.
