Bayesian Separation With Sparsity Promotion in Perceptual Wavelet Domain for Speech Enhancement and Hybrid Speech Recognition

Shao Y.; Chang C.-H.

Abstract

Removing noise can improve speech recognition accuracy, but errors in the estimated signal components can also degrade recognition. This paper presents a framework of wavelet-based techniques to improve automatic speech recognition performance in the presence of background noise. The proposed robust speech recognition system is realized by implementing speech enhancement preprocessing, feature extraction, and a hybrid speech recognizer in the time–frequency space. A perceptual wavelet filterbank using a fixed base to imitate the way humans perceive speech is developed to capture the most discriminative information in the time–frequency plane. To minimize the mismatch between the training and testing conditions of the classifier, a Bayesian scheme is applied in the wavelet domain to separate the speech and noise components in the proposed iterative speech enhancement algorithm. The nonphonetic information is discarded, while the more critical speech features are extracted and represented by the wavelet coefficients. The denoised wavelet features are fed to a hybrid classifier built on a hidden Markov model (HMM). The intrinsic limitation of the HMM is overcome by augmenting it with a wavelet support vector machine. This hybrid and hierarchical design paradigm improves recognition performance by combining the advantages of different methods in one integrated system. Continuous-digit speech recognition experiments conducted with the proposed framework show promising results: the framework significantly improves recognition performance at a low signal-to-noise ratio (SNR) without degrading performance at a high SNR.
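The abstract does not specify the Bayesian separation rule or the design of the perceptual filterbank, so neither is reproduced here. As a rough illustration of the general idea of suppressing noise in a wavelet domain, the sketch below uses a plain Haar transform with the standard universal soft threshold. This is a generic stand-in, not the authors' algorithm, and all function names are hypothetical.

```python
import math
import random
import statistics


def haar_forward(x):
    """One level of the orthonormal Haar transform -> (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert one Haar level, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend(((a + d) / s, (a - d) / s))
    return x


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become exactly 0."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(x, levels=4):
    """Haar decomposition, universal soft threshold on details, reconstruction.

    Assumption: additive noise that is roughly white and Gaussian. This is a
    generic shrinkage rule, not the Bayesian scheme described in the abstract.
    """
    if len(x) % (2 ** levels) != 0:
        raise ValueError("len(x) must be divisible by 2**levels")
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_forward(approx)
        details.append(d)  # details[0] holds the finest scale
    # Estimate the noise level from the finest-scale details (median rule).
    sigma = statistics.median(abs(c) for c in details[0]) / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(len(x)))
    # Rebuild from the coarsest level back down to the finest.
    for d in reversed(details):
        approx = haar_inverse(approx, soft_threshold(d, t))
    return approx
```

Calling `denoise(samples, levels=4)` on a list whose length is a multiple of 16 returns a list of the same length. A perceptual filterbank would replace the uniform dyadic split used here with subbands matched to auditory frequency resolution, and a Bayesian estimator would replace the fixed threshold.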

Bibliographic details

Source
Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on | 2011, Issue 2 | pp. 284-293 | 10 pages
Authors
Shao Y.; Chang C.-H.
Author affiliation
Software Exploration R&D Center, Bureau of Geophysical Prospecting Inc., China National Petroleum Corporation, Zhuozhou, China
Original format: PDF
Keywords
Bayesian theory; hidden Markov model (HMM); speech enhancement; speech recognition; support vector machine (SVM); wavelet transform
