Compensation of SNR and noise type mismatch using an environmental sniffing based speech recognition solution

Yongjoo Chung; John HL Hansen


Abstract

Multiple-model based speech recognition (MMSR) has been shown to be quite successful in noisy speech recognition. Because it employs multiple hidden Markov model (HMM) sets that correspond to various noise types and signal-to-noise ratio (SNR) values, the selected acoustic model can be closely matched to the noisy test speech, which improves performance over other state-of-the-art speech recognition systems that employ a single HMM set. However, the number of HMM sets is usually limited, both for practical reasons and to keep model selection effective, so acoustic mismatch can still be a problem in MMSR. In this study, we propose methods that improve recognition performance by mitigating the mismatch in SNR and noise type for an MMSR solution. For the SNR mismatch, an optimal SNR mapping between the noisy test speech and the HMM was determined by experimental investigation. Employing this SNR mapping, instead of using the estimated SNR of the noisy test speech directly, improved performance. We also propose a novel method that reduces the effect of noise type mismatch by compensating the noisy test speech in the log-spectrum domain. We first derive the relation between the log-spectrum vectors of the test and training noisy speech. Since this relation is a non-linear function of the speech and noise parameters, the statistical information regarding the test log-spectrum vectors was obtained by approximation using the vector Taylor series (VTS) algorithm. Finally, the minimum mean square error estimate of the training log-spectrum vectors was used to reduce the mismatch between the training and test noisy speech. With the proposed methods in the MMSR framework, relative word error rate reductions of 18.7% and 21.3% were achieved on the Aurora 2 task compared with a conventional MMSR and a multi-condition training (MTR) method, respectively.
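The abstract does not give the equations, so the following is only a minimal sketch of the general idea behind a VTS approximation in the log-spectrum domain. It assumes the commonly used additive-noise model for a single spectral channel, y = log(exp(x) + exp(n)), where x and n are the clean-speech and noise log-spectra. This is not the paper's exact relation between test and training log-spectrum vectors, and the function names `noisy_log_spectrum` and `vts_first_order` are hypothetical.

```python
import math


def noisy_log_spectrum(x, n):
    """Assumed additive-noise model for one channel: y = log(exp(x) + exp(n))."""
    # Numerically stable evaluation of log(exp(x) + exp(n)).
    m = max(x, n)
    return m + math.log(math.exp(x - m) + math.exp(n - m))


def vts_first_order(x, n, x0, n0):
    """First-order Taylor expansion of the model around the point (x0, n0)."""
    y0 = noisy_log_spectrum(x0, n0)
    # dy/dx at the expansion point; dy/dn equals 1 - g under this model.
    g = 1.0 / (1.0 + math.exp(n0 - x0))
    return y0 + g * (x - x0) + (1.0 - g) * (n - n0)
```

Because the expansion is linear in x and n, Gaussian statistics of the speech and noise parameters map to Gaussian statistics of the noisy log-spectrum, which is what makes a closed-form minimum mean square error estimate tractable in approaches of this kind.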


Bibliographic details

Source: EURASIP Journal on Audio, Speech, and Music Processing, 2013, Issue 1, pp. 1-14 (14 pages)
Authors: Yongjoo Chung; John HL Hansen
Original format: PDF
Language: English
Keywords: Speech recognition; Multiple-model frame; Noise robustness; Environmental sniffing

Similar literature

1. Compensation of SNR and noise type mismatch using an environmental sniffing based speech recognition solution [J]. Yongjoo Chung, John HL Hansen. EURASIP Journal on Audio, Speech, and Music Processing. 2013, Issue 1.
2. Noise robust speech recognition using feature compensation based on polynomial regression of utterance SNR [J]. Xiaodong Cui, Alwan A. IEEE Transactions on Speech and Audio Processing. 2005, Issue 6.
3. Noise speech recognition based on robust features and a model-based noise compensation evaluated on aurora-2 task [J]. Kaisheng Yao, Jingdong Chen, Kuldip K. Paliwal. 電子情報通信学会技術研究報告. 音声. Speech. 2001, Issue 522.
4. Noise Robust Speech Recognition Based on Noise-Adapted HMMs Using Speech Feature Compensation [C]. Chung Yong-Joo. International Conference on Advanced Computer Science Applications and Technologies. 2014.
5. Compensation for Nonlinear Distortion in Noise for Robust Speech Recognition [D]. Harvilla, Mark J. 2014.
6. Speech-in-Noise Test results of compensation claimants for noise induced hearing loss in Korean male workers: Words-in-Noise Test (WIN) and quick-Hearing-in-Noise Test (HINT) [O]. Ji Soo Kim, Joong Keun Kwon, Nam Jeong Kim. 2021.
7. Compensation of SNR and noise type mismatch using an environmental sniffing based speech recognition solution [O]. Yongjoo Chung, John HL Hansen. 2013.
8. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment [R]. Hansen, J. H. 2015.
