ATR Parallel Decoding Based Speech Recognition System Robust to Noise and Speaking Styles

Shigeki MATSUDA; Takatoshi JITSUHIRO; Konstantin MARKOV; Satoshi NAKAMURA


Abstract

In this paper, we describe a parallel decoding-based ASR system developed at ATR that is robust to noise type, SNR, and speaking style. Speech affected by such varied factors is difficult to recognize, especially when an ASR system contains only a single acoustic model. One solution is to employ multiple acoustic models, one for each condition. Even though the robustness of each individual acoustic model is limited, the ASR system as a whole can handle various conditions appropriately. Our system has two recognition sub-systems that use different features, such as MFCC and Differential MFCC (DMFCC). Each sub-system has several acoustic models that depend on SNR, speaker gender, and speaking style, and during recognition each acoustic model is adapted by fast noise adaptation. From each sub-system, one hypothesis is selected based on posterior probability. The final recognition result is obtained by combining the best hypotheses from the two sub-systems. On the AURORA-2J task, which is widely used for evaluating noise robustness, our system achieved higher recognition performance than a system containing only a single model. Our system was also tested on normal and hyper-articulated speech contaminated by several background noises, and it exhibited high robustness to noise and speaking styles.
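The selection and combination steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the two best hypotheses are combined, so the rule used here (keep whichever sub-system winner has the higher posterior) is an assumption made only for illustration, not the authors' method. The `Hypothesis` class, the model labels, and all numeric values are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Hypothesis:
    model: str        # hypothetical acoustic-model label
    text: str         # recognized word sequence
    posterior: float  # posterior probability assigned to this hypothesis


def select_best(hypotheses: List[Hypothesis]) -> Hypothesis:
    """Pick the hypothesis with the highest posterior within one sub-system."""
    if not hypotheses:
        raise ValueError("a sub-system must produce at least one hypothesis")
    return max(hypotheses, key=lambda h: h.posterior)


def combine(subsystems: Dict[str, List[Hypothesis]]) -> Hypothesis:
    """Assumed combination rule: keep the best of the per-sub-system winners."""
    winners = [select_best(hyps) for hyps in subsystems.values()]
    return select_best(winners)


# Toy values invented for illustration only; they are not results from the paper.
example = {
    "MFCC": [
        Hypothesis("mfcc/clean/male", "one two three", 0.62),
        Hypothesis("mfcc/noisy/male", "one two tree", 0.71),
    ],
    "DMFCC": [
        Hypothesis("dmfcc/clean/male", "one two three", 0.83),
        Hypothesis("dmfcc/noisy/male", "one to three", 0.40),
    ],
}
result = combine(example)
print(result.model, "->", result.text)
```

In a real system each list would be filled by decoders running in parallel, one per acoustic model, and the posteriors would have to be comparable across models for this kind of selection to be meaningful.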


Bibliographic record

Source: IEICE Transactions on Information and Systems, 2006, No. 3, pp. 989-997 (9 pages)
Authors: Shigeki MATSUDA; Takatoshi JITSUHIRO; Konstantin MARKOV; Satoshi NAKAMURA
Affiliation: ATR Spoken Language Communication Research Laboratories, Kyoto-fu, 619-0288 Japan
Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language of text: English
Chinese Library Classification: Radio electronics, telecommunications technology
Keywords: automatic speech recognition; parallel decoding; multiple acoustic models; fast noise adaptation; speaking style; hyper-articulated speech
Date added to database: 2022-08-18 00:28:59

