Effectiveness of dereverberation, feature transformation, discriminative training methods, and system combination approach for various reverberant environments

Yuuki Tachioka; Tomohiro Narita; Shinji Watanabe


Abstract

The recently released REverberant Voice Enhancement and Recognition Benchmark (REVERB) challenge includes a reverberant automatic speech recognition (ASR) task. This paper describes our proposed system, which is based on multichannel speech enhancement preprocessing and state-of-the-art ASR techniques. For preprocessing, we propose a single-channel dereverberation method with reverberation time estimation, which is combined with multichannel beamforming that enhances the direct sound relative to the reflected sound. This paper also focuses on state-of-the-art ASR techniques such as discriminative training of acoustic models, including the Gaussian mixture model, subspace Gaussian mixture model, and deep neural networks, as well as various feature transformation techniques. Although the REVERB challenge requires handling various acoustic environments, a single ASR system tends to be overly tuned to a specific environment, which degrades performance in mismatched environments. To overcome this mismatch problem, we use a system combination approach that combines multiple ASR systems with different features and different model types, because combining systems that have different error patterns is beneficial. In particular, we use our discriminative training technique for system combination, which achieves better generalization by making the systems complementary through modified discriminative criteria. Experiments show the effectiveness of these approaches, reaching word error rates of 6.76% and 18.60% on the REVERB simulated and real test sets, respectively. These are relative improvements of 68.8% and 61.5% over the baseline.
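The relative improvements quoted in the abstract follow the usual definition of relative word error rate (WER) reduction, (baseline − system) / baseline. The sketch below is illustrative only: the function names are hypothetical, and the baseline values it prints are back-computed from the rounded figures in the abstract, so they are approximations rather than numbers reported on this page.

```python
def relative_improvement(baseline_wer: float, system_wer: float) -> float:
    """Relative WER reduction of a system over a baseline, in percent."""
    return 100.0 * (baseline_wer - system_wer) / baseline_wer


def implied_baseline(system_wer: float, relative_gain_pct: float) -> float:
    """Baseline WER implied by a system WER and its relative improvement.

    Inverts relative_improvement(); the result is only approximate when the
    inputs are rounded, as the abstract's figures are.
    """
    return system_wer / (1.0 - relative_gain_pct / 100.0)


# Figures taken from the abstract (WER in percent).
sim_wer, sim_gain = 6.76, 68.8      # REVERB simulated test set
real_wer, real_gain = 18.60, 61.5   # REVERB real test set

# Back-computed (assumed, not reported here) baseline WERs.
sim_baseline = implied_baseline(sim_wer, sim_gain)
real_baseline = implied_baseline(real_wer, real_gain)

print(f"implied simulated-set baseline WER: {sim_baseline:.2f}%")
print(f"implied real-set baseline WER:      {real_baseline:.2f}%")
```

Under these assumptions the implied baselines come out at roughly 22% and 48% WER, which gives a sense of how much harder the real recordings are than the simulated ones.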

Bibliographic record

Source
EURASIP Journal on Advances in Signal Processing, 2015, Issue 1
Authors
Yuuki Tachioka; Tomohiro Narita; Shinji Watanabe
Original format: PDF
Chinese Library Classification: Communications
Keywords
Reverberant speech recognition; Dereverberation; Discriminative training; Feature transformation; System combination; REVERB challenge
Date added to database: 2022-08-18 10:18:31

Similar literature

1. Front-end technologies for robust ASR in reverberant environments—spectral enhancement-based dereverberation and auditory modulation filterbank features [J]. Feifei Xiong, Bernd T. Meyer, Niko Moritz. EURASIP Journal on Advances in Signal Processing. 2015, Issue 1
2. A Blind Channel Identification-Based Two-Stage Approach to Separation and Dereverberation of Speech Signals in a Reverberant Environment [J]. Huang Y., Benesty J., Chen J. IEEE Transactions on Speech and Audio Processing. 2005, Issue 5
3. Evaluation of Combinational Use of Discriminant Analysis-Based Acoustic Feature Transformation and Discriminative Training [J]. Makoto SAKAI, Norihide KITAOKA, Yuya HATTORI. IEICE Transactions on Information and Systems. 2010, Issue 2
4. Effectiveness of discriminative training and feature transformation for reverberated and noisy speech [C]. Tachioka Yuuki, Watanabe Shinji, Hershey John R. IEEE International Conference on Acoustics, Speech and Signal Processing. 2013
5. E-learning effectiveness: An examination of online training methods for training end-users of new technology systems [D]. Esch, Thomas J. 2003
6. A cross-disciplinary mixed-method approach to understand how food retail environment transformations influence food choice and intake among the urban poor: Experiences from Vietnam [O]. Sigrid C.O. Wertheim-Heck, Jessica E. Raneri
7. Effectiveness of dereverberation, feature transformation, discriminative training methods, and system combination approach for various reverberant environments [O]. Yuuki Tachioka, Tomohiro Narita, Shinji Watanabe. 2015
