Speech enhancement for robust automatic speech recognition: Evaluation using a baseline system and instrumental measures

Moore Alastair H.; Peso Parada Pablo; Naylor Patrick A.

Abstract

Automatic speech recognition in everyday environments must be robust to significant levels of reverberation and noise. One strategy to achieve such robustness is multi-microphone speech enhancement. In this study, we present results of an evaluation of different speech enhancement pipelines using a state-of-the-art ASR system for a wide range of reverberation and noise conditions. The evaluation exploits the recently released ACE Challenge database which includes measured multichannel acoustic impulse responses from 7 different rooms with reverberation times ranging from 0.33 to 1.34 s. The reverberant speech is mixed with ambient, fan and babble noise recordings made with the same microphone setups in each of the rooms. In the first experiment, performance of the ASR without speech processing is evaluated. Results clearly indicate the deleterious effect of both noise and reverberation. In the second experiment, different speech enhancement pipelines are evaluated with relative word error rate reductions of up to 82%. Finally, the ability of selected instrumental metrics to predict ASR performance improvement is assessed. The best performing metric, Short-Time Objective Intelligibility Measure, is shown to have a Pearson correlation coefficient of 0.79, suggesting that it is a useful predictor of algorithm performance in these tests.
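The two summary statistics quoted in the abstract, relative word error rate (WER) reduction and the Pearson correlation coefficient, can be illustrated with a minimal sketch. The function names and every number below are hypothetical and are not taken from the paper; the sketch also assumes the relative reduction is measured against the unprocessed baseline WER, which the abstract does not state explicitly.

```python
import math


def relative_wer_reduction(wer_baseline, wer_enhanced):
    """Relative WER reduction in percent, measured against the baseline WER.

    Assumption: the baseline is the ASR result without speech enhancement.
    """
    if wer_baseline <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (wer_baseline - wer_enhanced) / wer_baseline


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical WERs (percent) chosen only to show the arithmetic:
    # going from 50% to 9% is an 82% relative reduction.
    print(relative_wer_reduction(50.0, 9.0))

    # Hypothetical per-condition values: change in an instrumental metric
    # versus change in WER (percentage points) for the same conditions.
    metric_gain = [0.02, 0.05, 0.08, 0.11, 0.15]
    wer_gain = [1.0, 4.5, 3.0, 9.0, 12.5]
    print(pearson(metric_gain, wer_gain))
```

A coefficient close to 1 would indicate that gains in the instrumental metric track gains in recognition accuracy; the abstract reports 0.79 for the best-performing metric in its own tests.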

Bibliographic details

Source
Computer Speech and Language | 2017, No. 11 | pp. 574-584 | 11 pages
Authors
Moore Alastair H.; Peso Parada Pablo; Naylor Patrick A.
Author affiliations

Department of Electrical and Electronic Engineering, Imperial College London, Exhibition Road, London, United Kingdom;

Department of Electrical and Electronic Engineering, Imperial College London, Exhibition Road, London, United Kingdom, and Cirrus Logic, Marble Arch House, 66 Seymour St., 1st Floor, London, United Kingdom;

Department of Electrical and Electronic Engineering, Imperial College London, Exhibition Road, London, United Kingdom;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
Automatic speech recognition; Beamforming; Dereverberation; Microphone array signal processing; Realistic environments; Speech enhancement;

Similar literature

1. Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition [J]. Shimada Kazuki, Bando Yoshiaki, Mimura Masato. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2019, No. 5
2. Multi-Channel Speech Enhancement and Amplitude Modulation Analysis for Noise Robust Automatic Speech Recognition [J]. Moritz Niko, Adiloǧlu Kamil, Anemüller Jörn. Computer speech and language. 2017
3. Comparative evaluation of speech enhancement methods for robust automatic speech recognition [C]. 4th International Conference on Signal Processing and Communication Systems. 2010
4. Advances in Audiovisual Speech Processing for Robust Voice Activity Detection and Automatic Speech Recognition [D]. Tao, Fei. 2018
5. Towards spoken clinical-question answering: evaluating and adapting automatic speech-recognition systems for spoken clinical questions [O]. Feifan Liu, Gokhan Tur, Dilek Hakkani-Tür. 2011
6. Speech enhancement for robust automatic speech recognition: Evaluation using a baseline system and instrumental measures [O]. Moore, AH, Peso, P, Naylor, PA. 2016
