HYBRID ACOUSTIC MODELS FOR DISTANT AND MULTICHANNEL LARGE VOCABULARY SPEECH RECOGNITION

Abstract

We investigate the application of deep neural network (DNN)-hidden Markov model (HMM) hybrid acoustic models for far-field speech recognition of meetings recorded using microphone arrays. We show that the hybrid models achieve significantly better accuracy than conventional systems based on Gaussian mixture models (GMMs). We observe up to 8% absolute word error rate (WER) reduction from a discriminatively trained GMM baseline when using a single distant microphone, and between 4% and 6% absolute WER reduction when using beamforming on various combinations of array channels. By training the networks on audio from multiple channels, we find the networks can recover a significant part of the accuracy difference between the single distant microphone and beamformed configurations. Finally, we show that the accuracy of a network recognising speech from a single distant microphone can approach that of a multi-microphone setup by training with data from other microphones.

Bibliographic details

Source: Workshop on Automatic Speech Recognition and Understanding, 2013, 6 pages
Authors: Pawel Swietojanski; Arnab Ghoshal; Steve Renals
Original format: PDF
Chinese Library Classification: TN912.3-532
Keywords: Distant Speech Recognition; Deep Neural Networks; Microphone Arrays; Beamforming; Meeting recognition

Similar literature

1. Building DNN acoustic models for large vocabulary speech recognition [J]. Andrew L. Maas, Peng Qi, Ziang Xie. Computer Speech and Language. 2017, January issue
2. A comparative study on selecting acoustic modeling units in deep neural networks based large vocabulary Chinese speech recognition [J]. Li Xiangang, Yang Yuning, Pang Zaihu. Neurocomputing. 2015, December 25 issue
3. Boosting HMM acoustic models in large vocabulary speech recognition [J]. Meyer C, Schramm H. Speech Communication. 2006, issue 5
4. Hybrid acoustic models for distant and multichannel large vocabulary speech recognition [C]. Swietojanski Pawel, Ghoshal Arnab, Renals Steve. IEEE Workshop on Automatic Speech Recognition and Understanding. 2013
5. Robust Acoustic Modeling and Front-End Design for Distant Speech Recognition [D]. Mirsamadi, Seyedmahdad. 2017
6. Using Morphological Data in Language Modeling for Serbian Large Vocabulary Speech Recognition [O]. Edvin Pakoci, Branislav Popović, Darko Pekar. 2019
7. Hybrid acoustic models for distant and multichannel large vocabulary speech recognition [O]. Swietojanski, P., Ghoshal, A., Renals, S. 2013
