Learning feature mapping using deep neural network bottleneck features for distant large vocabulary speech recognition

Abstract

Automatic speech recognition from distant microphones is a difficult task because recordings are affected by reverberation and background noise. First, the application of deep neural network (DNN)/hidden Markov model (HMM) hybrid acoustic models to the distant speech recognition task is investigated using the AMI meeting corpus. This paper then proposes a feature transformation for removing reverberation and background noise artefacts from bottleneck features, using a DNN trained to learn the mapping between distant-talking speech features and close-talking speech bottleneck features. Experimental results on the AMI meeting corpus reveal that the mismatch between close-talking and distant-talking conditions is largely reduced, with about 16% relative improvement over a conventional bottleneck system (trained on close-talking speech). If the feature mapping is applied to close-talking speech, a minor degradation of 4% relative is observed.
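
The feature-mapping idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the network size, the training objective, or the feature dimensions, so the one-hidden-layer network, the squared-error loss, the 3-dimensional vectors, and every function name below are assumptions made for illustration. The synthetic "close" vectors stand in for close-talking bottleneck features, and the "distant" vectors are a distorted, noisy version of them standing in for distant-talking input features.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights (hypothetical helper)."""
    scale = 1.0 / math.sqrt(n_in)
    w = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def forward(x, params):
    """One tanh hidden layer followed by a linear output layer."""
    (w1, b1), (w2, b2) = params
    h = [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    y = [sum(wi * hi for wi, hi in zip(row, h)) + b for row, b in zip(w2, b2)]
    return h, y


def sgd_step(x, target, params, lr):
    """One SGD update on 0.5 * ||prediction - target||^2 (assumed objective)."""
    (w1, b1), (w2, b2) = params
    h, y = forward(x, params)
    err = [yi - ti for yi, ti in zip(y, target)]
    # Back-propagate into the hidden layer before the output weights change.
    dh = [sum(err[k] * w2[k][j] for k in range(len(err))) * (1.0 - h[j] ** 2)
          for j in range(len(h))]
    for k, e in enumerate(err):
        for j, hj in enumerate(h):
            w2[k][j] -= lr * e * hj
        b2[k] -= lr * e
    for j, d in enumerate(dh):
        for i, xi in enumerate(x):
            w1[j][i] -= lr * d * xi
        b1[j] -= lr * d


def mean_loss(pairs, params):
    """Average squared-error loss over all (distant, close) pairs."""
    total = 0.0
    for x, t in pairs:
        _, y = forward(x, params)
        total += 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, t))
    return total / len(pairs)


def make_toy_pairs(n, rng):
    """Synthetic parallel data: clean targets and distorted, noisy inputs."""
    pairs = []
    for _ in range(n):
        close = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        distant = [0.6 * c + 0.2 * close[(i + 1) % 3] + rng.gauss(0.0, 0.05)
                   for i, c in enumerate(close)]
        pairs.append((distant, close))
    return pairs


rng = random.Random(0)
pairs = make_toy_pairs(200, rng)
params = [init_layer(3, 8, rng), init_layer(8, 3, rng)]

loss_before = mean_loss(pairs, params)
for _ in range(50):                      # 50 passes over the toy data
    for x, t in pairs:
        sgd_step(x, t, params, lr=0.05)
loss_after = mean_loss(pairs, params)

print(f"mean loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

In a real system, the inputs would be acoustic features computed from the distant microphone and the targets would be the bottleneck-layer activations obtained from the time-aligned close-talking recording; the mapped features would then feed the recogniser in place of the raw distant features.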

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing, 2015, pp. 4540-4544 (5 pages)
Authors
Himawan Ivan; Motlicek Petr; Imseng David; Potard Blaise; Kim Namhoon; Lee Jaewon;
Original format: PDF
Keywords
AMI corpus; Deep neural network; bottleneck features; distant speech recognition; meetings;

Similar literature

1. Speech dereverberation for enhancement and recognition using dynamic features constrained deep neural networks and feature adaptation [J]. Xiong Xiao, Shengkui Zhao, Duc Hoang Ha Nguyen. EURASIP Journal on Advances in Signal Processing, 2016, Issue 1.
2. Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification [J]. Zhaofeng Zhang, Longbiao Wang, Atsuhiko Kai. EURASIP Journal on Audio, Speech, and Music Processing, 2015, Issue 1.
3. Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature [J]. Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara. EURASIP Journal on Advances in Signal Processing, 2015, Issue 1.
4. Learning feature mapping using deep neural network bottleneck features for distant large vocabulary speech recognition [C]. I. Himawan, P. Motlicek, D. Imseng. IEEE International Conference on Acoustics, Speech and Signal Processing, 2015.
5. Internal and External Feature Engineering Applied to Deep Learning with Convolutional Neural Networks for Monocular Relative Pose Estimation in Visual Odometry and Self-Localization [D]. Parkins, Franz Payton. 2020.
6. An analysis of the influence of deep neural network (DNN) topology in bottleneck feature based language recognition [O]. Alicia Lozano-Diez, Ruben Zazo, Doroteo T. Toledano.
7. Single-channel dereverberation by feature mapping using cascade neural networks for robust distant speaker identification and speech recognition [O]. Aditya Arie Nugraha, Kazumasa Yamamoto, Seiichi Nakagawa. 2014.
