Joint acoustic factor learning for robust deep neural network based automatic speech recognition

Abstract

Deep neural networks (DNNs) for acoustic modeling have been shown to provide impressive results in many state-of-the-art automatic speech recognition (ASR) applications. However, DNN performance degrades when training and testing conditions are mismatched, so adaptation is necessary. In this paper, we explore the use of discriminative auxiliary input features, obtained using joint acoustic factor learning, for DNN adaptation. These features are derived from a bottleneck (BN) layer of a DNN and are referred to as BN vectors. To derive these BN vectors, we explore two types of joint acoustic factor learning, which capture speaker information together with auxiliary information such as the noise, phone and articulatory information of speech. We show that these BN vectors can be used for adaptation and thereby improve the performance of an ASR system. We also show that performance can be further improved by appending these BN vectors to conventional i-vectors. Experiments are performed on the Aurora-4, REVERB challenge and AMI databases.
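
As a rough illustration of the auxiliary-input idea described in the abstract, the sketch below appends an i-vector and a BN vector to every acoustic frame before the frames would be fed to an acoustic model. It is only a minimal sketch under stated assumptions: the function name `augment_frames`, the toy dimensionalities, and the choice to use one fixed vector per utterance are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def augment_frames(frames: Sequence[Sequence[float]],
                   i_vector: Sequence[float],
                   bn_vector: Sequence[float]) -> List[Vector]:
    """Append auxiliary vectors to every acoustic frame.

    Assumption (not from the paper): the i-vector and the BN vector are
    fixed for the whole utterance, so the same values are repeated on
    each frame.
    """
    # Concatenate the two auxiliary vectors once, then reuse them per frame.
    aux = list(i_vector) + list(bn_vector)
    return [list(frame) + aux for frame in frames]


# Toy data: 3 frames of 4-dim features, a 2-dim i-vector, a 3-dim BN vector.
frames = [[0.1, 0.2, 0.3, 0.4],
          [0.5, 0.6, 0.7, 0.8],
          [0.9, 1.0, 1.1, 1.2]]
i_vector = [1.0, -1.0]
bn_vector = [0.25, 0.5, 0.75]

augmented = augment_frames(frames, i_vector, bn_vector)
print(len(augmented), len(augmented[0]))  # 3 9
```

Each augmented frame carries the original 4 feature values followed by the 5 auxiliary values, so the network input grows from 4 to 9 dimensions in this toy setting.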

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing, 2016, pp. 5025-5029 (5 pages)
Authors
Souvik Kundu; Gautam Mantena; Yanmin Qian; Tian Tan; Marc Delcroix; Khe Chai Sim;
Original format: PDF
Keywords
adaptation; bottleneck vectors; deep neural networks; joint factor learning; robust speech recognition;

Similar literature

1. A Speaker-Dependent Approach to Single-Channel Joint Speech Separation and Acoustic Modeling Based on Deep Neural Networks for Robust Recognition of Multi-Talker Speech [J] . Yan-Hui Tu, Jun Du, Chin-Hui Lee, Journal of signal processing systems for signal, image, and video technology . 2018, No. 7

2. Ensemble of jointly trained deep neural network-based acoustic models for reverberant speech recognition [J] . Lee Moa, Lee Jeehye, Chang Joon-Hyuk, Digital Signal Processing . 2019

3. Acoustic landmarks contain more information about the phone string than other frames for automatic speech recognition with deep neural network acoustic model [J] . He Di, Lim Boon Pang, Yang Xuesong, The Journal of the Acoustical Society of America . 2018, No. 6aPta1

4. Joint acoustic factor learning for robust deep neural network based automatic speech recognition [C] . Souvik Kundu, Gautam Mantena, Yanmin Qian, IEEE International Conference on Acoustics, Speech and Signal Processing . 2016

5. Multi-task learning deep neural networks for automatic speech recognition [D] . Chen, Dongpeng. 2015

6. Improving Robustness of Deep Neural Network Acoustic Models via Speech Separation and Joint Adaptive Training [O] . Arun Narayanan, DeLiang Wang

7. EXEMPLAR-BASED SPEECH ENHANCEMENT FOR DEEP NEURAL NETWORK BASED AUTOMATIC SPEECH RECOGNITION [O] . Deepak Baby, Jort F. Gemmeke, Tuomas Virtanen, 2016

