Conference: Annual Conference of the International Speech Communication Association

Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition


Abstract

The use of Deep Belief Networks (DBN) to pretrain neural networks has recently led to a resurgence of Artificial Neural Network - Hidden Markov Model (ANN/HMM) hybrid systems for Automatic Speech Recognition (ASR). In this paper we report results of a DBN-pretrained context-dependent ANN/HMM system trained on two datasets that are much larger than any previously reported with DBN-pretrained ANN/HMM systems: 5870 hours of Voice Search and 1400 hours of YouTube data. On the first dataset, the pretrained ANN/HMM system outperforms the best Gaussian Mixture Model - Hidden Markov Model (GMM/HMM) baseline, built with a much larger dataset, by 3.7% absolute WER, while on the second dataset it outperforms the GMM/HMM baseline by 4.7% absolute. Maximum Mutual Information (MMI) fine-tuning and model combination using Segmental Conditional Random Fields (SCARF) give additional gains of 0.1% and 0.4% on the first dataset and 0.5% and 0.9% absolute on the second dataset.
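The DBN pretraining the abstract refers to is greedy layer-wise training: each layer is fit as a Restricted Boltzmann Machine (RBM) on the hidden activations of the layer below, and the learned weights then initialize the ANN before supervised (and later MMI) fine-tuning. A minimal illustrative sketch of that procedure, using NumPy and one-step contrastive divergence (CD-1), is shown below; the layer sizes, learning rate, and toy data are assumptions for illustration, not the paper's configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """One Restricted Boltzmann Machine layer, trained with CD-1."""
    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible biases
        self.b_h = np.zeros(n_hidden)    # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.05):
        # Positive phase: hidden activations driven by the data.
        h0_prob = self.hidden_probs(v0)
        h0 = (self.rng.random(h0_prob.shape) < h0_prob).astype(float)
        # Negative phase: one Gibbs step back to a "reconstruction".
        v1_prob = self.visible_probs(h0)
        h1_prob = self.hidden_probs(v1_prob)
        # Approximate gradient: data statistics minus model statistics.
        batch = v0.shape[0]
        self.W += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / batch
        self.b_v += lr * (v0 - v1_prob).mean(axis=0)
        self.b_h += lr * (h0_prob - h1_prob).mean(axis=0)

def pretrain_stack(data, layer_sizes, epochs=5, rng=None):
    """Greedy layer-wise pretraining: train each RBM on the hidden
    activations of the layer below, then feed activations upward."""
    rng = rng or np.random.default_rng(0)
    rbms, x = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(x)
        rbms.append(rbm)
        x = rbm.hidden_probs(x)
    return rbms

# Toy usage on random binary "features" (hypothetical stand-in for
# acoustic feature vectors).
rng = np.random.default_rng(0)
data = (rng.random((64, 20)) < 0.5).astype(float)
stack = pretrain_stack(data, [16, 8], rng=rng)
print([r.W.shape for r in stack])
```

In the hybrid system, the pretrained weights of such a stack would initialize the hidden layers of the ANN, with a softmax output layer over context-dependent HMM states added on top before supervised training.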
