International Conference on Information Science and Control Engineering

Comparison of Various Neural Network Language Models in Speech Recognition



Abstract

In recent years, research on language modeling for speech recognition has increasingly focused on the application of neural networks. However, the performance of neural network language models strongly depends on their architecture. Three competing concepts have been developed: first, feed-forward neural networks, which implement an n-gram approach; second, recurrent neural networks, which can learn context dependencies spanning more than a fixed number of predecessor words; and third, long short-term memory (LSTM) neural networks, which can fully exploit long-range correlations in a telephone conversation corpus. In this paper, we compare count-based models with feed-forward, recurrent, and LSTM neural network language models on conversational telephone speech recognition tasks. Furthermore, we put forward a language model estimation method that incorporates information from history sentences. We evaluate the models in terms of perplexity and word error rate, experimentally validating the strong correlation between the two quantities, which we find to hold regardless of the underlying type of language model. The experimental results show that the LSTM neural network language model performs best in n-best list rescoring. Compared to first-pass decoding, rescoring with ten candidate results yields a relative reduction of 4.3% in average word error rate on conversational telephone speech recognition tasks.
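
The abstract evaluates all models by perplexity and word error rate but does not restate the definitions; for reference, the standard formulations of the two quantities are sketched below (these are textbook definitions, not taken from the paper itself).

```latex
% Perplexity of a language model on a test sequence w_1, ..., w_N:
\[
\mathrm{PPL} = \exp\!\left( -\frac{1}{N} \sum_{i=1}^{N} \ln P(w_i \mid w_1, \dots, w_{i-1}) \right)
\]
% Word error rate, with S substitutions, D deletions, and I insertions
% against a reference transcription of N_ref words:
\[
\mathrm{WER} = \frac{S + D + I}{N_{\mathrm{ref}}}
\]
```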
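The reported 4.3% relative reduction in word error rate comes from rescoring ten first-pass candidates with the LSTM language model. Below is a minimal sketch of such n-best rescoring, assuming a generic `lm_logprob` scoring function and a tunable interpolation weight; the names and interface are illustrative assumptions, not the authors' implementation.

```python
import math
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    words: List[str]       # candidate transcription from first-pass decoding
    acoustic_score: float  # log-domain acoustic score from the decoder


def rescore_nbest(
    hypotheses: List[Hypothesis],
    lm_logprob: Callable[[List[str]], float],  # e.g. an LSTM LM scorer (assumed interface)
    lm_weight: float = 0.7,                    # interpolation weight, tuned on held-out data
) -> Hypothesis:
    """Return the hypothesis with the best combined acoustic + LM score."""
    return max(
        hypotheses,
        key=lambda h: h.acoustic_score + lm_weight * lm_logprob(h.words),
    )


# Toy usage: a unigram "LM" stands in for the neural language model.
if __name__ == "__main__":
    toy_probs = {"hello": 0.5, "hallo": 0.1, "world": 0.4}

    def toy_lm(words: List[str]) -> float:
        return sum(math.log(toy_probs.get(w, 1e-6)) for w in words)

    nbest = [
        Hypothesis(["hallo", "world"], acoustic_score=-10.0),
        Hypothesis(["hello", "world"], acoustic_score=-10.5),
    ]
    print(" ".join(rescore_nbest(nbest, toy_lm).words))  # -> hello world
```

The count-based, feed-forward, recurrent, and LSTM models compared in the paper would all plug into the same rescoring step through the scoring function; only the way the LM probability is computed differs.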
