IEEE International Conference on Acoustics, Speech and Signal Processing

Knowledge Distillation for Recurrent Neural Network Language Modeling with Trust Regularization

Abstract

Recurrent Neural Networks (RNNs) have dominated language modeling because of their superior performance over traditional N-gram based models. In many applications, a large Recurrent Neural Network language model (RNNLM) or an ensemble of several RNNLMs is used. These models have large memory footprints and require heavy computation. In this paper, we examine the effect of applying knowledge distillation to reduce the model size of RNNLMs. In addition, we propose a trust regularization method to improve knowledge distillation training for RNNLMs. Using knowledge distillation with trust regularization, we reduce the parameter size to a third of that of the previously published best model while maintaining the state-of-the-art perplexity result on the Penn Treebank data. In a speech recognition N-best rescoring task, we reduce the RNNLM model size to 18.5% of that of the baseline system, with no degradation in word error rate (WER) performance on the Wall Street Journal data set.
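The abstract describes training a smaller student RNNLM against a larger teacher (or ensemble) via knowledge distillation. For orientation, below is a minimal PyTorch sketch of a standard distillation loss for next-word prediction: temperature-softened KL divergence to the teacher combined with hard-label cross-entropy. This is the generic Hinton-style formulation, not the paper's implementation; the trust regularization term is not reproduced here, and the function name `distillation_loss` and the `temperature` / `alpha` hyperparameters are illustrative assumptions.

```python
# Minimal sketch of a knowledge-distillation loss for an RNN language model.
# Standard formulation only; the paper's trust regularization is NOT included,
# and all names/defaults below are illustrative assumptions.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=2.0, alpha=0.5):
    """Combine hard-label cross-entropy with soft-label KL to the teacher.

    student_logits, teacher_logits: (batch * seq_len, vocab_size)
    targets: (batch * seq_len,) ground-truth next-word indices
    """
    # Hard-label loss on the ground-truth next words.
    ce = F.cross_entropy(student_logits, targets)

    # Soft-label loss: KL between temperature-softened distributions.
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_soft_student = F.log_softmax(student_logits / t, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * (t * t)  # rescale gradients by T^2

    # Interpolate the two terms; alpha weights the teacher's soft targets.
    return alpha * kd + (1.0 - alpha) * ce
```

In training, the teacher's logits would be computed with gradients disabled and only the student's parameters updated with this loss.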
