IEEE International Conference on Acoustics, Speech and Signal Processing

Knowledge Distillation for Recurrent Neural Network Language Modeling with Trust Regularization

Abstract

Recurrent Neural Networks (RNNs) have dominated language modeling because of their superior performance over traditional N-gram based models. In many applications, a large Recurrent Neural Network language model (RNNLM) or an ensemble of several RNNLMs is used. These models have large memory footprints and require heavy computation. In this paper, we examine the effect of applying knowledge distillation to reduce the model size of RNNLMs. In addition, we propose a trust regularization method to improve knowledge distillation training for RNNLMs. Using knowledge distillation with trust regularization, we reduce the parameter size to a third of that of the previously published best model while maintaining the state-of-the-art perplexity result on the Penn Treebank data. In a speech recognition N-best rescoring task, we reduce the RNNLM model size to 18.5% of that of the baseline system, with no degradation in word error rate (WER) performance on the Wall Street Journal data set.
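The abstract describes training a smaller student RNNLM against a larger teacher (or ensemble) via knowledge distillation. For orientation, below is a minimal PyTorch sketch of a standard distillation loss for next-word prediction: temperature-softened KL divergence to the teacher combined with hard-label cross-entropy. This is the generic Hinton-style formulation, not the paper's implementation; the trust regularization term is not reproduced here, and the function name `distillation_loss` and the `temperature` / `alpha` hyperparameters are illustrative assumptions.

```python
# Minimal sketch of a knowledge-distillation loss for an RNN language model.
# Standard formulation only; the paper's trust regularization is NOT included,
# and all names/defaults below are illustrative assumptions.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=2.0, alpha=0.5):
    """Combine hard-label cross-entropy with soft-label KL to the teacher.

    student_logits, teacher_logits: (batch * seq_len, vocab_size)
    targets: (batch * seq_len,) ground-truth next-word indices
    """
    # Hard-label loss on the ground-truth next words.
    ce = F.cross_entropy(student_logits, targets)

    # Soft-label loss: KL between temperature-softened distributions.
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_soft_student = F.log_softmax(student_logits / t, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * (t * t)  # rescale gradients by T^2

    # Interpolate the two terms; alpha weights the teacher's soft targets.
    return alpha * kd + (1.0 - alpha) * ce
```

In training, the teacher's logits would be computed with gradients disabled and only the student's parameters updated with this loss.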
