
Improved Language Modeling by Decoding the Past



Abstract

Highly regularized LSTMs achieve impressive results on several benchmark datasets in language modeling. We propose a new regularization method based on decoding the last token in the context using the predicted distribution of the next token. This biases the model towards retaining more contextual information, in turn improving its ability to predict the next token. With negligible overhead in the number of parameters and training time, our Past Decode Regularization (PDR) method improves perplexity on the Penn Treebank dataset by up to 1.8 points and by up to 2.3 points on the WikiText-2 dataset, over strong regularized baselines using a single softmax. With a mixture-of-softmax model, we show gains of up to 1.0 perplexity points on these datasets. In addition, our method achieves 1.169 bits-per-character on the Penn Treebank Character dataset for character-level language modeling. Each of these results constitutes an improvement over models without PDR in their respective settings.
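To make the idea in the abstract concrete, the sketch below shows one way an auxiliary "decode the past" loss of this flavor can be wired into a PyTorch language model. It is an illustration under stated assumptions, not the paper's exact formulation: the class name `PastDecodeRegularizer`, the reuse of the input embedding matrix to form an expected embedding, the single linear decoder, and the loss weighting are all assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PastDecodeRegularizer(nn.Module):
    """Auxiliary loss that tries to decode the last context token from the
    predicted next-token distribution (illustrative sketch only)."""

    def __init__(self, embedding: nn.Embedding):
        super().__init__()
        self.embedding = embedding  # shared input embedding, shape (V, E) -- an assumption
        self.decoder = nn.Linear(embedding.embedding_dim, embedding.num_embeddings)

    def forward(self, next_token_logits, last_context_tokens):
        # next_token_logits:   (batch, seq, V) -- model's prediction for token t+1
        # last_context_tokens: (batch, seq)    -- token at position t, already observed
        probs = F.softmax(next_token_logits, dim=-1)   # predicted next-token distribution
        expected_emb = probs @ self.embedding.weight   # expected embedding, (batch, seq, E)
        past_logits = self.decoder(expected_emb)       # attempt to recover the past token
        return F.cross_entropy(
            past_logits.reshape(-1, past_logits.size(-1)),
            last_context_tokens.reshape(-1),
        )


# Usage sketch: add the auxiliary term to the standard LM loss with a tuned weight.
# lm_loss = F.cross_entropy(next_token_logits.reshape(-1, vocab_size), targets.reshape(-1))
# total_loss = lm_loss + pdr_weight * pdr(next_token_logits, last_context_tokens)
```

The intuition matches the abstract: if the next-token distribution retains enough information to reconstruct the token just seen, the model is biased toward carrying context forward. The scalar `pdr_weight` here is a hypothetical hyperparameter, which one would tune on validation perplexity.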
