
Improved Language Modeling by Decoding the Past



Abstract

Highly regularized LSTMs achieve impressive results on several benchmark datasets in language modeling. We propose a new regularization method based on decoding the last token in the context using the predicted distribution of the next token. This biases the model towards retaining more contextual information, in turn improving its ability to predict the next token. With negligible overhead in the number of parameters and training time, our Past Decode Regularization (PDR) method improves perplexity on the Penn Treebank dataset by up to 1.8 points and by up to 2.3 points on the WikiText-2 dataset, over strong regularized baselines using a single softmax. With a mixture-of-softmax model, we show gains of up to 1.0 perplexity points on these datasets. In addition, our method achieves 1.169 bits-per-character on the Penn Treebank Character dataset for character-level language modeling. Each of these results constitutes an improvement over models without PDR in their respective settings.
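To make the idea in the abstract concrete, the sketch below shows one way an auxiliary "decode the past" loss of this flavor can be wired into a PyTorch language model. It is an illustration under stated assumptions, not the paper's exact formulation: the class name `PastDecodeRegularizer`, the reuse of the input embedding matrix to form an expected embedding, the single linear decoder, and the loss weighting are all assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PastDecodeRegularizer(nn.Module):
    """Auxiliary loss that tries to decode the last context token from the
    predicted next-token distribution (illustrative sketch only)."""

    def __init__(self, embedding: nn.Embedding):
        super().__init__()
        self.embedding = embedding  # shared input embedding, shape (V, E) -- an assumption
        self.decoder = nn.Linear(embedding.embedding_dim, embedding.num_embeddings)

    def forward(self, next_token_logits, last_context_tokens):
        # next_token_logits:   (batch, seq, V) -- model's prediction for token t+1
        # last_context_tokens: (batch, seq)    -- token at position t, already observed
        probs = F.softmax(next_token_logits, dim=-1)   # predicted next-token distribution
        expected_emb = probs @ self.embedding.weight   # expected embedding, (batch, seq, E)
        past_logits = self.decoder(expected_emb)       # attempt to recover the past token
        return F.cross_entropy(
            past_logits.reshape(-1, past_logits.size(-1)),
            last_context_tokens.reshape(-1),
        )


# Usage sketch: add the auxiliary term to the standard LM loss with a tuned weight.
# lm_loss = F.cross_entropy(next_token_logits.reshape(-1, vocab_size), targets.reshape(-1))
# total_loss = lm_loss + pdr_weight * pdr(next_token_logits, last_context_tokens)
```

The intuition matches the abstract: if the next-token distribution retains enough information to reconstruct the token just seen, the model is biased toward carrying context forward. The scalar `pdr_weight` here is a hypothetical hyperparameter, which one would tune on validation perplexity.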
