PLoS One
G2Basy: A framework to improve the RNN language model and ease overfitting problem

Abstract

Recurrent neural networks are an efficient way of training language models, and various RNN architectures have been proposed to improve performance. However, as network scale increases, the overfitting problem becomes more pressing. In this paper, we propose a framework, G2Basy, to speed up the training process and ease the overfitting problem. Instead of using predefined hyperparameters, we devise a gradient increasing and decreasing technique that changes the training batch size and the input dropout rate simultaneously by a user-defined step size. Together with a pretrained word embedding initialization procedure and the introduction of different optimizers at different learning rates, our framework speeds up training dramatically and improves performance compared with a benchmark model of the same scale. For the word embedding initialization, we propose the concept of "artificial features" to describe the characteristics of the obtained word embeddings. We experiment on two of the most frequently used corpora, the Penn Treebank and WikiText-2 datasets, outperform the benchmark results on both, and show potential for further improvement. Furthermore, our framework yields better results on the larger and more complicated WikiText-2 corpus than on the Penn Treebank. Compared with other state-of-the-art results, we achieve comparable results with network scales hundreds of times smaller and within fewer training epochs.
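The abstract ships no code, so the following minimal Python/PyTorch sketch only illustrates what a coupled batch-size and input-dropout schedule of this kind could look like. All names here (step_schedule, TinyRNNLM) and the concrete step values are assumptions for illustration, not the paper's actual implementation.

    import torch
    import torch.nn as nn

    def step_schedule(batch_size, dropout_p, batch_step=16, dropout_step=0.05,
                      max_batch=256, max_dropout=0.5):
        """Step batch size and input dropout together by a user-defined increment,
        as the abstract describes; the step values here are assumptions."""
        new_batch = min(batch_size + batch_step, max_batch)
        new_dropout = min(dropout_p + dropout_step, max_dropout)
        return new_batch, new_dropout

    class TinyRNNLM(nn.Module):
        """Toy RNN language model with an explicit input-dropout layer."""
        def __init__(self, vocab_size=10000, emb_dim=200, hidden=200, dropout_p=0.1):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.input_dropout = nn.Dropout(dropout_p)  # the "input dropout" the schedule adjusts
            self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, vocab_size)

        def forward(self, tokens):
            x = self.input_dropout(self.embed(tokens))
            out, _ = self.rnn(x)
            return self.head(out)

    batch_size, dropout_p = 32, 0.1
    model = TinyRNNLM(dropout_p=dropout_p)
    for epoch in range(5):
        # ... train one epoch at the current batch size ...
        # (the abstract also mentions switching optimizers at different
        # learning rates; that step is omitted from this sketch)
        batch_size, dropout_p = step_schedule(batch_size, dropout_p)
        model.input_dropout.p = dropout_p  # nn.Dropout reads .p at call time
        print(f"epoch {epoch}: next batch_size={batch_size}, dropout={dropout_p:.2f}")

In a real training loop the new batch size would also require rebuilding the data iterator each epoch, and the stepping trigger (fixed per epoch here) could instead be tied to a signal such as validation perplexity.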
