PLoS One
G2Basy: A framework to improve the RNN language model and ease overfitting problem

Abstract

Recurrent neural networks are an efficient way of training language models, and various RNN architectures have been proposed to improve performance. However, as network scale increases, the overfitting problem becomes more pressing. In this paper, we propose a framework, G2Basy, to speed up the training process and ease the overfitting problem. Instead of using predefined hyperparameters, we devise a gradient increasing and decreasing technique that changes the training batch size and the input dropout rate simultaneously by a user-defined step size. Together with a pretrained word embedding initialization procedure and the introduction of different optimizers at different learning rates, our framework speeds up training dramatically and improves performance compared with a benchmark model of the same scale. For the word embedding initialization, we propose the concept of "artificial features" to describe the characteristics of the obtained word embeddings. We experiment on two of the most frequently used corpora, the Penn Treebank and WikiText-2 datasets, outperform the benchmark results on both, and show potential for further improvement. Furthermore, our framework yields better results on the larger and more complicated WikiText-2 corpus than on the Penn Treebank. Compared with other state-of-the-art results, we achieve comparable results with network scales hundreds of times smaller and within fewer training epochs.
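The abstract ships no code, so the following minimal Python/PyTorch sketch only illustrates what a coupled batch-size and input-dropout schedule of this kind could look like. All names here (step_schedule, TinyRNNLM) and the concrete step values are assumptions for illustration, not the paper's actual implementation.

    import torch
    import torch.nn as nn

    def step_schedule(batch_size, dropout_p, batch_step=16, dropout_step=0.05,
                      max_batch=256, max_dropout=0.5):
        """Step batch size and input dropout together by a user-defined increment,
        as the abstract describes; the step values here are assumptions."""
        new_batch = min(batch_size + batch_step, max_batch)
        new_dropout = min(dropout_p + dropout_step, max_dropout)
        return new_batch, new_dropout

    class TinyRNNLM(nn.Module):
        """Toy RNN language model with an explicit input-dropout layer."""
        def __init__(self, vocab_size=10000, emb_dim=200, hidden=200, dropout_p=0.1):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.input_dropout = nn.Dropout(dropout_p)  # the "input dropout" the schedule adjusts
            self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, vocab_size)

        def forward(self, tokens):
            x = self.input_dropout(self.embed(tokens))
            out, _ = self.rnn(x)
            return self.head(out)

    batch_size, dropout_p = 32, 0.1
    model = TinyRNNLM(dropout_p=dropout_p)
    for epoch in range(5):
        # ... train one epoch at the current batch size ...
        # (the abstract also mentions switching optimizers at different
        # learning rates; that step is omitted from this sketch)
        batch_size, dropout_p = step_schedule(batch_size, dropout_p)
        model.input_dropout.p = dropout_p  # nn.Dropout reads .p at call time
        print(f"epoch {epoch}: next batch_size={batch_size}, dropout={dropout_p:.2f}")

In a real training loop the new batch size would also require rebuilding the data iterator each epoch, and the stepping trigger (fixed per epoch here) could instead be tied to a signal such as validation perplexity.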
