Conference: International Conference on Cyber Security Cryptography and Machine Learning

Highway State Gating for Recurrent Highway Networks: Improving Information Flow Through Time


Abstract

Recurrent Neural Networks (RNNs) play a major role in sequential learning and have outperformed traditional algorithms on many benchmarks. Training deep RNNs remains a challenge, and most state-of-the-art models are structured with a transition depth of 2-4 layers. Recurrent Highway Networks (RHNs) were introduced to tackle this issue and have achieved state-of-the-art performance on a few benchmarks using a depth of 10 layers. However, the performance of this architecture hits a bottleneck and ceases to improve when more layers are added. In this work, we analyze the causes of this bottleneck and postulate that the main source is the way information flows through time. We introduce a novel and simple variation of the RHN cell, called Highway State Gating (HSG), which allows adding more layers while continuing to improve performance. By using a gating mechanism for the state, we allow the net to 'choose' whether to pass information directly through time or to gate it. This mechanism also allows the gradient to back-propagate directly through time and therefore results in slightly faster convergence. We use the Penn Treebank (PTB) dataset as a platform for an empirical proof of concept. Empirical results show that Highway State Gating improves performance at all depths, and that the improvement grows as depth increases.
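
The abstract does not give the equations of the gate, so the following is only a minimal sketch of one plausible form, not the authors' implementation. It assumes a sigmoid gate computed from the previous state and the output of the deep RHN cell, and a per-element blend of the two. The names `highway_state_gate`, `w_prev`, `w_out` and `bias` are hypothetical and chosen for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def highway_state_gate(prev_state, cell_output, w_prev, w_out, bias):
    """Blend the previous state with the cell output, element by element.

    Assumed form (not taken from the abstract):
        gate      = sigmoid(w_prev @ prev_state + w_out @ cell_output + bias)
        new_state = gate * prev_state + (1 - gate) * cell_output
    A gate near 1 carries prev_state through time unchanged;
    a gate near 0 takes the freshly computed cell_output.
    """
    from_prev = matvec(w_prev, prev_state)
    from_out = matvec(w_out, cell_output)
    gate = [sigmoid(a + b + c) for a, b, c in zip(from_prev, from_out, bias)]
    new_state = [
        g * s_prev + (1.0 - g) * s_new
        for g, s_prev, s_new in zip(gate, prev_state, cell_output)
    ]
    return new_state, gate


# Usage: with zero weights the gate depends on the bias alone.
zeros = [[0.0, 0.0], [0.0, 0.0]]
carried, _ = highway_state_gate([1.0, -1.0], [0.0, 0.5], zeros, zeros, [20.0, -20.0])
mixed, mixed_gate = highway_state_gate([1.0, -1.0], [0.0, 0.5], zeros, zeros, [0.0, 0.0])
```

In this sketch, a large positive bias on the first unit keeps its previous value, while a large negative bias on the second unit replaces it with the cell output. When the gate saturates near 1, the new state is almost a copy of the previous one, which is the kind of direct path through time that the abstract credits for the easier gradient flow.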
