Workshop on Representation Learning for NLP

Quantifying the vanishing gradient and long distance dependency problem in recursive neural networks and recursive LSTMs



Abstract

Recursive neural networks (RNNs) and their recently proposed extension, recursive long short-term memory networks (RLSTMs), are models that compute representations for sentences by recursively combining word embeddings according to an externally provided parse tree. Unlike recurrent networks, both models thus explicitly make use of the hierarchical structure of a sentence. In this paper, we demonstrate that RNNs nevertheless suffer from the vanishing gradient and long distance dependency problems, and that RLSTMs greatly improve over RNNs on these problems. We present an artificial learning task that allows us to quantify the severity of these problems for both models. We further show that the ratio of gradients at the root node and at a focal leaf node is highly indicative of how successfully backpropagation optimizes the relevant weights low in the tree. This paper thus provides an explanation for existing, superior results of RLSTMs on tasks such as sentiment analysis, and suggests that the benefits of including hierarchical structure and of including LSTM-style gating are complementary.
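The sketch below is a minimal illustration (not the paper's code) of the two ideas in the abstract: composing leaf embeddings bottom-up with a single shared tanh layer, as a plain recursive NN does, and then comparing the gradient norm at a deep "focal" leaf with the gradient norm at the root. The tree shape, dimensions, and toy loss are assumptions made only for the example.

```python
# Hypothetical sketch of the gradient-ratio diagnostic described in the abstract.
import torch
import torch.nn as nn

dim = 10
combine = nn.Linear(2 * dim, dim)          # shared composition weights

def compose(left, right):
    """Combine two child representations into a parent representation."""
    return torch.tanh(combine(torch.cat([left, right], dim=-1)))

# A deep, left-branching tree: the focal leaf sits far below the root.
depth = 8
focal_leaf = torch.randn(dim, requires_grad=True)
node = focal_leaf
for _ in range(depth):
    sibling = torch.randn(dim)              # other leaves need no gradient
    node = compose(node, sibling)
root = node
root.retain_grad()                          # keep the gradient at the root node

# Toy objective: push the root representation toward a fixed target.
loss = ((root - torch.ones(dim)) ** 2).sum()
loss.backward()

# Ratio of gradient norms at the focal leaf and at the root; a value near
# zero indicates the vanishing-gradient behaviour the paper quantifies.
ratio = focal_leaf.grad.norm() / root.grad.norm()
print(f"gradient ratio (leaf / root): {ratio.item():.3e}")
```

Increasing `depth` shrinks the ratio for the tanh composition; replacing `compose` with an LSTM-style gated composition would give the RLSTM counterpart whose improvement the paper quantifies.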
