IEEE/CVF Conference on Computer Vision and Pattern Recognition

Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN

Abstract

Recurrent neural networks (RNNs) have been widely used for processing sequential data. However, RNNs are commonly difficult to train due to the well-known gradient vanishing and exploding problems, and they struggle to learn long-term patterns. Long short-term memory (LSTM) and gated recurrent unit (GRU) were developed to address these problems, but the use of hyperbolic tangent and sigmoid activation functions results in gradient decay over layers. Consequently, constructing an efficiently trainable deep network is challenging. In addition, all the neurons in an RNN layer are entangled together and their behaviour is hard to interpret. To address these problems, a new type of RNN, referred to as independently recurrent neural network (IndRNN), is proposed in this paper, where neurons in the same layer are independent of each other and are connected across layers. We have shown that an IndRNN can be easily regulated to prevent the gradient exploding and vanishing problems while allowing the network to learn long-term dependencies. Moreover, an IndRNN can work with non-saturated activation functions such as ReLU (rectified linear unit) and still be trained robustly. Multiple IndRNNs can be stacked to construct a network that is deeper than existing RNNs. Experimental results have shown that the proposed IndRNN is able to process very long sequences (over 5000 time steps), can be used to construct very deep networks (21 layers used in the experiments), and can still be trained robustly. Better performance has been achieved on various tasks with IndRNNs than with the traditional RNN and LSTM.
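For concreteness, the recurrence behind this independence is h_t = σ(W x_t + u ⊙ h_{t-1} + b), where the recurrent weight u is a vector rather than a matrix, so each neuron receives only its own previous hidden state. The following minimal NumPy sketch of one IndRNN layer with a ReLU activation is illustrative only; the dimensions, random initialization, and the simple clipping of u used to bound the recurrent gain are assumptions for the sketch, not the paper's exact training setup.

    import numpy as np

    def indrnn_step(x_t, h_prev, W, u, b):
        # One IndRNN step: each neuron n has a scalar recurrent weight
        # u[n], so its state depends only on its own previous state:
        # h_t = relu(W @ x_t + u * h_{t-1} + b)
        return np.maximum(0.0, W @ x_t + u * h_prev + b)

    # Toy dimensions and data, purely for illustration.
    rng = np.random.default_rng(0)
    n_in, n_hidden, T = 4, 8, 5000

    W = rng.normal(0.0, 0.1, size=(n_hidden, n_in))
    b = np.zeros(n_hidden)
    # Clipping keeps |u[n]| <= 1, one simple way (assumed here) to
    # regulate the per-neuron recurrent gain over long sequences.
    u = np.clip(rng.normal(0.0, 0.5, size=n_hidden), -1.0, 1.0)

    h = np.zeros(n_hidden)
    for t in range(T):
        x_t = rng.normal(size=n_in)
        h = indrnn_step(x_t, h, W, u, b)

Because the gradient of each neuron's state through time passes only through its own scalar u[n], bounding |u[n]| directly controls how the recurrent signal grows or decays over many time steps, which is what makes the regulation described in the abstract straightforward.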