JMLR: Workshop and Conference Proceedings

Learning Longer-term Dependencies in RNNs with Auxiliary Losses



Abstract

Despite recent advances in training recurrent neural networks (RNNs), capturing long-term dependencies in sequences remains a fundamental challenge. Most approaches use backpropagation through time (BPTT), which is difficult to scale to very long sequences. This paper proposes a simple method that improves the ability to capture long-term dependencies in RNNs by adding an unsupervised auxiliary loss to the original objective. This auxiliary loss forces RNNs to either reconstruct previous events or predict next events in a sequence, making truncated backpropagation feasible for long sequences and also improving full BPTT. We evaluate our method on a variety of settings, including pixel-by-pixel image classification with sequence lengths up to 16000, and a real document classification benchmark. Our results highlight the good performance and resource efficiency of this approach over competitive baselines, including other recurrent models and a comparably sized Transformer. Further analyses reveal beneficial effects of the auxiliary loss on optimization and regularization, as well as extreme cases where there is little to no backpropagation.
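
The core idea is easy to state in code. Below is a minimal PyTorch sketch of the reconstruction variant described in the abstract: a random anchor point is sampled along the sequence, and an auxiliary decoder must rebuild the inputs immediately preceding it from the encoder's state at that anchor. All names and hyperparameters here (AuxLossRNN, seg_len, aux_weight) are illustrative assumptions rather than the paper's exact configuration, and for brevity this sketch backpropagates the auxiliary loss through the full sequence, whereas the paper combines it with truncated BPTT.

```python
# Minimal sketch of an RNN classifier with an unsupervised auxiliary
# reconstruction loss, assuming PyTorch. Illustrative, not the paper's
# exact architecture or hyperparameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AuxLossRNN(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_classes, seg_len=8):
        super().__init__()
        self.encoder = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.classify = nn.Linear(hidden_dim, num_classes)
        self.reconstruct = nn.Linear(hidden_dim, input_dim)
        self.seg_len = seg_len  # length of the segment to reconstruct

    def forward(self, x, labels, aux_weight=0.5):
        # x: (batch, seq_len, input_dim); requires seq_len > seg_len
        outputs, (h, c) = self.encoder(x)

        # Main objective: classify the sequence from the final state.
        main_loss = F.cross_entropy(self.classify(h[-1]), labels)

        # Auxiliary objective: sample a random anchor position t and
        # reconstruct the seg_len inputs preceding it, with the decoder
        # initialized from the encoder state at the anchor.
        t = torch.randint(self.seg_len, x.size(1), (1,)).item()
        target = x[:, t - self.seg_len:t, :]
        anchor = outputs[:, t].unsqueeze(0).contiguous()  # (1, B, H)

        # Teacher forcing: decoder sees the target shifted right by one.
        dec_in = torch.cat(
            [torch.zeros_like(target[:, :1]), target[:, :-1]], dim=1)
        dec_out, _ = self.decoder(dec_in, (anchor, torch.zeros_like(anchor)))
        aux_loss = F.mse_loss(self.reconstruct(dec_out), target)

        return main_loss + aux_weight * aux_loss

# Usage on toy data, e.g. pixel-by-pixel sequences of length 1000:
model = AuxLossRNN(input_dim=1, hidden_dim=128, num_classes=10)
x = torch.randn(4, 1000, 1)
labels = torch.randint(0, 10, (4,))
loss = model(x, labels)
loss.backward()
```

Because the auxiliary gradient only needs to reach back seg_len steps from the sampled anchor, the reconstruction term keeps distant parts of the sequence trained even when the main loss is backpropagated over a truncated window.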
