JMLR: Workshop and Conference Proceedings

Learning Longer-term Dependencies in RNNs with Auxiliary Losses



Abstract

Despite recent advances in training recurrent neural networks (RNNs), capturing long-term dependencies in sequences remains a fundamental challenge. Most approaches use backpropagation through time (BPTT), which is difficult to scale to very long sequences. This paper proposes a simple method that improves the ability to capture long-term dependencies in RNNs by adding an unsupervised auxiliary loss to the original objective. This auxiliary loss forces RNNs to either reconstruct previous events or predict next events in a sequence, making truncated backpropagation feasible for long sequences and also improving full BPTT. We evaluate our method on a variety of settings, including pixel-by-pixel image classification with sequence lengths up to 16000, and a real document classification benchmark. Our results highlight the good performance and resource efficiency of this approach over competitive baselines, including other recurrent models and a comparably sized Transformer. Further analyses reveal beneficial effects of the auxiliary loss on optimization and regularization, as well as extreme cases where there is little to no backpropagation.
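
The core idea is easy to state in code. Below is a minimal PyTorch sketch of the reconstruction variant described in the abstract: a random anchor point is sampled along the sequence, and an auxiliary decoder must rebuild the inputs immediately preceding it from the encoder's state at that anchor. All names and hyperparameters here (AuxLossRNN, seg_len, aux_weight) are illustrative assumptions rather than the paper's exact configuration, and for brevity this sketch backpropagates the auxiliary loss through the full sequence, whereas the paper combines it with truncated BPTT.

```python
# Minimal sketch of an RNN classifier with an unsupervised auxiliary
# reconstruction loss, assuming PyTorch. Illustrative, not the paper's
# exact architecture or hyperparameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AuxLossRNN(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_classes, seg_len=8):
        super().__init__()
        self.encoder = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.classify = nn.Linear(hidden_dim, num_classes)
        self.reconstruct = nn.Linear(hidden_dim, input_dim)
        self.seg_len = seg_len  # length of the segment to reconstruct

    def forward(self, x, labels, aux_weight=0.5):
        # x: (batch, seq_len, input_dim); requires seq_len > seg_len
        outputs, (h, c) = self.encoder(x)

        # Main objective: classify the sequence from the final state.
        main_loss = F.cross_entropy(self.classify(h[-1]), labels)

        # Auxiliary objective: sample a random anchor position t and
        # reconstruct the seg_len inputs preceding it, with the decoder
        # initialized from the encoder state at the anchor.
        t = torch.randint(self.seg_len, x.size(1), (1,)).item()
        target = x[:, t - self.seg_len:t, :]
        anchor = outputs[:, t].unsqueeze(0).contiguous()  # (1, B, H)

        # Teacher forcing: decoder sees the target shifted right by one.
        dec_in = torch.cat(
            [torch.zeros_like(target[:, :1]), target[:, :-1]], dim=1)
        dec_out, _ = self.decoder(dec_in, (anchor, torch.zeros_like(anchor)))
        aux_loss = F.mse_loss(self.reconstruct(dec_out), target)

        return main_loss + aux_weight * aux_loss

# Usage on toy data, e.g. pixel-by-pixel sequences of length 1000:
model = AuxLossRNN(input_dim=1, hidden_dim=128, num_classes=10)
x = torch.randn(4, 1000, 1)
labels = torch.randint(0, 10, (4,))
loss = model(x, labels)
loss.backward()
```

Because the auxiliary gradient only needs to reach back seg_len steps from the sampled anchor, the reconstruction term keeps distant parts of the sequence trained even when the main loss is backpropagated over a truncated window.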
