Catastrophic forgetting, whereby a model trained on one task is fine-tuned on a second and in doing so suffers a "catastrophic" drop in performance on the first task, is a hurdle in the development of better transfer learning techniques. Despite impressive progress in reducing catastrophic forgetting, we have limited understanding of how different architectures and hyper-parameters affect forgetting in a network. In this paper, we aim to understand the factors which cause forgetting during sequential training. Our primary finding is that CNNs forget less than LSTMs, and we show that max-pooling is the underlying operation which helps CNNs alleviate forgetting relative to LSTMs. We also find that curriculum learning (Bengio et al., 2009), i.e. placing a hard task towards the end of the task sequence, reduces forgetting. Finally, we analyse the effect of fine-tuning contextual embeddings on catastrophic forgetting and find that keeping word embeddings fixed is preferable to fine-tuning them.
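For a concrete picture of the sequential-training protocol described above, the following is a minimal PyTorch sketch (not the paper's implementation): a small text CNN with global max-pooling is trained on a first task, fine-tuned on a second, and the drop in first-task accuracy is reported as forgetting. The model, the synthetic tasks, and all hyper-parameters here are illustrative placeholders.

```python
# Minimal sketch of sequential fine-tuning with a max-pooling text CNN.
# Synthetic data and a shared output head are assumed; not the paper's exact setup.
import torch
import torch.nn as nn


class MaxPoolTextCNN(nn.Module):
    """Embedding -> 1D convolution -> global max-pool -> linear classifier."""

    def __init__(self, vocab_size=1000, emb_dim=50, num_filters=32, num_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, num_filters, kernel_size=3, padding=1)
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        x = self.emb(tokens).transpose(1, 2)      # (batch, emb_dim, seq_len)
        x = torch.relu(self.conv(x))              # (batch, num_filters, seq_len)
        x = x.max(dim=2).values                   # global max-pooling over time
        return self.fc(x)


def make_task(seed, n=256, seq_len=20, vocab_size=1000):
    """Synthetic stand-in for a text classification task."""
    g = torch.Generator().manual_seed(seed)
    tokens = torch.randint(0, vocab_size, (n, seq_len), generator=g)
    labels = (tokens.float().mean(dim=1) > vocab_size / 2).long()
    return tokens, labels


def train(model, tokens, labels, epochs=5, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(tokens), labels)
        loss.backward()
        opt.step()


def accuracy(model, tokens, labels):
    with torch.no_grad():
        return (model(tokens).argmax(dim=1) == labels).float().mean().item()


model = MaxPoolTextCNN()
task_a, task_b = make_task(seed=0), make_task(seed=1)

train(model, *task_a)
acc_before = accuracy(model, *task_a)   # task A accuracy right after training on task A

train(model, *task_b)                   # sequential fine-tuning on task B
acc_after = accuracy(model, *task_a)    # task A accuracy after fine-tuning on task B

print(f"Forgetting on task A: {acc_before - acc_after:.3f}")
```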