
Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting


Abstract

Deep learning is often criticized for two serious issues that rarely exist in natural nervous systems: overfitting and catastrophic forgetting. A deep network can even memorize randomly labeled data, whose instance-label pairs carry little real knowledge. And when a deep network continually learns over time by accommodating new tasks, it usually quickly overwrites the knowledge learned from previous tasks. It is well known in neuroscience that human brain reactions exhibit substantial variability even in response to the same stimulus; this phenomenon is referred to as neural variability. The mechanism balances accuracy and plasticity/flexibility in the motor learning of natural nervous systems. This motivates us to design a similar mechanism, named artificial neural variability (ANV), which helps artificial neural networks learn some advantages from “natural” neural networks. We rigorously prove that ANV acts as an implicit regularizer of the mutual information between the training data and the learned model. This result theoretically guarantees that ANV strictly improves generalizability, robustness to label noise, and robustness to catastrophic forgetting. We then devise a neural variable risk minimization (NVRM) framework and neural variable optimizers to realize ANV for conventional network architectures in practice. Empirical studies demonstrate that NVRM effectively relieves overfitting, label-noise memorization, and catastrophic forgetting at negligible cost.
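The abstract does not specify how the neural variable optimizers are implemented; as one plausible illustration (an assumption, not the paper's exact method), ANV can be sketched as injecting zero-mean Gaussian noise into the model's weights before each forward pass, so that the network's response to the same input varies from step to step, while the clean weights receive the gradient update. A minimal toy version on linear regression:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = X @ true_w + observation noise.
X = rng.normal(size=(128, 8))
true_w = rng.normal(size=8)
y = X @ true_w + 0.1 * rng.normal(size=128)

w = np.zeros(8)
lr = 0.05      # learning rate
sigma = 0.01   # scale of the injected "neural variability" (hypothetical choice)

for step in range(500):
    # Perturb the weights before the forward pass: the model's response
    # to the same input varies across steps, mimicking neural variability.
    w_noisy = w + sigma * rng.normal(size=w.shape)
    pred = X @ w_noisy
    # Gradient of mean-squared error, evaluated at the perturbed weights...
    grad = X.T @ (pred - y) / len(y)
    # ...applied to the clean weights.
    w -= lr * grad

mse = np.mean((X @ w - y) ** 2)
```

The names and the noise model here are illustrative only; the paper's NVRM framework formalizes which perturbations yield the mutual-information regularization effect claimed above.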

Bibliographic Details

  • Source
    Neural computation | 2021, Issue 8 | pp. 2163-2192 | 30 pages
  • Author Affiliations

    University of Tokyo Bunkyo-ku Tokyo 113-0333 Japan and RIKEN Center for AIP Chuo-ku Tokyo 103-0027 Japan;

    University of Sydney Level 1 Chippendale NSW 2008 Australia;

    University of Sydney Level 1 Chippendale NSW 2008 Australia;

    University of Tokyo Bunkyo-ku Tokyo 113-0333 Japan and RIKEN Center for AIP Chuo-ku Tokyo 103-0027 Japan;

    University of Sydney Level 1 Chippendale NSW 2008 Australia;

    RIKEN Center for AIP Chuo-ku Tokyo 103-0027 Japan and University of Tokyo Bunkyo-ku Tokyo 113-0333 Japan;

  • Indexed in: Science Citation Index (SCI); Chemical Abstracts (CA)
  • Format: PDF
  • Language: English
