
Interpreting and Improving Adversarial Robustness of Deep Neural Networks With Neuron Sensitivity


Abstract

Deep neural networks (DNNs) are vulnerable to adversarial examples, in which inputs with imperceptible perturbations mislead DNNs into producing incorrect results. Despite the potential risk they bring, adversarial examples are also valuable for providing insight into the weaknesses and blind spots of DNNs. Thus, interpretability of a DNN in the adversarial setting aims to explain the rationale behind its decision-making process and provides a deeper understanding that leads to better practical applications. To address this issue, we try to explain adversarial robustness for deep models from a new perspective of neuron sensitivity, which is measured by the variation intensity of neuron behavior between benign and adversarial examples. In this paper, we first draw a close connection between adversarial robustness and neuron sensitivity, as sensitive neurons make the most non-trivial contributions to model predictions in the adversarial setting. Based on this, we further propose to improve adversarial robustness by stabilizing the behaviors of sensitive neurons. Moreover, we demonstrate that state-of-the-art adversarial training methods improve model robustness by reducing neuron sensitivity, which in turn confirms the strong connection between adversarial robustness and neuron sensitivity. Extensive experiments on various datasets demonstrate that our algorithm achieves excellent results. To the best of our knowledge, we are the first to study adversarial robustness using neuron sensitivity.
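The abstract's central quantity, neuron sensitivity, can be sketched concretely. The following is a minimal illustration (not the authors' implementation): sensitivity is taken here to be the mean absolute change in a hidden neuron's activation between benign inputs and their adversarially perturbed counterparts, using a toy NumPy network and an FGSM-style perturbation. The network architecture, loss, and perturbation budget are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-hidden-layer ReLU network (8 hidden neurons, 4 input features).
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(1, 8))

def hidden(x):
    """Hidden-layer activations for a batch of inputs of shape (batch, 4)."""
    return np.maximum(0.0, x @ W1.T)

def loss_grad(x, y):
    """Gradient of the squared loss w.r.t. the input, for an FGSM-style attack."""
    h = hidden(x)
    out = h @ W2.T
    d_out = 2.0 * (out - y)        # dL/d(out)
    d_h = d_out @ W2               # back through the output layer
    d_h = d_h * (h > 0)            # back through the ReLU
    return d_h @ W1                # back to the input

def neuron_sensitivity(x, x_adv):
    """Per-neuron mean absolute activation difference across the batch."""
    return np.abs(hidden(x) - hidden(x_adv)).mean(axis=0)

x = rng.normal(size=(32, 4))
y = rng.normal(size=(32, 1))
eps = 0.1
x_adv = x + eps * np.sign(loss_grad(x, y))   # FGSM-style perturbation

s = neuron_sensitivity(x, x_adv)
print(s.shape)                 # (8,): one sensitivity score per hidden neuron
print(s.argsort()[::-1][:3])   # indices of the most sensitive neurons
```

Under this reading, the paper's proposal of "stabilizing the behaviors of sensitive neurons" would correspond to penalizing `neuron_sensitivity` (weighted toward the highest-scoring neurons) during training, alongside the usual task loss.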

Bibliographic Record

  • Source
    IEEE Transactions on Image Processing | 2021, Issue 1 | pp. 1291-1304 | 14 pages
  • Author Affiliations

    State Key Laboratory of Software Development Environment, Beihang University, Beijing, China


  • Indexing Information
  • Format: PDF
  • Language: English
  • CLC Classification
  • Keywords

    Neurons; Sensitivity; Robustness; Computational modeling; Analytical models; Training; Deep learning;

