IEEE International Conference on Image Processing

Model-Agnostic Adversarial Example Detection Through Logit Distribution Learning



Abstract

Recent research on vision-based tasks has achieved great improvement due to the development of deep learning solutions. However, deep models have been found vulnerable to adversarial attacks, in which the original inputs are maliciously manipulated to cause dramatic shifts in the outputs. In this paper, we focus on adversarial attacks on image classifiers built with deep neural networks and propose a model-agnostic approach to detect adversarial inputs. We argue that the logit semantics of adversarial inputs follow a different evolution from those of original inputs, and we construct a logits-based feature embedding for effective representation learning. We train an LSTM network to further analyze the sequence of logits-based features and detect adversarial examples. Experimental results on the MNIST, CIFAR-10, and CIFAR-100 datasets show that our method achieves state-of-the-art accuracy in detecting adversarial examples and has strong generalizability.
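The abstract gives no implementation details, so the following is only a minimal sketch of the core idea it describes: record the logits of an input as a sequence (here assumed to come from successive observation points such as training checkpoints or layers, which is an assumption rather than a detail from the paper) and classify that sequence with an LSTM as clean or adversarial. All class names, shapes, and hyperparameters below are illustrative, not the authors' code.

import torch
import torch.nn as nn

class LogitSequenceDetector(nn.Module):
    """Hypothetical detector: an LSTM over a sequence of logit vectors."""

    def __init__(self, num_classes: int = 10, hidden_size: int = 64):
        super().__init__()
        # Each time step is one logit vector from the classifier under inspection.
        self.lstm = nn.LSTM(input_size=num_classes, hidden_size=hidden_size,
                            batch_first=True)
        self.head = nn.Linear(hidden_size, 1)  # scalar adversarial score

    def forward(self, logit_seq: torch.Tensor) -> torch.Tensor:
        # logit_seq: (batch, seq_len, num_classes)
        _, (h_n, _) = self.lstm(logit_seq)
        # Final hidden state summarizes how the logits evolved over the sequence.
        return self.head(h_n[-1]).squeeze(-1)  # raw score; apply sigmoid for a probability

# Usage sketch: 8 inputs, each with logits recorded at 5 observation points.
detector = LogitSequenceDetector(num_classes=10)
scores = detector(torch.randn(8, 5, 10))
loss = nn.BCEWithLogitsLoss()(scores, torch.zeros(8))  # 0 = clean, 1 = adversarial

The design choice this illustrates is that detection reduces to binary classification of a logit-evolution summary: the LSTM's final hidden state encodes how the logits change across the sequence, which the abstract argues differs between clean and adversarial inputs.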
