
Learning to Attack: Adversarial Transformation Networks



Abstract

With the rapidly increasing popularity of deep neural networks for image recognition tasks, a parallel interest in generating adversarial examples to attack the trained models has arisen. To date, these approaches have involved either directly computing gradients with respect to the image pixels or directly solving an optimization on the image pixels. We generalize this pursuit in a novel direction: can a separate network be trained to efficiently attack another fully trained network? We demonstrate that it is possible, and that the generated attacks yield startling insights into the weaknesses of the target network. We call such a network an Adversarial Transformation Network (ATN). ATNs transform any input into an adversarial attack on the target network, while being minimally perturbing to the original inputs and the target network's outputs. Further, we show that ATNs are capable of not only causing the target network to make an error, but can be constructed to explicitly control the type of misclassification made. We demonstrate ATNs on both simple MNIST-digit classifiers and state-of-the-art ImageNet classifiers deployed by Google, Inc.: Inception ResNet-v2.
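The abstract describes the ATN objective only at a high level: the attack network g is trained to keep g(x) close to x while steering the target classifier f toward a chosen class, using a reranking of f's original output as the regression target. A minimal sketch of one such training step follows, written in PyTorch; the two-layer architecture, the rerank helper, and the weights alpha and beta are illustrative assumptions rather than the paper's exact configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ATN(nn.Module):
    """Maps an input to an adversarial example for a fixed target network."""
    def __init__(self, dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 512), nn.ReLU(),
            nn.Linear(512, dim), nn.Tanh(),  # keep outputs bounded in [-1, 1]
        )

    def forward(self, x):
        return self.net(x)

def rerank(probs, target_class, alpha=1.5):
    # Boost the chosen class above the current maximum, then renormalize,
    # so the regression target preserves the relative order of other classes.
    r = probs.clone()
    r[:, target_class] = alpha * probs.max(dim=1).values
    return r / r.sum(dim=1, keepdim=True)

def train_step(atn, target_model, x, target_class, opt, alpha=1.5, beta=0.1):
    # The target network is fully trained and frozen; only the ATN is updated
    # (opt should be built over atn.parameters()).
    for p in target_model.parameters():
        p.requires_grad_(False)
    x_adv = atn(x)
    y = F.softmax(target_model(x), dim=1)          # original predictions
    y_adv = F.softmax(target_model(x_adv), dim=1)  # predictions on the attack
    # Weighted sum of an input-space loss (stay close to x) and an
    # output-space loss (match the reranked predictions).
    loss = beta * F.mse_loss(x_adv, x) \
        + F.mse_loss(y_adv, rerank(y, target_class, alpha))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

Once trained over many inputs, this gives a feed-forward attacker: a single forward pass through the ATN produces an adversarial example, with no per-image gradient computation or optimization against the target network.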
