
White-to-Black: Efficient Distillation of Black-Box Adversarial Attacks


Abstract

Adversarial examples are important for understanding the behavior of neural models, and can improve their robustness through adversarial training. Recent work in natural language processing generated adversarial examples by assuming white-box access to the attacked model and optimizing the input directly against it (Ebrahimi et al., 2018). In this work, we show that the knowledge implicit in the optimization procedure can be distilled into another, more efficient neural network. We train a model to emulate the behavior of a white-box attack and show that it generalizes well across examples. Moreover, it reduces adversarial example generation time by 19x-39x. We also show that our approach transfers to a black-box setting, by attacking the Google Perspective API and exposing its vulnerability. Our attack flips the API-predicted label in 42% of the generated examples, while humans maintain high accuracy in predicting the gold label.
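The white-box attack referenced above (Ebrahimi et al., 2018, HotFlip) scores candidate token substitutions with a first-order Taylor approximation of the loss: the gain from replacing token x_i by word v is approximated as (E[v] - E[x_i]) · ∂L/∂E[x_i]. Below is a minimal sketch of that scoring step on a toy bag-of-embeddings logistic classifier; the model, names, and parameters are illustrative assumptions, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 20, 8                      # toy vocabulary size and embedding dim
E = rng.normal(size=(V, d))       # embedding table (hypothetical)
w = rng.normal(size=d)            # weights of a toy binary classifier

def loss_grad_wrt_embedding(x_ids, y):
    """Gradient of the logistic loss w.r.t. any one input embedding.

    With a mean-of-embeddings encoder, every position shares the same
    gradient: (p - y) * w / n, where p is the predicted probability.
    """
    h = E[x_ids].mean(axis=0)            # bag-of-embeddings encoding
    p = 1.0 / (1.0 + np.exp(-(h @ w)))   # predicted P(y = 1 | x)
    return (p - y) * w / len(x_ids)

def hotflip_scores(x_ids, y, i):
    """First-order estimate of the loss increase from swapping token i
    to each vocabulary word v: (E[v] - E[x_i]) . dL/dE[x_i]."""
    g = loss_grad_wrt_embedding(x_ids, y)
    return (E - E[x_ids[i]]) @ g         # one score per vocabulary word

x = np.array([3, 7, 1])                  # toy input sentence (token ids)
scores = hotflip_scores(x, y=1, i=0)     # score all flips of position 0
best = int(np.argmax(scores))            # flip that most increases the loss
```

Note that the score of keeping the original token is exactly zero, so any positive score indicates a flip the approximation expects to hurt the model; the paper's contribution is training a network to propose such flips directly, so this per-example gradient computation is no longer needed at attack time.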
