Conference: International Symposium on Research in Attacks, Intrusions and Defenses

Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks



Abstract

Deep neural networks (DNNs) provide excellent performance across a wide range of classification tasks, but their training requires high computational resources and is often outsourced to third parties. Recent work has shown that outsourced training introduces the risk that a malicious trainer will return a backdoored DNN that behaves normally on most inputs but causes targeted misclassifications or degrades the accuracy of the network when a trigger known only to the attacker is present. In this paper, we provide the first effective defenses against backdoor attacks on DNNs. We implement three backdoor attacks from prior work and use them to investigate two promising defenses, pruning and fine-tuning. We show that neither, by itself, is sufficient to defend against sophisticated attackers. We then evaluate fine-pruning, a combination of pruning and fine-tuning, and show that it successfully weakens or even eliminates the backdoors, i.e., in some cases reducing the attack success rate to 0% with only a 0.4% drop in accuracy for clean (non-triggering) inputs. Our work provides the first step toward defenses against backdoor attacks in deep neural networks.
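As a rough illustration of the fine-pruning procedure described in the abstract, the sketch below (assuming a PyTorch model) ranks the output channels of a chosen convolutional layer by their mean activation on clean held-out data, zeroes out the least active ones, and then fine-tunes the network on that clean data. The model, target layer, data loader, and hyperparameters (prune_ratio, finetune_epochs, lr) are hypothetical placeholders, not the authors' reference implementation.

```python
import torch
import torch.nn as nn

def fine_prune(model, layer: nn.Conv2d, clean_loader, prune_ratio=0.3,
               finetune_epochs=5, lr=1e-3, device="cpu"):
    # Illustrative sketch of the pruning + fine-tuning combination; layer
    # and hyperparameters are assumptions, not values from the paper.
    model.to(device).eval()

    # 1) Record mean absolute activation per output channel on clean data.
    activations = []
    hook = layer.register_forward_hook(
        lambda m, inp, out: activations.append(out.abs().mean(dim=(0, 2, 3)))
    )
    with torch.no_grad():
        for x, _ in clean_loader:
            model(x.to(device))
    hook.remove()
    mean_act = torch.stack(activations).mean(dim=0)  # shape: [out_channels]

    # 2) Zero out ("prune") the channels that are least active on clean inputs.
    n_prune = int(prune_ratio * mean_act.numel())
    prune_idx = torch.argsort(mean_act)[:n_prune]
    with torch.no_grad():
        layer.weight[prune_idx] = 0
        if layer.bias is not None:
            layer.bias[prune_idx] = 0

    # 3) Fine-tune on clean data to recover accuracy lost to pruning.
    #    (A stricter implementation would keep the pruned channels masked
    #    so fine-tuning cannot reactivate them.)
    model.train()
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(finetune_epochs):
        for x, y in clean_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model
```

The intuition, as the paper argues, is that backdoor behavior tends to reside in neurons that stay largely dormant on clean inputs; pruning those neurons removes most of the backdoor, and fine-tuning on clean data both restores clean accuracy and further disrupts any remaining backdoor weights.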
