Evaluation of neural networks defenses and attacks using NDCG and reciprocal rank metrics

Brama Haya; Dery Lihi; Grinshpoun Tal

Abstract

The problem of attacks on neural networks through input modification (i.e., adversarial examples) has attracted much attention recently. Because they are relatively easy to generate and hard to detect, these attacks pose a security threat that many proposed defenses try to mitigate. However, the evaluation of the effect of attacks and defenses commonly relies on traditional classification metrics, without adequate adaptation to adversarial scenarios. Most of these metrics are accuracy-based and therefore may have a limited scope and low discriminative power. Other metrics do not consider the unique characteristics of neural network functionality, or they measure the effectiveness of the attacks only indirectly (e.g., through the complexity of their generation). In this paper, we present two metrics that are specifically designed to measure the effect of attacks, or the recovery effect of defenses, on the output of neural networks in multiclass classification tasks. Inspired by the normalized discounted cumulative gain (NDCG) and the reciprocal rank metrics used in the information retrieval literature, we treat the neural network predictions as ranked lists of results. Using additional information about the probability of each rank enables us to define novel metrics that are suited to the task at hand. We evaluate our metrics using various attacks and defenses on a pre-trained VGG19 model and the ImageNet dataset. Compared to the common classification metrics, our proposed metrics demonstrate superior informativeness and distinctiveness.
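To make the ranked-list view concrete, the following minimal sketch computes the standard information-retrieval forms of reciprocal rank and NDCG on a classifier's output probabilities. It is not the paper's proposed metrics, which the abstract does not define. The function names, the three-class probability vectors, and the binary relevance assignment (1 for the true class, 0 otherwise) are all assumptions chosen for illustration.

```python
import math


def ranked_classes(probs):
    """Class indices ordered from most to least probable (ties keep index order)."""
    return sorted(range(len(probs)), key=lambda c: -probs[c])


def reciprocal_rank(probs, true_label):
    """Standard reciprocal rank: 1 / (1-based rank of the true class)."""
    rank = ranked_classes(probs).index(true_label) + 1
    return 1.0 / rank


def dcg(relevances):
    """Discounted cumulative gain of relevances listed in ranked order."""
    return sum(rel / math.log2(pos + 1)
               for pos, rel in enumerate(relevances, start=1))


def ndcg(probs, relevance):
    """Standard NDCG: DCG of the predicted ranking divided by the ideal DCG."""
    ranked_rel = [relevance[c] for c in ranked_classes(probs)]
    ideal = dcg(sorted(relevance, reverse=True))
    return dcg(ranked_rel) / ideal if ideal > 0 else 0.0


# Hypothetical outputs for one image whose true class is 0.
true_label = 0
relevance = [1, 0, 0]          # assumed binary relevance: only the true class counts
clean = [0.7, 0.2, 0.1]        # true class ranked 1st
attack_a = [0.3, 0.5, 0.2]     # true class pushed to rank 2
attack_b = [0.1, 0.5, 0.4]     # true class pushed to rank 3

for name, probs in [("clean", clean), ("attack_a", attack_a), ("attack_b", attack_b)]:
    top1_correct = ranked_classes(probs)[0] == true_label
    print(name, top1_correct,
          round(reciprocal_rank(probs, true_label), 3),
          round(ndcg(probs, relevance), 3))
# clean    True  1.0   1.0
# attack_a False 0.5   0.631
# attack_b False 0.333 0.5
```

Top-1 accuracy scores both hypothetical attacks identically (a misclassification), while the rank-based values separate them by how far the true class was displaced. This is the kind of added distinctiveness the abstract attributes to rank-based evaluation.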

