In this thesis we develop a framework for interpreting the decisions of highly nonlinear classifier functions, with a focus on neural networks. Specifically, we formalise the idea of separating the input parameters into relevant and irrelevant ones as an explicit optimisation problem.

First, we describe what is generally understood as a relevance map for classifiers and give an overview of the existing methods to produce such maps. We explain how we used relevance maps to detect artefacts in the PASCAL VOC dataset and to track the focus of neural agents playing the Atari video games Breakout and Pinball at a human-like level.

Towards a formal definition of relevance maps, we generalise the concept of prime implicants from abductive logic to a probabilistic setting by introducing δ-relevant sets. For a d-ary Boolean function Φ: {0,1}^d → {0,1} and an assignment to its variables x = (x1, x2, . . . , xd), we consider the problem of finding those subsets of the variables that are sufficient to determine the function output with a given probability δ. We show that the problem of finding small δ-relevant sets is NP-hard to approximate within a factor of d^(1−α) for any α > 0. The associated decision problem turns out to be NP^PP-complete.

We further generalise δ-relevant sets from the binary to the continuous domain. This leads naturally to a rate-distortion trade-off between the size of the δ-relevant set (rate) and the change in the classifier prediction (distortion). Relevance maps can then be interpreted as greedy approximations of the rate-distortion function. Evaluating this function even approximately turns out to be NP-hard, so we develop a heuristic solution strategy based on a convex relaxation of the combinatorial problem and assumed density filtering (ADF) for deep ReLU neural networks.
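The notion of a δ-relevant set can be written out formally. The following is a sketch consistent with the description above; the precise formulation in the thesis may differ (in particular in the choice of reference distribution, taken here to be uniform):

```latex
% Sketch of the δ-relevant set condition (notation may differ from the thesis).
% For Φ: {0,1}^d → {0,1} and an assignment x, a set S ⊆ {1,…,d} is
% δ-relevant for x if fixing the variables in S to their values in x
% determines the output with probability at least δ:
\[
  \Pr_{y}\bigl[\,\Phi(y) = \Phi(x) \,\big|\, y_S = x_S\,\bigr] \;\ge\; \delta,
\]
% where y is drawn uniformly from {0,1}^d. The hardness result then
% concerns approximating the minimal size of such a set S: this is
% NP-hard within a factor of d^{1-\alpha} for any \alpha > 0.
```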
This results in our own explanation method, which we call Rate-Distortion Explanations (RDE).

To show that the approximations in ADF are necessary, we give a complete characterisation of the families of probability distributions that are invariant under the action of ReLU neural network layers. We demonstrate that the only invariant families are either degenerate or amount to sampling.

Subsequently, we propose and discuss several benchmark tests and numerical evaluation methods for relevance maps. We compare RDE to a representative collection of established relevance methods and demonstrate that it outperforms its competitors on a wide range of tasks.

Finally, we discuss how knowledge of the true data distribution is crucial for any existing explanation method. We critically examine our own method with regard to potential artefacts and introduce a stronger, information-theoretic requirement based on the conditional entropy. We develop a novel approach, called Arthur-Merlin regularisation, along with a new framework. The framework is then extended to realistic algorithms and datasets, and we discuss under which assumptions the guarantees still hold.
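The rate-distortion view of relevance maps can be illustrated with a toy greedy procedure: grow a set of "kept" input components, always adding the one that most reduces the expected change in the classifier output when the remaining components are resampled. This is a hypothetical sketch only, not the thesis's RDE algorithm; the classifier f, the uniform reference distribution, and the greedy loop are all illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Toy "classifier": only the first two coordinates matter.
    return float(x[0] + x[1] > 1.0)

def distortion(f, x, keep, n_samples=500):
    """Estimated probability that f changes when the components
    outside `keep` are resampled uniformly (the 'distortion')."""
    base = f(x)
    changed = 0
    for _ in range(n_samples):
        z = rng.uniform(0.0, 1.0, size=x.shape)
        z[list(keep)] = x[list(keep)]  # fix the kept components to x
        changed += f(z) != base
    return changed / n_samples

def greedy_relevance(f, x, rate):
    """Greedily grow the kept set up to size `rate` (the 'rate'),
    each step adding the component that lowers distortion the most."""
    keep = set()
    while len(keep) < rate:
        best = min(set(range(len(x))) - keep,
                   key=lambda i: distortion(f, x, keep | {i}))
        keep.add(best)
    return keep

x = np.array([0.9, 0.9, 0.1, 0.5])
print(greedy_relevance(f, x, rate=2))  # → {0, 1}
```

With this toy f, keeping either of the first two coordinates already pins the output down with high probability, and keeping both makes the distortion zero, so the greedy procedure recovers exactly the decision-relevant components.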