
Interpreting Neural Networks with Nearest Neighbors


Abstract

Local model interpretation methods explain individual predictions by assigning an importance value to each input feature. This value is often determined by measuring the change in confidence when a feature is removed. However, the confidence of neural networks is not a robust measure of model uncertainty, which makes it difficult to reliably judge the importance of input features. We address this by changing the test-time behavior of neural networks using Deep k-Nearest Neighbors. Without harming text classification accuracy, this algorithm provides a more robust uncertainty metric, which we use to generate feature importance values. The resulting interpretations align better with human perception than those of baseline methods. Finally, we use our interpretation method to analyze model predictions on dataset annotation artifacts.
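To make the procedure concrete, below is a minimal sketch (not the authors' released code) of leave-one-out feature importance scored with a DkNN-style conformity measure instead of softmax confidence: the importance of a word is the drop in conformity when that word is removed. The averaging encoder, the choice of k, the random data, and the single-layer neighbor search are simplifying assumptions for illustration; DkNN proper searches for neighbors across several hidden layers of a trained network.

import numpy as np

def embed(tokens, vectors):
    # Toy sentence encoder: average of word vectors. Stands in for the
    # hidden representation of a trained text classifier.
    return np.mean([vectors[t] for t in tokens], axis=0)

def conformity(query_repr, train_reprs, train_labels, predicted_label, k=5):
    # Fraction of the k nearest training neighbors that share the model's
    # predicted label: high conformity = low uncertainty.
    dists = np.linalg.norm(train_reprs - query_repr, axis=1)
    neighbors = np.argsort(dists)[:k]
    return float(np.mean(train_labels[neighbors] == predicted_label))

def leave_one_out_importance(tokens, vectors, train_reprs, train_labels,
                             predicted_label, k=5):
    # Importance of each token = conformity drop when that token is removed.
    base = conformity(embed(tokens, vectors), train_reprs, train_labels,
                      predicted_label, k)
    scores = {}
    for i, tok in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]
        repr_without = embed(reduced, vectors)
        scores[tok] = base - conformity(repr_without, train_reprs,
                                        train_labels, predicted_label, k)
    return scores

# Illustrative usage with random embeddings and training data (hypothetical).
rng = np.random.default_rng(0)
sentence = ["the", "movie", "was", "wonderful"]
vectors = {w: rng.normal(size=8) for w in sentence}
train_reprs = rng.normal(size=(100, 8))
train_labels = rng.integers(0, 2, size=100)
print(leave_one_out_importance(sentence, vectors, train_reprs, train_labels,
                               predicted_label=1))

A larger conformity drop marks the token as more important to the prediction; swapping conformity for softmax probability in this sketch recovers the baseline leave-one-out method whose confidence estimates the abstract argues are not robust.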
