How to Interpret Decision Trees?


Abstract

Data mining methods are widely used across many disciplines to identify patterns, rules, or associations in large volumes of data. In the past, black-box methods such as neural networks and support vector machines dominated technical domains, while medical domains preferred methods with explanation capability. Today, methods with explanation capability are also used in technical domains, now that more work has been done on their advantages and disadvantages. Decision tree induction, for example with C4.5, is the most preferred method because it works well on average regardless of the data set being used. It learns a decision tree without heavy user interaction, whereas neural networks require a lot of time for training. Cross-validation can be applied to decision tree induction so that the calculated error rate comes close to the true error rate. The error rate and the particular goodness measures described in this paper are quantitative measures that help in understanding the quality of the model. The data collection problem, including noise in the data, has to be considered; specialized accuracy measures and proper visualization methods help to understand this problem. Since decision tree induction is a supervised method, the associated data labels constitute another problem, and re-labeling should be considered after the model has been learnt. This paper also discusses how to fit the learnt model to the expert's knowledge and how to compare two decision trees according to their explanation power. Finally, we summarize our methodology for the interpretation of decision trees.
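The abstract's remark on cross-validation can be illustrated with a minimal sketch. It is not the paper's procedure: it assumes a one-split tree (a decision stump) chosen by information gain as a simplified stand-in for C4.5, and the names `entropy`, `fit_stump`, `cross_val_error`, and the toy data are hypothetical, introduced only for illustration.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels):
    """Learn a one-split tree: the (feature, threshold) with the highest information gain.

    Assumption: numeric features only; this is a stand-in for a full C4.5 tree.
    """
    base = entropy(labels)
    best = None  # (gain, feature index, threshold, left label, right label)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds lie midway between consecutive distinct values,
        # so both sides of every candidate split are non-empty.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
            gain = base - weighted
            if best is None or gain > best[0]:
                best = (gain, f, t, majority(left), majority(right))
    if best is None:  # no feature has two distinct values: predict the majority class
        default = majority(labels)
        return lambda row: default
    _, f, t, left_label, right_label = best
    return lambda row: left_label if row[f] <= t else right_label


def cross_val_error(rows, labels, k=5, seed=0):
    """Estimate the error rate by k-fold cross-validation."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_stump([rows[i] for i in train], [labels[i] for i in train])
        # Count mistakes only on samples the model has not seen during training.
        errors += sum(model(rows[i]) != labels[i] for i in fold)
    return errors / len(rows)


# Hypothetical toy data: one feature, two well-separated classes.
rows = [(x,) for x in range(10)] + [(x,) for x in range(20, 30)]
labels = ["a"] * 10 + ["b"] * 10
print(f"5-fold cross-validated error rate: {cross_val_error(rows, labels):.2f}")
```

Because every sample is scored by a model that never saw it during training, the averaged error is a less optimistic estimate than the error measured on the training data itself. On real, noisy data the estimate will generally be above zero.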


Bibliographic details

Source
Advances in data mining: Applications and theoretical aspects | 2011 | pp. 40-55 | 16 pages
Conference location: New York, NY (US)
Author
Petra Perner
Affiliation

Institute of Computer Vision and Applied Computer Sciences, IBaI, Postbox 30 11 14, 04251 Leipzig

Original format: PDF
Language: English
Chinese Library Classification: TP311.13

