Electronic Colloquium on Computational Complexity

Top-down induction of decision trees: rigorous guarantees and inherent limitations

Abstract

Consider the following heuristic for building a decision tree for a function $f : \{0,1\}^n \to \{\pm 1\}$. Place the most influential variable $x_i$ of $f$ at the root, and recurse on the subfunctions $f_{x_i=0}$ and $f_{x_i=1}$ on the left and right subtrees respectively; terminate once the tree is an $\varepsilon$-approximation of $f$. We analyze the quality of this heuristic, obtaining near-matching upper and lower bounds:

  • Upper bound: For every $f$ with decision tree size $s$ and every $\varepsilon \in (0, \frac{1}{2})$, this heuristic builds a decision tree of size at most $s^{O(\log(s/\varepsilon)\log(1/\varepsilon))}$.
  • Lower bound: For every $\varepsilon \in (0, \frac{1}{2})$ and $s \le 2^{\tilde{O}(\sqrt{n})}$, there is an $f$ with decision tree size $s$ such that this heuristic builds a decision tree of size $s^{\tilde{\Omega}(\log s)}$.

We also obtain upper and lower bounds for monotone functions: $s^{\tilde{O}(\sqrt{\log s})}$ and $s^{\tilde{\Omega}(\sqrt[4]{\log s})}$ respectively. The lower bound disproves conjectures of Fiat and Pechyony (2004) and Lee (2009).

Our upper bounds yield new algorithms for properly learning decision trees under the uniform distribution. We show that these algorithms, which are motivated by widely employed and empirically successful top-down decision tree learning heuristics such as ID3, C4.5, and CART, achieve provable guarantees that compare favorably with those of the current fastest algorithm (Ehrenfeucht and Haussler, 1989), and even have certain qualitative advantages. Our lower bounds shed new light on the limitations of these heuristics. Finally, we revisit the classic work of Ehrenfeucht and Haussler. We extend it to give the first uniform-distribution proper learning algorithm that achieves polynomial sample and memory complexity, while matching its state-of-the-art quasipolynomial runtime.
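To make the heuristic concrete, the following is a minimal Python sketch (not from the paper) for small $n$, where influences can be computed by exhaustive enumeration over the current subcube. It uses a per-leaf stopping rule as a simple proxy for the global $\varepsilon$-approximation criterion, and all names (build_top_down, influence) are illustrative.

from itertools import product

def build_top_down(f, n, eps, fixed=None):
    # Grow a decision tree for f: {0,1}^n -> {+1,-1} by splitting on the
    # most influential free variable; stop once the best constant label
    # errs on at most an eps fraction of the current subcube.
    fixed = fixed or {}
    free = [i for i in range(n) if i not in fixed]
    # All inputs consistent with the restrictions made so far.
    points = [[dict(zip(free, bits), **fixed)[i] for i in range(n)]
              for bits in product([0, 1], repeat=len(free))]
    values = [f(x) for x in points]
    maj = max(set(values), key=values.count)           # majority label
    err = sum(v != maj for v in values) / len(values)  # its error here
    if err <= eps or not free:
        return maj  # leaf
    # Influence of x_i on the subfunction: how often flipping bit i flips f.
    def influence(i):
        return sum(f(x[:i] + [1 - x[i]] + x[i + 1:]) != f(x) for x in points)
    i = max(free, key=influence)
    return (i,
            build_top_down(f, n, eps, {**fixed, i: 0}),  # subtree for x_i = 0
            build_top_down(f, n, eps, {**fixed, i: 1}))  # subtree for x_i = 1

# Example: 3-bit majority; eps = 0 demands an exact tree.
maj3 = lambda x: 1 if sum(x) >= 2 else -1
print(build_top_down(maj3, 3, 0.0))

Since all three variables of 3-bit majority have equal influence, ties are broken by index; the returned tree is a nested tuple (variable, left subtree, right subtree) with $\pm 1$ leaves.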