Decision tree (DT) induction is among the more popular of the data mining techniques. An important component of DT induction algorithms is the splitting method, with the most commonly used methods being based on the conditional entropy family. However, it is well known that there is no single splitting method that will give the best performance for all problem instances. In this paper we explore the relative performance of the Conditional Entropy family and another family that is based on the Class-Attribute Mutual Information (CAMI) measure. Our results suggest that while some datasets are insensitive to the choice of splitting method, other datasets are very sensitive to it. For example, some of the CAMI family methods may be more appropriate than GainRatio (GR) for datasets where all non-class attributes are nominal; some of the CAMI methods perform as well as GR for datasets where all the non-class attributes are either integer or continuous. Given the fact that it is never known beforehand which splitting method will lead to the best DT for a given dataset, and given the relatively good performance of the CAMI methods, it seems appropriate to suggest that splitting methods from the CAMI family should be included in data mining toolsets.
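For readers unfamiliar with entropy-based splitting, the following is a minimal sketch of the GainRatio (GR) criterion mentioned above, as popularized by C4.5: information gain (mutual information between the class and the attribute) normalized by the split information. The function names and toy data are illustrative, not from the paper; the CAMI measures themselves are defined in the paper body, not reproduced here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(attr_values, labels):
    """GainRatio = InformationGain / SplitInformation for one nominal attribute."""
    n = len(labels)
    # Partition the class labels by attribute value.
    partitions = {}
    for v, y in zip(attr_values, labels):
        partitions.setdefault(v, []).append(y)
    # Conditional entropy H(class | attribute) over the induced partition.
    cond_entropy = sum(len(p) / n * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - cond_entropy
    # Split information: entropy of the partition sizes themselves,
    # which penalizes attributes that fragment the data into many branches.
    split_info = entropy(attr_values)
    return info_gain / split_info if split_info > 0 else 0.0

# Toy example: a perfectly predictive two-valued nominal attribute.
attr = ["a", "a", "b", "b"]
cls  = ["yes", "yes", "no", "no"]
print(gain_ratio(attr, cls))  # 1.0: gain of 1 bit divided by split info of 1 bit
```

A DT induction algorithm would evaluate such a score for every candidate attribute at each node and branch on the highest-scoring one; the paper's comparison amounts to swapping this scoring function for members of the CAMI family.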