Learning Decision Trees for Unbalanced Data


Abstract

Learning from unbalanced datasets presents a challenging problem in which traditional learning algorithms may perform poorly. In such problems, the objective functions used to learn the classifiers typically tend to favor the larger, less important classes. This paper compares the performance of several popular decision tree splitting criteria - information gain, the Gini measure, and DKM - and identifies a new skew-insensitive measure in Hellinger distance. We outline the strengths of Hellinger distance under class imbalance, propose its application to building decision trees, and perform a comprehensive comparative analysis of each decision tree construction method. In addition, we consider the performance of each tree within a powerful sampling wrapper framework to capture the interaction of the splitting metric and sampling. We evaluate over a wide range of datasets and determine which methods operate best under class imbalance.
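To make the abstract's central idea concrete, the following is a minimal sketch (not the authors' implementation) of how a Hellinger-distance split criterion can be scored for a binary split in a two-class problem. The function name and counts-based interface are illustrative; the key property shown is that the score depends only on the *fraction* of each class routed to each branch, not on the class priors, which is what makes the measure skew-insensitive.

```python
import math

def hellinger_split(pos_left, neg_left, pos_right, neg_right):
    """Hellinger distance between the per-class distributions induced by a
    binary split. Larger values indicate better class separation; the
    maximum, sqrt(2), is reached when the split isolates the classes."""
    pos = pos_left + pos_right  # total positives (minority class)
    neg = neg_left + neg_right  # total negatives (majority class)
    # Fraction of each class falling into each branch; the criterion
    # compares these conditional distributions, not the raw counts.
    d = ((math.sqrt(pos_left / pos) - math.sqrt(neg_left / neg)) ** 2
         + (math.sqrt(pos_right / pos) - math.sqrt(neg_right / neg)) ** 2)
    return math.sqrt(d)

# A split that concentrates positives on the left scores highly even
# when positives are rare (10 positives vs 90 negatives):
score = hellinger_split(9, 1, 1, 89)
# Multiplying the majority class by 10 leaves the score unchanged,
# illustrating skew insensitivity:
score_skewed = hellinger_split(9, 10, 1, 890)
```

Because only within-class branch fractions enter the formula, rebalancing the class priors (e.g. by sampling) does not change which split is preferred, in contrast to information gain or the Gini measure.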
