International Conference on Information and Knowledge Technology

Induction of decision trees by looking to data sequentially and using error correction rule



Abstract

Decision trees are common algorithms in machine learning. Traditionally, these algorithms build trees recursively, inspecting the data at each step to induce part of the tree. However, decision trees are known for their instability and high variance in error. In this paper, a solution that adds an error correction rule to a traditional decision tree algorithm is examined; specifically, an algorithm we call ECD3 is introduced. ECD3 inspects data sequentially in an iterative manner and updates the tree only when it encounters an erroneous observation. This method was first proposed by Utgoff but not implemented. In this paper, the method is developed and several experiments are performed to evaluate it. We found that in most cases the performance of ECD3 is comparable to that of its predecessors, but ECD3 offers some benefits over them. First, its trees are significantly smaller. Second, on average, the variance of error in ECD3 is lower. Furthermore, ECD3 automatically chooses part of the data for induction of the tree and sets the rest aside; this capability can be exploited for prototype selection in various learning algorithms. To explain these observations, we use inductive bias and margin definitions in our theoretical analysis. We introduce a new definition of margin in ordinary decision trees based on the shape, size, and splitting criteria of trees, and show how ECD3 expands the margins and enhances precision on test data.
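To make the error-correction idea in the abstract concrete, below is a minimal, hypothetical sketch of that loop: observations are examined one at a time, and the tree is rebuilt only when the current tree misclassifies an observation, so the retained subset acts as the selected "prototypes". This is not the paper's ECD3 implementation; the function name error_driven_induction, the use of scikit-learn's DecisionTreeClassifier, and the retrain-from-subset strategy (instead of Utgoff-style incremental tree updates) are all assumptions made for illustration.

```python
# Hypothetical sketch of error-correction-driven tree induction (not the
# paper's ECD3): scan observations sequentially and rebuild the tree only
# when the current tree misclassifies an observation.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

def error_driven_induction(X, y, seed=0):
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))      # one sequential pass in random order
    selected = [order[0]]                # start from a single observation
    tree = DecisionTreeClassifier(random_state=seed).fit(X[selected], y[selected])
    for i in order[1:]:
        if tree.predict(X[i:i + 1])[0] != y[i]:   # erroneous observation found
            selected.append(i)                    # keep it as a "prototype"
            tree = DecisionTreeClassifier(random_state=seed).fit(
                X[selected], y[selected])         # rebuild tree from the subset
    return tree, selected

X, y = load_iris(return_X_y=True)
tree, selected = error_driven_induction(X, y)
print(f"kept {len(selected)} of {len(X)} observations, "
      f"tree has {tree.tree_.node_count} nodes")
```

In this sketch the observations never added to the subset are exactly the data "set aside", which is how the abstract's prototype-selection use could be realized.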
