Journal: Intelligent Decision Technologies

From centralized to distributed decision tree induction using CHAID and Fisher's linear discriminant function algorithms


Abstract

Decision tree-based classification is a popular approach to pattern recognition and data mining. Most decision tree induction methods assume that the training data reside at one central location. With the growth of distributed databases at geographically dispersed locations, methods for decision tree induction in distributed settings are gaining importance. This paper extends two well-known decision tree methods for centralized data to distributed data settings. The first method is an extension of the CHAID algorithm and generates single-feature, multi-way split decision trees. The second method is based on Fisher's linear discriminant (FLD) function and generates multi-feature binary trees. Both methods aim to generate compact trees and can handle multiple classes. The proposed extensions for distributed environments are compared with their centralized counterparts and with each other. Theoretical analysis and experimental tests demonstrate the effectiveness of the extensions. In addition, the side-by-side comparison highlights the advantages and deficiencies of these methods under different settings of the distributed environment.
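
The abstract does not say how the distributed CHAID extension exchanges information between sites, so the following is only an illustrative sketch and not the paper's algorithm. It assumes horizontally partitioned data, where every site holds different records with the same features. Under that assumption, each site can send a small feature-by-class count table instead of raw records, and a coordinator can sum the tables and compute the same Pearson chi-square statistic that a centralized CHAID split test would see. The names `merge_tables` and `chi_square` and the table layout are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (feature_value, class_label) -> record count at one site.
Table = Dict[Tuple[str, str], int]


def merge_tables(local_tables: List[Table]) -> Table:
    """Sum the per-site count tables into one global count table."""
    merged: Table = {}
    for table in local_tables:
        for key, count in table.items():
            merged[key] = merged.get(key, 0) + count
    return merged


def chi_square(table: Table) -> float:
    """Pearson chi-square statistic of a feature-by-class count table."""
    row_totals: Dict[str, int] = {}
    col_totals: Dict[str, int] = {}
    for (value, label), count in table.items():
        row_totals[value] = row_totals.get(value, 0) + count
        col_totals[label] = col_totals.get(label, 0) + count

    total = sum(row_totals.values())
    if total == 0:
        return 0.0  # no records anywhere, so there is nothing to test

    stat = 0.0
    for value, row_sum in row_totals.items():
        for label, col_sum in col_totals.items():
            expected = row_sum * col_sum / total
            if expected == 0:
                continue  # skip empty rows or columns
            observed = table.get((value, label), 0)
            stat += (observed - expected) ** 2 / expected
    return stat
```

Because counts are additive, `chi_square(merge_tables(tables))` equals the statistic computed on the pooled records, while each site transmits only one small table per candidate feature. Category merging and significance testing, which full CHAID also performs, are left out of this sketch.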
