Data Mining Workshops, ICDMW, 2008 IEEE International Conference on

Chi-Square Test Based Decision Trees Induction in Distributed Environment


Abstract

Decision tree-based classification is a popular approach to pattern recognition and data mining. Most decision tree induction methods assume that the training data reside at one central location. Given the growth of distributed databases at geographically dispersed locations, methods for decision tree induction in distributed settings are gaining importance. This paper describes a distributed learning algorithm that extends the original (centralized) CHAID algorithm to a distributed setting. The distributed algorithm generates exactly the same results as its centralized counterpart. For completeness, a distributed quantization method is proposed so that continuous data can also be processed by the algorithm. Experimental results on several well-known data sets are presented and compared with decision trees generated by CHAID on centrally stored data.
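The abstract does not say how the distributed version reproduces the centralized results. One plausible reading, offered here as an assumption rather than as the paper's actual method, is that Pearson's chi-square statistic depends only on the contingency counts of (attribute value, class label) pairs, and those counts add up across sites. The minimal Python sketch below illustrates that property; the function names `local_counts`, `merge_counts`, and `chi_square` and the toy data are hypothetical and do not come from the paper.

```python
from collections import Counter

def local_counts(rows):
    """Count (attribute value, class label) pairs held at one site."""
    return Counter(rows)

def merge_counts(tables):
    """Sum per-site contingency counts; no raw records are exchanged."""
    total = Counter()
    for table in tables:
        total.update(table)
    return total

def chi_square(counts):
    """Pearson chi-square statistic of a non-empty contingency table."""
    row_tot, col_tot, n = Counter(), Counter(), 0
    for (value, label), k in counts.items():
        row_tot[value] += k
        col_tot[label] += k
        n += k
    stat = 0.0
    # Sorted iteration keeps the floating-point sum order deterministic.
    for value in sorted(row_tot):
        for label in sorted(col_tot):
            expected = row_tot[value] * col_tot[label] / n
            observed = counts.get((value, label), 0)
            stat += (observed - expected) ** 2 / expected
    return stat

# Toy data: two sites, each holding (attribute value, class label) records.
site_a = [("sunny", "no"), ("sunny", "no"), ("rain", "yes")]
site_b = [("rain", "yes"), ("sunny", "yes"), ("rain", "no")]

distributed = chi_square(merge_counts([local_counts(site_a), local_counts(site_b)]))
centralized = chi_square(local_counts(site_a + site_b))
print(distributed == centralized)
```

Because the merged table equals the table built from the pooled records, any split decision driven by this statistic would be the same in both settings. CHAID's category merging, significance adjustment, and the paper's distributed quantization step are not modeled here.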
