Conference: International Conference on Recent Trends in Information Technology

A High Speed Decision Tree Classifier Algorithm for Huge Dataset

Abstract

Knowledge discovery is an important tool that lets a business transform data into useful information and thereby increase revenue. Data mining techniques support automatic exploration of data: they attempt to classify the patterns and trends in the data and to infer decision rules from those patterns. Classification, a supervised machine learning procedure, is an important function of data mining. The scalability and efficiency of a classifier algorithm become a major concern when the dataset is large and the algorithm must scan it many times. In this paper, we present a scalable decision tree algorithm that classifies large datasets at high processing speed and requires only one scan over the dataset. It overcomes a drawback of the RainForest algorithm, which addresses the scalability issue but requires a pass over the dataset at each level of decision tree construction. The proposed algorithm significantly reduces the I/O cost and sorts numerical attributes only once, which leads to shorter running time. According to the experimental results, our algorithm runs in less time than the RainForest algorithm. It can also be used with any attribute selection method, which allows the accuracy of the decision tree to be improved.
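
The abstract does not give the algorithm's details, so the following is only an illustrative sketch of the two ideas it names: gathering class counts in a single scan and sorting a numerical attribute once. It is not the authors' method. The count tables are modeled on the AVC-sets (attribute value, class label counts) used by RainForest; the function names, the Gini criterion, and the toy `age` data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def build_avc_sets(rows, labels):
    """Single scan: count how often each (attribute, value) pair occurs per class."""
    avc = defaultdict(lambda: defaultdict(Counter))
    for row, label in zip(rows, labels):
        for attr, value in row.items():
            avc[attr][value][label] += 1
    return avc

def gini(counts):
    """Gini impurity of a class-count table (assumed selection criterion)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

def best_numeric_split(value_counts):
    """Sort the distinct values once, then sweep candidate thresholds
    while maintaining running class counts for the left partition."""
    values = sorted(value_counts)
    total = Counter()
    for v in values:
        total.update(value_counts[v])
    n = sum(total.values())
    left = Counter()
    best = (float("inf"), None)  # (weighted impurity, threshold)
    for lo, hi in zip(values, values[1:]):
        left.update(value_counts[lo])
        right = total - left
        n_left = sum(left.values())
        score = (n_left * gini(left) + (n - n_left) * gini(right)) / n
        if score < best[0]:
            best = (score, (lo + hi) / 2)
    return best

# Hypothetical toy data: one numerical attribute, two classes.
rows = [{"age": 25}, {"age": 30}, {"age": 45}, {"age": 50}]
labels = ["no", "no", "yes", "yes"]

avc = build_avc_sets(rows, labels)          # the only pass over the records
score, threshold = best_numeric_split(avc["age"])
print(score, threshold)
```

After the single scan, split evaluation works only on the count tables, never on the raw records. Because the tables are indexed per attribute, any selection measure computed from class counts (information gain, gain ratio, and so on) could replace `gini` here.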
