Conference: International Conference on Materials Science and Information Technology

A New Data Classification Algorithm for Data-Intensive Computing Environments


Abstract

To improve the scalability of data processing and the data availability that data mining techniques face in data-intensive computing, this paper presents a new tree learning method. By introducing MapReduce, the tree learning method, which is based on SPRINT, scales well when handling large datasets. Moreover, we define the process of finding the split point as a series of distributed computations, each of which is implemented with the MapReduce model. A new data structure called the class distribution table is introduced to assist the calculation of histograms. Experiments and analysis of the results show that the algorithm has strong data mining processing capabilities for data-intensive computing environments.
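
The abstract does not give the details of the split-point computation, so the following is only a minimal single-process sketch of the general idea, not the authors' implementation. It assumes the Gini index as the split criterion (the criterion used by the original SPRINT algorithm) and assumes the class distribution table maps each attribute value to its per-class counts. The names `map_partition`, `merge_tables`, and `best_split` are hypothetical, and plain Python `map`/`reduce` stands in for a real MapReduce framework.

```python
from collections import Counter, defaultdict
from functools import reduce


def map_partition(records):
    """Map step (assumed): count (value, label) pairs in one data partition."""
    table = defaultdict(Counter)
    for value, label in records:
        table[value][label] += 1
    return table


def merge_tables(a, b):
    """Reduce step (assumed): merge two partial class distribution tables."""
    merged = defaultdict(Counter)
    for table in (a, b):
        for value, counts in table.items():
            merged[value].update(counts)
    return merged


def gini(counts, total):
    """Gini impurity of a class histogram holding `total` records."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_split(table):
    """Scan candidate thresholds in sorted order, keeping a histogram of the
    records below and above the threshold, and return (score, threshold)
    for the lowest weighted Gini index."""
    above = Counter()
    for counts in table.values():
        above.update(counts)
    below = Counter()
    n = sum(above.values())
    best = (float("inf"), None)
    values = sorted(table)
    for lo, hi in zip(values, values[1:]):
        # Move all records with value `lo` from the "above" side to "below".
        below.update(table[lo])
        above.subtract(table[lo])
        n_below = sum(below.values())
        n_above = n - n_below
        score = (n_below / n) * gini(below, n_below) \
            + (n_above / n) * gini(above, n_above)
        if score < best[0]:
            best = (score, (lo + hi) / 2)  # midpoint as candidate threshold
    return best


if __name__ == "__main__":
    # Two partitions of (attribute value, class label) records.
    partitions = [[(1, "a"), (2, "a")], [(3, "b"), (4, "b")]]
    table = reduce(merge_tables, map(map_partition, partitions))
    print(best_split(table))  # prints (0.0, 2.5)
```

Because each partition's table can be built independently and the merge is associative, the counting step could be distributed across workers, which is the property a MapReduce formulation relies on.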
