Journal: 《计算机工程与设计》 (Computer Engineering and Design)

Mining with the FIUT Algorithm Based on Parallel Entropy in a Hadoop Environment

         

Abstract

To address the low efficiency of traditional frequent-itemset mining algorithms, a parallel algorithm named Balanced_MapReduce_FIUT (BMR-FIUT), based on the Hadoop platform, is proposed. Frequent itemsets are mined using the frequent items ultrametric tree (FIU-Tree) structure, which avoids the defects of the traditional algorithms. The decomposition process of the FIUT algorithm is modified so that it suits parallel computation under the MapReduce framework. Parallel entropy is used as the load-balance measure for the cluster, so that the system distributes data across nodes as evenly as is reasonable. Experimental results show that BMR-FIUT effectively reduces load skew across nodes during parallelization, outperforms the existing PFP-Growth algorithm, and is suitable for mining massive data sets.
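The abstract does not give the formula for parallel entropy, so the following is only a minimal sketch of how an entropy-based load-balance measure could look. It assumes the measure is the Shannon entropy of each node's share of the total load, normalized by log n so that 1.0 means perfectly even and 0.0 means all load on one node. The function name `load_balance_entropy` and the normalization are illustrative assumptions, not the paper's definition.

```python
import math


def load_balance_entropy(loads):
    """Normalized Shannon entropy of per-node load shares, in [0, 1].

    ASSUMPTION: this is an illustrative stand-in for "parallel entropy";
    the source abstract does not define the exact formula.
    """
    n = len(loads)
    total = sum(loads)
    if n < 2 or total <= 0:
        # A single node, or no load at all, is treated as trivially balanced.
        return 1.0
    entropy = 0.0
    for load in loads:
        if load > 0:  # a zero share contributes nothing to the sum
            share = load / total
            entropy -= share * math.log(share)
    # Dividing by log(n), the maximum possible entropy, maps the value to [0, 1].
    return entropy / math.log(n)


if __name__ == "__main__":
    print(load_balance_entropy([10, 10, 10, 10]))  # even distribution
    print(load_balance_entropy([25, 5, 5, 5]))     # skewed distribution
    print(load_balance_entropy([40, 0, 0, 0]))     # all load on one node
```

Under this assumption, a scheduler could compare candidate data assignments and prefer the one with the higher value, which is one way to read the abstract's statement that the measure guides data distribution across nodes.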
