Journal of Southwest Jiaotong University (西南交通大学学报)

A Distributed Algorithm for Mining Frequent Closed Patterns

Abstract

To improve data mining efficiency, this paper proposes PFCI-Miner, a distributed algorithm for mining frequent closed patterns. The algorithm distributes tasks in a master-slave fashion: the master processor partitions the mining task by sending out a proposed prefix path table (PrePthx); each slave processor mines its local frequent closed patterns with the help of a proposed storage tree (Trac-tree); and the master processor finally mines the global frequent closed patterns. A star topology confines all data communication to the links between the master processor and the slave processors, so the slave processors neither exchange data nor need to synchronize with one another. Experiments on a synthetic data set and the mushroom data set, run in a distributed environment of three PCs, show that PFCI-Miner improves execution efficiency over the DP-FP, AFCIM (adaptive frequent closed itemsets mining model), and DFCIM (distributed frequent closed itemsets mining) algorithms by an average of 43.66%, 42.17%, and 53.48% and of 51.86%, 47.62%, and 62.78%, respectively.
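The master-slave division of work described in the abstract can be illustrated with a small sketch. It is not PFCI-Miner: the abstract does not define the layout of PrePthx or Trac-tree, so both are replaced here by assumptions. Each task is a plain tuple holding a one-item prefix and the transactions that contain it, and each worker uses a simple depth-first enumeration instead of a tree. The names `mine_prefix`, `mine_closed`, and `drop_non_closed` are hypothetical, and a thread pool merely stands in for the slave processors. What the sketch does keep is the star-shaped data flow: workers receive a task from the master, return local frequent closed patterns to it, and never talk to each other.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def drop_non_closed(patterns):
    """Keep itemsets that have no proper superset of equal support in `patterns`."""
    return {
        x: s for x, s in patterns.items()
        if not any(x < y and s == sy for y, sy in patterns.items())
    }


def mine_prefix(task):
    """Worker step (assumed): mine frequent itemsets whose first item is `prefix`."""
    prefix, later_items, projected, min_sup = task
    found = {}

    def grow(itemset, candidates, rows):
        # `rows` are exactly the transactions containing `itemset`.
        found[itemset] = len(rows)
        for i, item in enumerate(candidates):
            sub = [t for t in rows if item in t]
            if len(sub) >= min_sup:
                grow(itemset | {item}, candidates[i + 1:], sub)

    grow(frozenset([prefix]), later_items, projected)
    return drop_non_closed(found)  # local frequent closed patterns


def mine_closed(transactions, min_sup, workers=3):
    """Master step (assumed): split by prefix item, collect, filter globally."""
    transactions = [frozenset(t) for t in transactions]
    counts = Counter(item for t in transactions for item in t)
    order = sorted(i for i, c in counts.items() if c >= min_sup)
    # One task per frequent item; a stand-in for the paper's PrePthx table.
    tasks = [
        (p, order[k + 1:], [t for t in transactions if p in t], min_sup)
        for k, p in enumerate(order)
    ]
    merged = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Workers only return results to the master (star topology).
        for local in pool.map(mine_prefix, tasks):
            merged.update(local)
    return drop_non_closed(merged)  # global frequent closed patterns


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"a"}]
    for itemset, sup in sorted(mine_closed(data, 2).items(), key=lambda kv: -kv[1]):
        print(sorted(itemset), sup)
```

With a minimum support of 2, the four toy transactions yield the closed patterns {a} (support 4), {a, b} (support 3), and {a, b, c} (support 2). The final filter in the master is needed because a pattern such as {b} is closed within its own prefix task yet has the equal-support superset {a, b} in another task.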
