Source: 《计算机学报》 (Chinese Journal of Computers)

A Global Closed Frequent Itemset Mining Algorithm Based on the Distribution of Frequent Concept Direct Products

         

Abstract

As distributed computing environments become widespread, traditional centralized data mining algorithms based on concept lattices cannot take full advantage of distributed computing resources to speed up concept lattice construction, and this limits the performance of the mining algorithms. This paper first analyzes further the inherent parallelism of the apposition assembly of Iceberg concept lattices. It then takes the set consisting of a frequent concept direct product and its lower cover as the minimal unit of computation, so that the apposition assembly can be decomposed into units that are scattered, processed in a distributed fashion, and finally integrated into a global Iceberg concept lattice. The distributed assembly procedure is proved correct theoretically. On this basis, a new algorithm is proposed for mining global closed frequent itemsets in heterogeneous distributed computing environments. The algorithm exploits two properties of the Iceberg concept lattice, its semi-lattice structure and its suitability for apposition assembly, and can therefore make full use of the computing resources available in a distributed environment. In experiments it performs well on both dense and sparse heterogeneous distributed data sets.
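The abstract does not give the algorithm itself, so the following is only a minimal brute-force sketch of the general idea of combining locally computed closed itemsets into a global result. It assumes two sites that hold disjoint batches of transactions over the same item universe, and it relies on a standard fact: a global closed itemset is the intersection of one closed itemset from each site, and its global support is the sum of its local supports. The names `closed_itemsets`, `local_support`, and `merge`, as well as the sample data, are hypothetical and not taken from the paper; the minimal computing units, lower covers, and scheduling described in the abstract are not modeled.

```python
from itertools import product


def closed_itemsets(transactions, items):
    """Map every closed itemset of a transaction list to its support (brute force)."""
    universe = frozenset(items)
    closed = {universe}
    # Closed itemsets are exactly the intersections of sets of transactions;
    # the full item set stands for the empty intersection.
    for t in map(frozenset, transactions):
        closed |= {c & t for c in closed}
    return {c: sum(c <= frozenset(t) for t in transactions) for c in closed}


def local_support(x, local):
    """Support of x at one site, read off that site's closed itemsets.

    The smallest closed superset of x has the same support as x, and it is
    the closed superset with the largest support.
    """
    return max(s for c, s in local.items() if x <= c)


def merge(local_a, local_b, min_support):
    """Combine two sites' closed itemsets into the global closed frequent itemsets."""
    merged = {}
    # Walk the direct product of the two local families (assumed, simplified model).
    for ca, cb in product(local_a, local_b):
        x = ca & cb
        s = local_support(x, local_a) + local_support(x, local_b)
        if s >= min_support:
            merged[x] = s
    return merged


# Hypothetical sample data: two sites, same item universe.
items = "abcd"
site_a = [{"a", "b", "c"}, {"a", "b"}, {"b", "d"}]
site_b = [{"a", "c"}, {"a", "b", "c"}, {"c", "d"}]

merged = merge(closed_itemsets(site_a, items), closed_itemsets(site_b, items), 2)
for itemset, support in sorted(merged.items(), key=lambda kv: sorted(kv[0])):
    print("".join(sorted(itemset)) or "(empty)", support)
```

Each site here keeps all of its closed itemsets rather than only the locally frequent ones, because an itemset that is frequent globally may be infrequent at a single site. The support threshold is applied only after the local supports are added.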
