Conference: International Conference on Cloud Computing and Big Data Analytics

Research on Parallelization of Frequent Itemsets Mining Algorithm

Abstract

FP-growth is a depth-first mining algorithm based on recursion and pattern growth. However, its recursion can incur large time and space costs. This paper therefore proposes an improved non-recursive serial algorithm, NRFP-growth, and a parallel algorithm, GPFP-growth. NRFP-growth introduces a data structure called FP-array to store the data set and uses a structure called ItemPoss-map to mine frequent itemsets. GPFP-growth builds on NRFP-growth and uses the GPU to accelerate frequent itemset mining. To evaluate the improved algorithms, the paper selects four data sets with different characteristics and uses the classical serial algorithm as the baseline, measuring the time and space performance of the improved serial algorithm as well as the speedup and scalability of the parallel algorithm.
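The abstract does not define FP-array or ItemPoss-map, so the following is not NRFP-growth itself. It is a minimal sketch of the general idea of depth-first frequent itemset mining without recursion, using a swapped-in technique: an explicit stack over a vertical layout that maps each item to the positions (indices) of the transactions containing it. The function name `mine_frequent_itemsets`, the `positions` map, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def mine_frequent_itemsets(transactions, min_support):
    """Return {frozenset(itemset): support count}, using an explicit stack.

    Illustrative stand-in only: a vertical (item -> transaction positions)
    layout, not the paper's FP-array or ItemPoss-map.
    """
    # Hypothetical vertical layout: item -> set of transaction indices.
    positions = defaultdict(set)
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            positions[item].add(tid)

    # Keep frequent single items in a fixed order so that every itemset
    # is generated exactly once.
    items = sorted(i for i, tids in positions.items() if len(tids) >= min_support)

    result = {}
    # Each stack entry: (prefix, positions shared by the prefix, next item index).
    stack = [((), None, 0)]
    while stack:
        prefix, prefix_tids, start = stack.pop()
        for k in range(start, len(items)):
            item = items[k]
            if prefix_tids is None:
                tids = positions[item]
            else:
                tids = prefix_tids & positions[item]
            if len(tids) >= min_support:
                itemset = prefix + (item,)
                result[frozenset(itemset)] = len(tids)
                # Defer extending this itemset instead of recursing.
                stack.append((itemset, tids, k + 1))
    return result


if __name__ == "__main__":
    sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in sorted(
        mine_frequent_itemsets(sample, 3).items(),
        key=lambda kv: (len(kv[0]), sorted(kv[0])),
    ):
        print(sorted(itemset), count)
```

The explicit stack bounds the call depth regardless of how long the frequent itemsets grow, which is the kind of cost a non-recursive formulation aims to avoid. Intersecting position sets for different prefixes is independent work, which hints at why such a formulation might be easier to map onto a GPU, though how GPFP-growth actually does so is not stated in the abstract.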
