Journal: 《西安文理学院学报(自然科学版)》

Research on the Parallelization of a Blocked Compressed-Matrix Apriori Algorithm Based on MapReduce

         

Abstract

The classic Apriori algorithm must scan the database repeatedly, which makes it unsuitable for large-scale data. To address this problem, this paper proposes an improved Apriori algorithm. The algorithm is based on relational operations over Boolean vectors: after the transaction database is scanned, it is transformed into a compressed matrix. Under the MapReduce framework, the compressed matrix is divided into blocks, and each block is processed in parallel. The sub-compressed matrices are used to quickly compute all candidate itemsets, from which the frequent k-itemsets are generated, reducing the time complexity of the Apriori algorithm.
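The abstract does not give the details of the algorithm, so the following is only a minimal single-process sketch of the general idea, not the authors' implementation. It assumes that the Boolean matrix has one row per transaction and one column per item, that blocks are formed by splitting rows, that the "map" step counts candidate support inside one block with a Boolean AND across item columns, and that the "reduce" step sums those counts. The compression rule used here (drop rows with fewer than k active items, keep only columns that appear in frequent itemsets) is likewise an assumption. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_matrix(transactions, items):
    # One Boolean row per transaction, one column per item (assumed layout).
    return [[item in t for item in items] for t in transactions]


def split_blocks(rows, block_size):
    # Partition the matrix by rows; each block stands in for one map task.
    return [rows[i:i + block_size] for i in range(0, len(rows), block_size)]


def map_block(block, candidates):
    # "Map": local support of a candidate is the number of rows in which
    # every one of its columns is True (a Boolean AND over the columns).
    return {c: sum(all(row[j] for j in c) for row in block) for c in candidates}


def reduce_counts(partials):
    # "Reduce": sum the per-block counts for each candidate.
    totals = defaultdict(int)
    for partial in partials:
        for candidate, count in partial.items():
            totals[candidate] += count
    return dict(totals)


def frequent_itemsets(transactions, min_support, block_size=2):
    items = sorted({i for t in transactions for i in t})
    rows = build_matrix(transactions, items)  # the database is read only once
    columns = list(range(len(items)))
    frequent_prev = set()
    result = {}
    k = 1
    while True:
        if k == 1:
            candidates = [(j,) for j in columns]
        else:
            # Apriori pruning: every (k-1)-subset must already be frequent.
            candidates = [
                c for c in combinations(columns, k)
                if all(s in frequent_prev for s in combinations(c, k - 1))
            ]
        if not candidates:
            break
        # Assumed compression: a row with fewer than k active items
        # cannot support any k-itemset, so it is dropped.
        rows = [r for r in rows if sum(r[j] for j in columns) >= k]
        totals = reduce_counts(
            map_block(block, candidates) for block in split_blocks(rows, block_size)
        )
        frequent_prev = {c for c in candidates if totals.get(c, 0) >= min_support}
        if not frequent_prev:
            break
        for c in frequent_prev:
            result[frozenset(items[j] for j in c)] = totals[c]
        # Keep only columns that still occur in some frequent itemset.
        columns = sorted({j for c in frequent_prev for j in c})
        k += 1
    return result
```

In a real MapReduce job the calls to `map_block` would run on separate workers and `reduce_counts` would run in the reducers; here they run sequentially only to show the data flow.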
