
A novel process-based association rule approach through maximal frequent itemsets for big data processing



Abstract

Mining maximal frequent itemsets in big data processing has become a hot research topic. Most previous work on big data processing analyzes the data directly with existing approaches, which causes redundant computation, high time complexity, and large storage requirements. To solve these problems, this paper proposes HMAM, a Heuristic MapReduce-based Association rule approach through Maximal frequent itemsets mining. The main idea is as follows. First, operating directly on the transaction database, we allocate transactions to different processing nodes and group all transactions according to dimension. Then, we screen the most frequent transactions from each transaction set using Bitmap-Sort and obtain the best-transaction-set by aggregating the transaction-elects of all transaction sets. The current candidate maximal frequent itemsets are acquired by removing sub-transactions according to the inclusion relations among the items in the best-transaction-set. At the same time, each subset of sub-transactions in the candidate maximal frequent itemsets is discarded from all transaction sets. The final candidate maximal frequent itemsets are obtained by iterating until each transaction set is empty. Finally, we obtain the maximal frequent itemsets by applying the minimum support threshold. The experimental results demonstrate that, compared with existing approaches, HMAM avoids producing the large number of candidate itemsets that result from join operations, accelerates the mining of maximal frequent itemsets, and improves resource utilization at the same time.

Highlights

  • Allocate transactions to different nodes and group them in terms of dimension.
  • Screen the most frequent transactions from each transaction set using Bitmap-Sort.
  • Obtain maximal frequent itemsets by employing the minimum support threshold.
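The steps in the abstract can be illustrated with a small single-machine sketch. This is one possible reading of the abstract, not the authors' HMAM implementation: it assumes "dimension" means the number of items in a transaction, it replaces Bitmap-Sort with a plain `Counter`, it does no MapReduce partitioning, and the function names (`candidate_maximal_itemsets`, `filter_by_support`) are hypothetical.

```python
from collections import Counter, defaultdict


def candidate_maximal_itemsets(transactions):
    """Iteratively collect candidate itemsets (simplified, single machine)."""
    # Group distinct transactions by dimension (assumed: number of items)
    # and count how often each one occurs.
    groups = defaultdict(Counter)
    for t in transactions:
        items = frozenset(t)
        if items:  # ignore empty transactions
            groups[len(items)][items] += 1

    candidates = set()
    while any(groups.values()):
        # Take the most frequent transaction of every non-empty group;
        # together they play the role of the "best-transaction-set".
        best = [c.most_common(1)[0][0] for c in groups.values() if c]
        # Remove sub-transactions: keep only those not strictly
        # contained in another member of the best set.
        maximal = [b for b in best if not any(b < other for other in best)]
        candidates.update(maximal)
        # Discard every transaction that is a subset of a new candidate.
        for counter in groups.values():
            for t in list(counter):
                if any(t <= m for m in maximal):
                    del counter[t]

    # Candidates from later rounds may contain earlier ones; drop those.
    return {c for c in candidates if not any(c < other for other in candidates)}


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= frozenset(t))


def filter_by_support(candidates, transactions, min_support):
    """Keep only candidates that meet the minimum support threshold."""
    return {c for c in candidates if support(c, transactions) >= min_support}


transactions = [
    {"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"a", "c"},
    {"d"}, {"d"}, {"b", "d"},
]
candidates = candidate_maximal_itemsets(transactions)
frequent = filter_by_support(candidates, transactions, min_support=2)
```

In this example the candidates are {a, b, c} and {b, d}, and only {a, b, c} survives a minimum support of 2. The final step here is a plain filter: it does not search the subsets of a rejected candidate (such as {d}) for itemsets that are themselves frequent, and the abstract does not say how HMAM handles that case.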

Bibliographic information

  • Source
    Future Generation Computer Systems | 2018, Issue 4 | pp. 414-424 | 11 pages
  • Author affiliations

    College of Computer Science and Technology, Jilin University;

    College of Computer Science and Technology, Jilin University;

    College of Computer Science and Technology, Jilin University, Zhuhai Laboratory of Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Zhuhai College of Jilin University;

    College of Computer Science and Technology, Jilin University, Zhuhai Laboratory of Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Zhuhai College of Jilin University;

    Zhuhai Laboratory of Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Zhuhai College of Jilin University;

    School of Computer Science and Engineering, ChangChun University of Technology, High-Assurance Software Laboratory, INESC TEC & University of Minho;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    Maximal frequent itemsets; Big data; MapReduce; Frequent transactions; Best-transaction-set;

