Journal: Quality Control, Transactions

D-GENE: Deferring the GENEration of Power Sets for Discovering Frequent Itemsets in Sparse Big Data



Abstract

Sparseness is a distinctive aspect of the big data generated by numerous applications at present. Furthermore, real-world sparse datasets contain many similar records. Based on the Iterative Trimmed Transaction Lattice (ITTL), the recently proposed TRICE algorithm learns frequent itemsets efficiently from sparse datasets. TRICE stores similar transactions once and afterward eliminates the infrequent part of each distinct transaction. However, removing the infrequent part of two or more distinct transactions may result in similar trimmed transactions. TRICE repeatedly generates ITTLs for these similar trimmed transactions, which induces redundant computations and ultimately degrades runtime efficiency. This paper presents D-GENE, a technique that optimizes TRICE by introducing a deferred ITTL generation mechanism. D-GENE suspends ITTL generation until the transaction pruning phase is complete. This deferral strategy enables D-GENE to generate the ITTL of similar trimmed transactions only once. Experimental results show that, by avoiding the redundant computations, D-GENE achieves better runtime efficiency. D-GENE outperforms TRICE, FP-growth, and optimized versions of the SaM and RElim algorithms, especially when the difference between the number of distinct transactions and the number of distinct trimmed transactions is large.
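The deferral idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the ITTL data structure, so plain subset enumeration with `itertools.combinations` stands in for ITTL generation, and the function name `frequent_itemsets_deferred`, its parameters, and its return shape are all hypothetical. Because the enumeration is exponential in transaction length, the sketch is only meaningful for short trimmed transactions.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets_deferred(transactions, min_support):
    """Illustrative sketch: trim first, merge duplicates, then enumerate subsets once.

    Plain power-set enumeration replaces the ITTL structure (an assumption;
    the abstract does not describe ITTL internals).
    """
    # Phase 1: store identical transactions once, with a multiplicity.
    distinct = Counter(frozenset(t) for t in transactions)

    # Count the support of each single item over the whole dataset.
    item_support = Counter()
    for items, count in distinct.items():
        for item in items:
            item_support[item] += count
    frequent_items = {i for i, s in item_support.items() if s >= min_support}

    # Phase 2: trim infrequent items; transactions that become identical
    # after trimming are merged and their multiplicities are summed.
    trimmed = Counter()
    for items, count in distinct.items():
        kept = items & frequent_items
        if kept:
            trimmed[kept] += count

    # Phase 3 (deferred): enumerate the non-empty subsets of each distinct
    # trimmed transaction exactly once, weighted by its multiplicity.
    support = Counter()
    for items, count in trimmed.items():
        ordered = sorted(items)
        for size in range(1, len(ordered) + 1):
            for subset in combinations(ordered, size):
                support[frozenset(subset)] += count

    frequent = {s: c for s, c in support.items() if c >= min_support}
    return frequent, len(distinct), len(trimmed)
```

For example, calling the function on the five transactions `["a","b","x"]`, `["a","b","y"]`, `["a","b","z"]`, `["a","b","x"]`, `["a","c"]` with `min_support=3` yields four distinct transactions but only two distinct trimmed transactions (`{a, b}` and `{a}`), so subsets are enumerated twice rather than four times. That gap between the two counts is the situation in which the abstract reports the largest gain.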
