Future Generation Computer Systems

Mining high occupancy itemsets

Abstract

Frequent itemset mining has been studied extensively in data mining for more than two decades because of its numerous applications. However, the classic support-based mining framework used by most previous studies is not suitable for some real-world applications, such as travel landscape recommendation, where occupancy, in addition to support, plays a key role in evaluating the interestingness of an itemset. In this paper, we propose a new kind of task based on occupancy, namely high occupancy mining, by introducing occupancy into the support-based mining framework. An efficient algorithm, HEP (abbreviation for High Efficient algorithm for mining high occupancy itemsets), is developed to discover all high occupancy itemsets. HEP uses a structure, named occupancy-list, to store the occupancy information about an itemset and employs an iterative level-wise approach to mine high occupancy itemsets via a pruning strategy based on an upper bound of occupancy. Extensive experiments on both synthetic and real datasets show that HEP is efficient for mining high occupancy itemsets and is at least one order of magnitude faster than the baseline algorithm. © 2019 Elsevier B.V. All rights reserved.
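The abstract does not define occupancy or spell out the pruning bound, so the following is only a minimal sketch of a level-wise search under stated assumptions, not the HEP algorithm itself. It assumes the occupancy of an itemset X is the sum of |X| / |T| over every transaction T that contains X, and it uses a simple bound: a superset of size l can only be supported by transactions of length at least l, each contributing l / |T|. The names `mine_high_occupancy`, `min_occupancy`, `occupancy`, and `upper_bound` are hypothetical, and the per-itemset set of transaction ids only loosely plays the role of the occupancy-list.

```python
from collections import defaultdict


def mine_high_occupancy(transactions, min_occupancy):
    """Return {itemset tuple: occupancy} for itemsets meeting min_occupancy.

    Assumption (not from the abstract): occupancy(X) is the sum of
    len(X) / len(T) over all transactions T that contain X.
    """
    sizes = [len(t) for t in transactions]

    # Map each item to the ids of the transactions that contain it.
    item_tids = defaultdict(set)
    for tid, t in enumerate(transactions):
        for item in t:
            item_tids[item].add(tid)
    items = sorted(item_tids)

    def occupancy(k, tids):
        return sum(k / sizes[tid] for tid in tids)

    def upper_bound(k, tids):
        # Assumed bound on the occupancy of any strict superset: a superset
        # of size l is only supported by transactions of length >= l.
        lengths = [sizes[tid] for tid in tids]
        best = 0.0
        for l in range(k + 1, max(lengths) + 1):
            best = max(best, sum(l / n for n in lengths if n >= l))
        return best

    result = {}
    level = {(item,): item_tids[item] for item in items}
    while level:
        next_level = {}
        for itemset, tids in level.items():
            k = len(itemset)
            occ = occupancy(k, tids)
            if occ >= min_occupancy:
                result[itemset] = occ
            # Prune: no superset of this itemset can reach the threshold.
            if upper_bound(k, tids) < min_occupancy:
                continue
            # Extend with larger items only, so each itemset is built once.
            for item in items:
                if item > itemset[-1]:
                    new_tids = tids & item_tids[item]
                    if new_tids:
                        next_level[itemset + (item,)] = new_tids
        level = next_level
    return result


transactions = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}]
found = mine_high_occupancy(transactions, min_occupancy=1.5)
```

On this toy input with a threshold of 1.5, the itemsets returned are (a), (a, b), and (b, c). The search stays complete under the assumed definition because the bound of every subset of a qualifying itemset is at least that itemset's occupancy, so no prefix of a qualifying itemset is pruned.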
