Journal: Knowledge-Based Systems

Efficient mining of high-utility itemsets using multiple minimum utility thresholds


Abstract

In data mining, high-utility itemset mining (HUIM) has recently attracted considerable attention from researchers because it takes into account many factors that are useful for decision-making by retail managers. Many algorithms have been presented for HUIM, but most of them are limited to using a single minimum utility threshold to identify high-utility itemsets (HUIs). For real-life applications, finding itemsets using a single threshold is inadequate and unfair since each item is different. Hence, the diversity or importance of each item should be considered. This paper proposes a solution to this issue by defining the novel task of HUIM with multiple minimum utility thresholds (called HUIM-MMU). This task lets users specify a different minimum utility threshold for each item to identify more useful and specific HUIs, which would generate more profit than HUIs discovered using a single minimum utility threshold. The HUI-MMU algorithm is designed to mine HUIs in a level-wise manner. The sorted downward closure (SDC) property and the least minimum utility (LMU) concept are developed to avoid a combinatorial explosion when identifying HUIs and to ensure the completeness and correctness of HUI-MMU for discovering HUIs. In addition, two improved algorithms, HUI-MMUTID and HUI-MMUTE, are presented based on the TID-index and EUCP strategies. These strategies speed up the discovery of HUIs. Substantial experiments on both real-life and synthetic datasets show that the designed algorithms can efficiently and effectively discover the complete set of HUIs in databases by considering multiple minimum utility thresholds. (C) 2016 Elsevier B.V. All rights reserved.
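
To make the task concrete, the following is a minimal brute-force sketch in Python of mining HUIs when each item has its own minimum utility threshold. It is not the HUI-MMU algorithm: it applies no SDC or LMU pruning and simply enumerates every itemset. The toy transactions, unit profits, per-item thresholds, and all function names are hypothetical. The rule that an itemset's threshold is the smallest threshold among its items is an assumption made here for illustration; the abstract does not state it.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
unit_profit = {"a": 5, "b": 2, "c": 1}  # hypothetical unit profit per item
min_util = {"a": 12, "b": 8, "c": 6}    # hypothetical per-item thresholds


def utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over transactions containing every item."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def itemset_threshold(itemset, min_util):
    """Assumed rule: the smallest per-item threshold among the itemset's items."""
    return min(min_util[item] for item in itemset)


def mine_huis(transactions, unit_profit, min_util):
    """Enumerate all itemsets and keep those that meet their own threshold."""
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, unit_profit)
            if u >= itemset_threshold(itemset, min_util):
                result[itemset] = u
    return result


huis = mine_huis(transactions, unit_profit, min_util)
print(huis)
```

On this toy data, {b} is not a HUI (utility 6 against a threshold of 8) while its superset {a, b} is (utility 9 against a threshold of 8). Utility is therefore not downward closed, which is why exhaustive enumeration like this does not scale and pruning properties such as those named in the abstract are needed.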
