
Fast algorithms for mining high-utility itemsets with various discount strategies

Abstract

In recent years, mining high-utility itemsets (HUIs) has emerged as a key topic in data mining. It consists of discovering sets of items that generate a high profit in a transactional database by considering both the purchase quantities and the unit profits of items. Many algorithms have been proposed for this task. However, most of them make the unrealistic assumption that the unit profits of items remain unchanged over time. In real life, however, the profit of an item or itemset varies as a function of cost prices, sales prices and sale strategies. Recently, a three-phase algorithm has been proposed to mine HUIs while considering that each item may have different discount strategies. However, the complete set of HUIs cannot be retrieved based on the traditional TWU model with its defined discount strategies. Moreover, it suffers from the well-known drawbacks of Apriori-based algorithms, such as maintaining a huge number of candidates in memory and repeatedly performing time-consuming database scans. In this paper, the HUI-DTP algorithm for mining HUIs when considering discount strategies of items is introduced. HUI-DTP is designed as a two-phase algorithm that mines the complete set of HUIs based on a novel downward closure property and a vertical TID-list structure. Furthermore, HUI-DMiner is an algorithm relying on a compact data structure (Positive-and-Negative Utility-list, PNU-list) and the properties of two new pruning strategies to efficiently discover HUIs without candidate generation, while considerably reducing the size of the search space. Moreover, a strategy named Estimated Utility Co-occurrence Strategy, which stores the relationships between 2-itemsets, is also applied in the improved HUI-DEMiner algorithm to speed up computation. An extensive experimental study carried out on several real-life datasets shows that the proposed algorithms outperform the previous best algorithm in terms of runtime, memory consumption and scalability.
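
As a rough illustration of the problem setting only, the sketch below computes itemset utilities when unit profits depend on a discount, then enumerates HUIs by brute force. It is not HUI-DTP, HUI-DMiner or HUI-DEMiner. The toy transactions, the `pricing` table, the "discounted price minus cost" profit model and all function names are assumptions made for this example and do not come from the paper.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchase quantity.
transactions = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"a": 1, "b": 2, "c": 1},
]

# Assumed per-item (sale_price, cost_price, discount_rate); illustrative values.
pricing = {
    "a": (10.0, 6.0, 0.0),   # no discount: profit 4.0 per unit
    "b": (8.0, 5.0, 0.25),   # 25% off: 6.0 - 5.0 = 1.0 per unit
    "c": (5.0, 4.0, 0.5),    # 50% off: 2.5 - 4.0 = -1.5 per unit (a loss)
}


def unit_profit(item):
    """Unit profit under the assumed model: discounted price minus cost."""
    price, cost, discount = pricing[item]
    return price * (1.0 - discount) - cost


def itemset_utility(itemset, transaction):
    """Utility of an itemset in one transaction (0 if not fully contained)."""
    if not all(item in transaction for item in itemset):
        return 0.0
    return sum(transaction[item] * unit_profit(item) for item in itemset)


def utility(itemset):
    """Utility of an itemset summed over the whole database."""
    return sum(itemset_utility(itemset, t) for t in transactions)


def brute_force_huis(min_utility):
    """Enumerate every itemset and keep those reaching min_utility."""
    items = sorted(pricing)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset)
            if u >= min_utility:
                result[itemset] = u
    return result


print(brute_force_huis(10))
```

With these assumed numbers, item `c` is sold at a loss, so an itemset can have a lower utility than one of its subsets. Exhaustive enumeration is exponential in the number of items, which is the cost that pruning strategies such as those named in the abstract aim to avoid.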

Publication details

  • Source
    Advanced Engineering Informatics | 2016, Issue 2 | pp. 109-126 | 18 pages
  • Author affiliations

    School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, 518055, China;

    School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, 518055, China;

    School of Natural Sciences and Humanities, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, 518055, China;

    Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, 811, Taiwan; Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, 804, Taiwan;

    Department of Computer Science, National Chiao Tung University, Hsinchu, 300, Taiwan;

  • Original format: PDF
  • Language: English
  • Keywords

    High-utility itemsets; Discount strategies; Downward closure property; Pruning strategies; PNU-list;
