
A two-phase approach to mine short-period high-utility itemsets in transactional databases

Abstract

The discovery of high-utility itemsets (HUIs) in transactional databases has attracted much interest from researchers in recent years, since it can uncover hidden information that is useful for decision making and is widely used in many domains. Nonetheless, traditional methods for high-utility itemset mining (HUIM) use the utility measure as the sole criterion for deciding which itemsets should be presented to the user. These methods ignore the timestamps of transactions and do not consider the period constraint. Hence, these algorithms often find HUIs that are profitable but seldom occur in transactions. In this paper, we address this limitation of previous methods by pushing the period constraint into the HUI mining process. A new framework called short-period high-utility itemset mining (SPHUIM) is designed to identify patterns in a transactional database that appear regularly, are profitable, and also yield a high utility under the period constraint. The aim of discovering short-period high-utility itemsets (SPHUIs) is hence to identify patterns that are interesting in terms of both period and utility. The paper proposes a baseline two-phase short-period high-utility itemset mining algorithm, SPHUI_(TP), to mine SPHUIs in a level-wise manner. Then, to reduce the search space of the SPHUI_(TP) algorithm and speed up the discovery of SPHUIs, two pruning strategies are developed and integrated into the baseline algorithm. The resulting algorithms are denoted SPHUI_(MT) and SPHUI_(TID), respectively. Extensive experiments on both real-life and synthetic datasets show that the three proposed algorithms can efficiently and effectively discover the complete set of SPHUIs, and that considering the short-period constraint together with the utility measure can greatly reduce the number of patterns found.
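To make the two constraints concrete, here is a minimal brute-force sketch in Python. It is not the paper's SPHUI_(TP), SPHUI_(MT), or SPHUI_(TID) algorithm, and it applies no pruning. The toy database, the names `utility`, `max_period`, and `brute_force_sphuis`, and the thresholds `min_util` and `max_per` are all hypothetical. The period definition used here (the largest timestamp gap between consecutive occurrences, also counting the gaps to the first and last timestamps of the database) is an assumption borrowed from periodic pattern mining; the abstract does not state the paper's exact definition.

```python
from itertools import combinations

# Hypothetical toy database: (timestamp, {item: utility in that transaction}).
# Transactions are assumed to be sorted by timestamp.
DATABASE = [
    (1, {"a": 5, "b": 2}),
    (2, {"a": 4, "c": 6}),
    (3, {"a": 3, "b": 1, "c": 2}),
    (5, {"b": 4, "c": 3}),
    (6, {"a": 6, "b": 2}),
]


def utility(itemset, database):
    """Sum the itemset's utility over every transaction that contains it."""
    return sum(
        sum(items[i] for i in itemset)
        for _, items in database
        if all(i in items for i in itemset)
    )


def max_period(itemset, database):
    """Largest timestamp gap between consecutive occurrences of the itemset.

    Assumption: the gap from the first database timestamp to the first
    occurrence, and from the last occurrence to the last database
    timestamp, are counted as periods too.
    """
    first, last = database[0][0], database[-1][0]
    stamps = [t for t, items in database if all(i in itemset_item in items for itemset_item in itemset)] \
        if False else [t for t, items in database if all(i in items for i in itemset)]
    points = [first] + stamps + [last]
    return max(b - a for a, b in zip(points, points[1:]))


def brute_force_sphuis(database, min_util, max_per):
    """Enumerate every itemset and keep those meeting both thresholds."""
    all_items = sorted({i for _, items in database for i in items})
    found = []
    for size in range(1, len(all_items) + 1):
        for itemset in combinations(all_items, size):
            if (utility(itemset, database) >= min_util
                    and max_period(itemset, database) <= max_per):
                found.append(itemset)
    return found
```

On this toy data, `brute_force_sphuis(DATABASE, min_util=10, max_per=2)` keeps only `('c',)` and `('b', 'c')`, whereas five itemsets reach a utility of at least 10 when the period is ignored. Exhaustive enumeration is exponential in the number of items, which is the cost that a level-wise search with pruning is meant to avoid.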

Bibliographic details

  • Source
    Advanced Engineering Informatics | 2017, Issue 8 | pp. 29-43 | 15 pages
  • Author affiliations

    School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China;

    School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China;

    School of Natural Sciences and Humanities, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China;

    Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, Taiwan, and Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan;

    School of Agricultural, Computational and Environmental Sciences, University of Southern Queensland, Australia;

  • Indexing information
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Data mining; High-utility itemsets; Periodic high-utility itemsets; SPHUIs; Two-phase;
