Journal: Knowledge-Based Systems

Mining high utility itemsets using extended chain structure and utility machine



Abstract

High utility itemsets are sets of items that have a high utility (e.g., a high profit or a high importance) in a transaction database. Discovering high utility itemsets has many important real-life applications, such as market basket analysis. Nonetheless, mining these patterns is time-consuming because of the huge search space and the high cost of utility computation. Most previous work is devoted to search space pruning and pays little attention to utility computation. In fact, both search space pruning and high utility itemset identification rely on computing various utilities. This paper proposes a novel algorithm named REX (Rapid itEmset eXtraction), which extends the classic d²HUP algorithm with an improved structure, a k-item utility machine, and an efficient switch strategy. The structure significantly reduces the time complexity of utility computation compared with the original structure used in d²HUP. The machine quickly merges identical transactions and applies an efficient procedure for computing the utilities of the extensions of a given itemset. The switch strategy, derived from trial and error, drastically improves performance on some databases and remains competitive with the switch strategy used in d²HUP on the others. Experimental results show that REX achieves a speedup ranging from fifty percent to three orders of magnitude over d²HUP, even though the two algorithms use identical pruning techniques, and that REX considerably outperforms state-of-the-art algorithms on real-life and synthetic databases. © 2020 Elsevier B.V. All rights reserved.
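To make the terms in the abstract concrete, the following is a minimal Python sketch of what "utility of an itemset" and "merging identical transactions" can mean. It is not the REX or d²HUP algorithm: it enumerates itemsets by brute force with no pruning, and the data, the `unit_profit` table, and all function names are hypothetical, chosen only for illustration. Utility is assumed here to be quantity times unit profit, summed over the transactions that contain the whole itemset.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 1, "b": 2},
    {"b": 1, "c": 3},
    {"a": 2, "c": 1},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def merge_identical(txns):
    """Collapse transactions with identical contents into (transaction, count) pairs."""
    counts = defaultdict(int)
    for t in txns:
        counts[frozenset(t.items())] += 1
    return [(dict(key), n) for key, n in counts.items()]


def utility(itemset, merged):
    """Sum quantity * unit profit over every transaction that contains the whole itemset."""
    total = 0
    for t, n in merged:
        if all(item in t for item in itemset):
            # A merged transaction contributes once per original copy.
            total += n * sum(t[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(txns, min_util):
    """Brute-force enumeration, exponential in the number of items; illustration only."""
    merged = merge_identical(txns)
    items = sorted(unit_profit)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = utility(itemset, merged)
            if u >= min_util:
                result[itemset] = u
    return result


print(high_utility_itemsets(transactions, min_util=15))
# {('a',): 20, ('a', 'b'): 18}
```

In this toy database the itemset {a, b} has utility 18 while its subset {b} has utility only 10, so utility is not anti-monotone. This is one reason high utility itemset mining needs dedicated pruning and repeated utility computation, the cost the abstract refers to.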
