Journal: Wireless Communications & Mobile Computing

An Efficient Algorithm for Extracting High-Utility Hierarchical Sequential Patterns


Abstract

High-utility sequential pattern mining (HUSPM) is an emerging topic in data mining, in which utility measures the importance or weight of a sequence. However, HUSPM ignores the hierarchical relations between items, which prevents it from extracting more interesting patterns. In this paper, we incorporate the hierarchical relations of items into HUSPM and propose MHUH, a two-phase algorithm and the first algorithm for high-utility hierarchical sequential pattern mining (HUHSPM). In the first phase, named Extension, we use FHUSpan, an algorithm we proposed earlier, to efficiently mine the general high-utility sequences (g-sequences). In the second phase, named Replacement, we mine the special high-utility sequences with the hierarchical relation (s-sequences) from the g-sequences as high-utility hierarchical sequential patterns. To further improve efficiency, MHUH adopts several strategies such as Reduction, FGS, and PBS, together with a novel upper bound, TSWU, which greatly reduce the search space. Extensive experiments on both real and synthetic datasets assess the performance of MHUH in terms of runtime, number of patterns, and scalability. The experiments show that MHUH efficiently extracts more interesting patterns that carry the underlying hierarchical knowledge in HUHSPM.
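
To make the motivation concrete, the following is a minimal Python sketch of how a hierarchy over items can change a sequence's utility. It is not MHUH, FHUSpan, or TSWU, none of which the abstract specifies in detail. It assumes a simplified setting in which every element of a sequence is a single item with a positive utility, and the taxonomy `PARENT`, the database `DB`, and all function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

QSequence = List[Tuple[str, int]]  # ordered (item, utility) pairs

# Hypothetical taxonomy (assumption): child item -> parent category.
PARENT: Dict[str, str] = {"cola": "drink", "tea": "drink", "chips": "snack"}

# Hypothetical toy database of two quantitative sequences (assumption).
DB: List[QSequence] = [
    [("cola", 4), ("chips", 3), ("tea", 2)],
    [("tea", 5), ("chips", 1)],
]


def is_a(item: str, target: str) -> bool:
    """Return True if item equals target or target is an ancestor of item."""
    cur: Optional[str] = item
    while cur is not None:
        if cur == target:
            return True
        cur = PARENT.get(cur)
    return False


def utility_in_sequence(pattern: List[str], seq: QSequence) -> int:
    """Maximum utility over all occurrences of pattern in seq, 0 if none."""
    # best[k] is the best utility of matching the first k pattern items,
    # or None if no such match has been seen yet.
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for item, util in seq:
        # Iterate k downwards so one element fills at most one position
        # of any single occurrence.
        for k in range(len(pattern), 0, -1):
            prev = best[k - 1]
            if prev is not None and is_a(item, pattern[k - 1]):
                cand = prev + util
                if best[k] is None or cand > best[k]:
                    best[k] = cand
    return best[-1] if best[-1] is not None else 0


def total_utility(pattern: List[str], db: List[QSequence]) -> int:
    """Sum of the pattern's utility over every sequence in the database."""
    return sum(utility_in_sequence(pattern, seq) for seq in db)


def high_utility_patterns(
    patterns: List[List[str]], db: List[QSequence], min_util: int
) -> Dict[Tuple[str, ...], int]:
    """Keep the candidate patterns whose total utility reaches min_util."""
    result: Dict[Tuple[str, ...], int] = {}
    for p in patterns:
        u = total_utility(p, db)
        if u >= min_util:
            result[tuple(p)] = u
    return result
```

With this toy data and a threshold of 10, the item-level pattern `["cola", "chips"]` has total utility 7 and is discarded, while the generalized pattern `["drink", "chips"]` has total utility 13 and is kept. This is the kind of pattern that a purely item-level HUSPM would miss; an actual HUHSPM algorithm would additionally need itemset elements and pruning bounds, which this sketch omits.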
