Conference: Advances in Knowledge Discovery and Data Mining

DELISP: Efficient Discovery of Generalized Sequential Patterns by Delimited Pattern-Growth Technology

Abstract

An active research topic in data mining is the discovery of sequential patterns, that is, finding all frequent sub-sequences in a sequence database. Most studies specify no time constraints, such as maximum or minimum gaps between adjacent elements of a pattern, so the resulting patterns may be uninteresting. In addition, a data sequence is rigidly defined as containing a pattern only when each element of the pattern is contained in a distinct element of the sequence. This limitation may miss useful patterns in some applications, because the items of an element are sometimes spread across adjoining elements within a specified time period, or time window. We therefore propose a pattern-growth approach for mining generalized sequential patterns. Our approach reduces the size of sub-databases by bounded and windowed projection techniques. Bounded projections keep only sub-sequences that are valid with respect to the time gaps, and windowed projections save non-redundant sub-sequences satisfying the sliding time window constraint. Furthermore, the delimited growth technique directly generates constraint-satisfying patterns and speeds up the growing process. Empirical evaluations show that the proposed approach has good linear scalability and outperforms the well-known GSP algorithm in the discovery of generalized sequential patterns.
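
To illustrate the constraints named in the abstract, the following Python sketch checks by brute force whether a single data sequence contains a pattern under a minimum gap, a maximum gap, and a sliding time window. It is not the DELISP algorithm and performs no projection or pattern growth. The function name `contains`, its parameters, and the exact inequality conventions (strict minimum gap, inclusive maximum gap and window, in the style commonly used for GSP) are assumptions for illustration and are not taken from the abstract.

```python
from typing import List, Set, Tuple

# A data sequence is a list of (timestamp, itemset) pairs with strictly
# increasing timestamps; a pattern is a list of itemsets. (Assumed encoding.)
DataSequence = List[Tuple[float, Set[str]]]
Pattern = List[Set[str]]


def contains(data: DataSequence, pattern: Pattern,
             min_gap: float = 0, max_gap: float = float("inf"),
             window: float = 0) -> bool:
    """Brute-force containment test under time constraints (hypothetical helper).

    Each pattern element is matched against a run of consecutive data
    elements data[l..u] such that:
      * the union of their itemsets covers the pattern element,
      * time[u] - time[l] <= window                 (sliding window),
      * time[l] - time[previous u] > min_gap        (minimum gap),
      * time[u] - time[previous l] <= max_gap       (maximum gap).
    """
    times = [t for t, _ in data]

    def search(i: int, prev_l: int, prev_u: int) -> bool:
        if i == len(pattern):
            return True  # every pattern element has been matched
        for l in range(prev_u + 1, len(data)):
            if i > 0 and times[l] - times[prev_u] <= min_gap:
                continue  # too close to the previous element
            covered: Set[str] = set()
            for u in range(l, len(data)):
                if times[u] - times[l] > window:
                    break  # run no longer fits in the time window
                if i > 0 and times[u] - times[prev_l] > max_gap:
                    break  # too far from the previous element
                covered |= data[u][1]
                if pattern[i] <= covered and search(i + 1, l, u):
                    return True
        return False

    return search(0, -1, -1)


# Example data: items "a" and "b" are bought one time unit apart, "c" much later.
example = [(1, {"a"}), (2, {"b"}), (10, {"c"})]
query = [{"a", "b"}, {"c"}]
```

With `window=0` the element `{"a", "b"}` must sit in a single data element, so `contains(example, query)` fails; with `window=1` the items may be spread over the adjoining elements at times 1 and 2, and the pattern is found unless a gap constraint rules it out.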