Journal: ACM Transactions on Database Systems

A Unified Framework for Frequent Sequence Mining with Subsequence Constraints


Abstract

Frequent sequence mining methods often make use of constraints to control which subsequences should be mined. A variety of such subsequence constraints has been studied in the literature, including length, gap, span, regular-expression, and hierarchy constraints. In this article, we show that many subsequence constraints, including and beyond those considered in the literature, can be unified in a single framework. A unified treatment allows researchers to study many types of subsequence constraints jointly (instead of each one individually) and helps to improve the usability of pattern mining systems for practitioners. In more detail, we propose a set of simple and intuitive "pattern expressions" to describe subsequence constraints and explore algorithms for efficiently mining frequent subsequences under such general constraints. Our algorithms translate pattern expressions to succinct finite-state transducers, which we use as a computational model, and simulate these transducers in a way suitable for frequent sequence mining. Our experimental study on real-world datasets indicates that our algorithms, although more general, are efficient and, when used for sequence mining with constraints previously studied in the literature, competitive with (and in some cases superior to) state-of-the-art specialized methods.
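To make the notion of a subsequence constraint concrete, the following is a minimal sketch of frequent sequence mining under a gap and a length constraint. It is a brute-force enumeration for illustration only, not the transducer-based algorithms of the article; the function names (`constrained_subsequences`, `mine`), the parameters (`max_gap`, `max_len`, `min_support`), and the toy database are all assumptions introduced here.

```python
from collections import Counter
from itertools import combinations


def constrained_subsequences(seq, max_gap, max_len):
    """Return the distinct subsequences of `seq` that have at most `max_len`
    items and skip at most `max_gap` items between consecutive picks."""
    found = set()
    for n in range(1, max_len + 1):
        for idx in combinations(range(len(seq)), n):
            # Gap constraint: number of skipped items between neighbours.
            if all(j - i - 1 <= max_gap for i, j in zip(idx, idx[1:])):
                found.add(tuple(seq[i] for i in idx))
    return found


def mine(db, min_support, max_gap, max_len):
    """Count, for each constrained subsequence, how many input sequences
    contain it, and keep those reaching `min_support`."""
    support = Counter()
    for seq in db:
        # Each input sequence contributes at most 1 to a pattern's support.
        support.update(constrained_subsequences(seq, max_gap, max_len))
    return {p: c for p, c in support.items() if c >= min_support}


# Hypothetical toy database of three input sequences.
db = [["a", "b", "c"], ["a", "c"], ["a", "x", "x", "c"]]
frequent = mine(db, min_support=2, max_gap=1, max_len=2)
relaxed = mine(db, min_support=2, max_gap=2, max_len=2)
```

With `max_gap=1`, the pattern `("a", "c")` is supported by the first two sequences only, because the third would need to skip two items; relaxing to `max_gap=2` raises its support to three. Enumerating candidates this way is exponential in `max_len`, which is one reason specialized or automaton-based methods are used in practice.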
