Conference: International Conference on Web-Age Information Management

Mining Top-k Distinguishing Sequential Patterns with Flexible Gap Constraints

Abstract

Distinguishing sequential pattern (DSP) mining has been widely employed in applications such as building classifiers and comparing or analyzing protein families. In previous studies on DSP mining, however, the gap constraints are very rigid: they are predetermined, and they are identical for all discovered patterns and at all positions within each pattern. This paper considers a more flexible way to handle gap constraints, allowing the constraints between different pairs of adjacent elements in a pattern to differ and allowing different patterns to use different gap constraints. We call the resulting patterns DSPs with flexible gap constraints. After discussing the importance of specifying and determining gap constraints flexibly in DSP mining, we present GepDSP, a heuristic mining method based on Gene Expression Programming, for mining DSPs with flexible gap constraints. Our empirical study on real-world data sets demonstrates that GepDSP is effective and efficient, and that DSPs with flexible gap constraints are more effective at capturing discriminating sequential patterns.
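
To make the notion of flexible gap constraints concrete, the following is a minimal Python sketch of how a pattern with a separate gap constraint for each pair of adjacent elements could be matched and scored. It is not GepDSP and does not reproduce the paper's definitions: the function names (`matches`, `support`, `contrast`), the reading of a gap as "number of skipped items", the support-difference score, and the toy sequences are all assumptions made for illustration.

```python
from typing import Sequence, Tuple

# Assumed convention: (min_gap, max_gap) bounds the number of items
# skipped between two adjacent pattern elements.
Gap = Tuple[int, int]


def matches(seq: Sequence[str], pattern: Sequence[str], gaps: Sequence[Gap]) -> bool:
    """Return True if `pattern` occurs in `seq`, where gaps[i] constrains
    the distance between pattern[i] and pattern[i + 1]."""
    if len(gaps) != len(pattern) - 1:
        raise ValueError("need exactly one gap constraint per adjacent pair")

    def extend(pos: int, k: int) -> bool:
        # pattern[k - 1] is matched at seq[pos]; try to place pattern[k].
        if k == len(pattern):
            return True
        lo, hi = gaps[k - 1]
        last = min(pos + 1 + hi, len(seq) - 1)
        for nxt in range(pos + 1 + lo, last + 1):
            if seq[nxt] == pattern[k] and extend(nxt, k + 1):
                return True
        return False

    return any(seq[i] == pattern[0] and extend(i, 1) for i in range(len(seq)))


def support(dataset: Sequence[Sequence[str]], pattern: Sequence[str],
            gaps: Sequence[Gap]) -> float:
    """Fraction of sequences in `dataset` that contain the pattern."""
    if not dataset:
        return 0.0
    return sum(matches(s, pattern, gaps) for s in dataset) / len(dataset)


def contrast(pos: Sequence[Sequence[str]], neg: Sequence[Sequence[str]],
             pattern: Sequence[str], gaps: Sequence[Gap]) -> float:
    """Assumed score: support in the positive set minus support in the negative set."""
    return support(pos, pattern, gaps) - support(neg, pattern, gaps)


# Hypothetical toy data, not taken from the paper.
positive = ["ACGTA", "AXCYYT"]
negative = ["ACT", "AXXXCT"]

rigid = [(0, 1), (0, 1)]     # the same constraint at every position
flexible = [(0, 1), (1, 2)]  # a different constraint per adjacent pair

print(contrast(positive, negative, "ACT", rigid))
print(contrast(positive, negative, "ACT", flexible))
```

On this toy data the rigid setting matches one sequence in each class, so the score is 0.0, while the flexible setting matches both positive sequences and neither negative one, giving 1.0. Searching over patterns and gap constraints together is the part the abstract assigns to GepDSP, and it is not attempted here.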
