Conference: Tools with Artificial Intelligence, 2009. ICTAI '09
Approximate Repeating Pattern Mining with Gap Requirements

Abstract

In this paper, we define a new research problem: mining approximate repeating patterns (ARP) under gap constraints, where an occurrence of a pattern only needs to match the pattern approximately, a situation that is very common in the biological sciences. To solve the problem, we propose ArpGap (Approximate repeating pattern mining with Gap constraints), an algorithm with three major components: (1) a data-driven pattern generation approach that avoids generating unnecessary patterns; (2) a back-tracking pattern search process that discovers approximate occurrences of a pattern under gap constraints; and (3) an Apriori-like deterministic pruning approach that progressively prunes patterns and stops the search when necessary. Experimental results on synthetic and real-world protein sequences show that ArpGap is efficient in terms of memory consumption and computational cost.
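To make component (2) concrete, the following is a minimal Python sketch of a back-tracking search for approximate occurrences under a gap constraint. It is not the ArpGap algorithm itself, and the abstract does not define its matching model, so everything here is an assumption made for illustration: the function name `find_occurrences`, the reading of a gap constraint as "between `min_gap` and `max_gap` skipped characters between consecutive pattern characters", and the reading of approximate matching as "at most `max_mismatches` substituted characters".

```python
from typing import List, Tuple


def find_occurrences(seq: str, pattern: str, min_gap: int, max_gap: int,
                     max_mismatches: int) -> List[Tuple[int, ...]]:
    """Return the position tuples at which `pattern` occurs in `seq`.

    Assumed semantics (hypothetical, not taken from the paper):
    - between two consecutive pattern characters, between min_gap and
      max_gap sequence characters are skipped;
    - at most max_mismatches pattern characters may differ from the
      sequence characters they are aligned with.
    """
    occurrences: List[Tuple[int, ...]] = []

    def extend(positions: List[int], mismatches: int) -> None:
        k = len(positions)  # index of the next pattern character to place
        if k == len(pattern):
            occurrences.append(tuple(positions))
            return
        if k == 0:
            # The first pattern character may be aligned anywhere.
            candidates = range(len(seq))
        else:
            last = positions[-1]
            lo = last + 1 + min_gap
            hi = min(last + 1 + max_gap, len(seq) - 1)
            candidates = range(lo, hi + 1)
        for pos in candidates:
            cost = 0 if seq[pos] == pattern[k] else 1
            if mismatches + cost > max_mismatches:
                continue  # mismatch budget exceeded: abandon this branch
            positions.append(pos)
            extend(positions, mismatches + cost)
            positions.pop()  # back-track and try the next candidate

    extend([], 0)
    return occurrences
```

For example, under these assumptions `find_occurrences("ACGTACTT", "AG", 1, 1, 0)` finds only the exact occurrence at positions (0, 2), while raising `max_mismatches` to 1 also admits (4, 6), where `T` stands in for `G`. A miner in the spirit of the abstract would presumably count such occurrences per candidate pattern and prune candidates whose count is too low, but that step is not sketched here.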
