Mining top-K Frequent and Flexible Pattern from sequences

Abstract

Pattern mining is a widely studied problem in biological sequence analysis. Introducing wildcard gaps allows more interesting patterns to be mined. In this paper, we propose a new definition of pattern frequency under which the Apriori property holds. We define a pattern mining problem called Mining top-K Frequent Patterns (MFP), in which gaps are mined rather than specified in advance. Compared with existing problems, MFP does not require any domain knowledge from the user. However, theoretical analysis and experimental results show that MFP favors inflexible patterns. We therefore define a second problem, Mining top-K Frequent and Flexible Patterns (MF2P), in which the flexibility threshold of each gap is specified by the user. We develop algorithms with polynomial complexity for both problems. Patterns can grow from both sides. Some interesting biological patterns mined by our algorithms are discussed.
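To make the idea of a pattern with wildcard gaps concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes a pattern is a string of characters plus one (lo, hi) gap per adjacent pair, giving the minimum and maximum number of wildcard positions allowed between them, and it scores a pattern by its raw number of occurrences. The abstract does not state the paper's frequency definition, so that score, the function names `count_occurrences` and `top_k_pairs`, and the brute-force enumeration of two-character patterns are all illustrative assumptions. Because the caller supplies the gap, the setting is loosely closer to MF2P than to MFP.

```python
import heapq


def count_occurrences(seq, chars, gaps):
    """Count matches of a gapped pattern in seq (assumed raw-count score).

    chars: the pattern characters, e.g. "AT".
    gaps:  one (lo, hi) pair per adjacent character pair, giving the allowed
           number of wildcard positions between them (hypothetical encoding).
    """
    assert len(gaps) == len(chars) - 1
    # ways[p] = number of partial matches whose last character sits at p
    ways = [1 if c == chars[0] else 0 for c in seq]
    for ch, (lo, hi) in zip(chars[1:], gaps):
        nxt = [0] * len(seq)
        for p, c in enumerate(seq):
            if c != ch:
                continue
            # the previous pattern character sits at q, with g wildcards between
            for g in range(lo, hi + 1):
                q = p - g - 1
                if q >= 0:
                    nxt[p] += ways[q]
        ways = nxt
    return sum(ways)


def top_k_pairs(seq, k, gap):
    """Rank all two-character patterns sharing one gap by occurrence count."""
    alphabet = sorted(set(seq))
    scored = [(a + b, count_occurrences(seq, a + b, [gap]))
              for a in alphabet for b in alphabet]
    return heapq.nlargest(k, scored, key=lambda item: item[1])


if __name__ == "__main__":
    # "A", then 0 or 1 wildcards, then "T", searched in a toy sequence
    print(count_occurrences("AATAT", "AT", [(0, 1)]))
    print(top_k_pairs("AATAT", 2, (0, 1)))
```

A wider gap such as (0, 3) admits more matches per pattern, which is one informal way to see why a purely frequency-based ranking and a flexibility requirement can pull in different directions.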