首页> 外文会议>Bioinformatics research and applications >The Maximum Clique Enumeration Problem:Algorithms, Applications and Implementations
【24h】

The Maximum Clique Enumeration Problem:Algorithms, Applications and Implementations

机译:最大派生枚举问题:算法,应用和实现

获取原文
获取原文并翻译 | 示例

摘要

Algorithms are designed, analyzed and implemented for the maximum clique enumeration (MCE) problem, which asks that we identify all maximum cliques in a finite, simple graph. MCE is closely related to two other well-known and widely-studied problems: the maximum clique optimization problem, which asks us to determine the size of a largest clique, and the maximal clique enumeration problem, which asks that we compile a listing of all maximal cliques. Naturally, these three problems are Np-hard, given that they subsume the classic version of the Np-complete clique decision problem. MCE can be solved in principle with standard enumeration methods due to Bron, Kerbosch, Kose and others. Unfortunately, these techniques are ill-suited to graphs encountered in our applications. We must solve MCE on instances deeply seeded in data mining and computational biology, where high-throughput data capture often creates graphs of extreme size and density. MCE can also be solved in principle using more modern algorithms based in part on vertex cover and the theory of fixed-parameter tractability (FPT). While FPT is an improvement, these algorithms too can fail to scale sufficiently well as the sizes and densities of our datasets grow. An extensive testbed of benchmark MCE instances is devised, based on applications in transcriptomic data analysis. Empirical testing reveals crucial but latent features of such high-throughput biological data. In turn, it is shown that these features distinguish real data from random data intended to reproduce salient topological features. In particular, with real data there tends to be an unusually high degree of maximum clique overlap. Armed with this knowledge, novel decomposition strategies are tuned to the data and coupled with the best FPT MCE implementations. It is demonstrated that the resultant run times are frequently reduced by several orders of magnitude, and that instances once prohibitively time-consuming to solve are now often brought into the domain of realistic feasibility.
机译:针对最大集团枚举(MCE)问题设计,分析和实现算法,该问题要求我们在有限的简单图中识别所有最大集团。 MCE与另外两个众所周知且研究广泛的问题密切相关:最大派系优化问题,它要求我们确定最大派系的大小;最大派系枚举问题,它要求我们编制所有清单最大集团。当然,这三个问题都属于Np-难问题,因为它们包含了Np-完整集团决策问题的经典版本。 MCE原则上可以使用Bron,Kerbosch,Kose等人的标准枚举方法求解。不幸的是,这些技术不适用于我们的应用程序中遇到的图形。我们必须在深入挖掘数据挖掘和计算生物学的实例上解决MCE,在这些实例中,高吞吐量数据捕获通常会创建极端大小和密度的图形。原则上也可以使用更现代的算法(部分基于顶点覆盖和固定参数可处理性(FPT))来求解MCE。尽管FPT是一项改进,但随着我们数据集的大小和密度的增长,这些算法也可能无法充分扩展。基于转录组数据分析中的应用,设计了基准MCE实例的广泛测试平台。实证测试揭示了这种高通量生物学数据的关键但潜在的特征。反过来,表明这些特征将真实数据与旨在再现显着拓扑特征的随机数据区分开。特别是,对于真实数据,往往会出现异常高的最大集团重叠。有了这些知识,就可以将新颖的分解策略调整为数据,并结合最佳的FPT MCE实现。事实证明,结果运行时间通常减少了几个数量级,解决实例的时间过长而耗时,现在通常被纳入现实可行的范围。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号