首页> 外文会议>2011 IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences >Invited: Fast and theoretically strong algorithms for kinship discovery
【24h】

Invited: Fast and theoretically strong algorithms for kinship discovery

机译:邀请:用于亲缘关系发现的快速且理论上强的算法

获取原文

摘要

In kinship inference, genealogical relationships among organisms, typically in naturally-occurring populations, based on genetic marker information are identified. This task is crucial to conservation of endangered species and to understand the diversity of populations. Some of the simplest problems in this domain are sib group and half-sibgroup discover. Natural objectives in this domain are statistical ones (such as maximum likelihood) and combinatorial one (such as parsimony). Unfortunately, even with error-free data, the simplest combinatorial objective, minimizing the number of matings, is NP-hard to approximate; the statistical objectives are even more challenging. Here, a simple combinatorial approach for the problem is shown. By enumerating triplets of population members that could be siblings and that could not be siblings, putative sibgroups are greedily constructed, merging them until no further mergings can occur. The simple algorithm performs comparably to or better than integer programming methods for the problem, in a tiny fraction of the runtime. Moreover, with high probability, these methods find the correct sibgroups, under a straightforward and standard probabilistic model of inheritance and mating. Hence, the NP-hardness of the original problem is ameliorated in "typical" instances of the problem. This phenomenon is common to a large variety of bioinformatics problems, so a discussion of how to respond to this observation is presented.
机译:在亲缘关系推断中,通常会基于遗传标记信息识别生物体之间的家谱关系,通常是自然发生的种群。这项任务对于保护濒危物种和了解种群多样性至关重要。该域中最简单的问题是同胞组和半同胞组发现。这个领域的自然目标是统计目标(例如最大似然)和组合目标(例如简约)。不幸的是,即使有无差错的数据,最简单的组合目标也要使NP的数量减少到最小,从而使交配的数量最少。统计目标更具挑战性。这里,显示了针对该问题的简单组合方法。通过枚举可能是兄弟姐妹和不能成为兄弟姐妹的人口成员的三胞胎,贪婪地构建推定的同胞组,将它们合并,直到无法进行进一步合并为止。简单的算法在很小的运行时间中就可以与整数编程方法相比,甚至胜过整数编程方法。此外,这些方法很有可能在继承和交配的简单而标准的概率模型下找到正确的同胞群。因此,在问题的“典型”实例中,原始问题的NP硬度得到了改善。这种现象是各种各样的生物信息学问题所共有的,因此将对如何响应这一发现进行讨论。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号