首页> 外文会议>IEEE International Conference on Computational Advances in Bio and Medical Sciences >Invited: Fast and theoretically strong algorithms for kinship discovery
【24h】

Invited: Fast and theoretically strong algorithms for kinship discovery

机译:邀请:快速和理论上强大的血缘关系发现算法

获取原文

摘要

In kinship inference, genealogical relationships among organisms, typically in naturally-occurring populations, based on genetic marker information are identified. This task is crucial to conservation of endangered species and to understand the diversity of populations. Some of the simplest problems in this domain are sib group and half-sibgroup discover. Natural objectives in this domain are statistical ones (such as maximum likelihood) and combinatorial one (such as parsimony). Unfortunately, even with error-free data, the simplest combinatorial objective, minimizing the number of matings, is NP-hard to approximate; the statistical objectives are even more challenging. Here, a simple combinatorial approach for the problem is shown. By enumerating triplets of population members that could be siblings and that could not be siblings, putative sibgroups are greedily constructed, merging them until no further mergings can occur. The simple algorithm performs comparably to or better than integer programming methods for the problem, in a tiny fraction of the runtime. Moreover, with high probability, these methods find the correct sibgroups, under a straightforward and standard probabilistic model of inheritance and mating. Hence, the NP-hardness of the original problem is ameliorated in "typical" instances of the problem. This phenomenon is common to a large variety of bioinformatics problems, so a discussion of how to respond to this observation is presented.
机译:在亲属性推理中,鉴定了基于基于遗传标记信息的生物体,通常在天然存在的群体中的基因关系。这项任务对于保护濒危物种并了解人口的多样性至关重要。此域中的一些最简单的问题是SIB组和半sibgroup发现。该领域的自然目标是统计统计数据(例如最大可能性)和组合(如副偶像)。遗憾的是,即使有无差错的数据,最简单的组合目标,最小化消耗的数量,也是近似的np;统计目标更具挑战性。这里,示出了用于问题的简单组合方法。通过枚举可能是兄弟姐妹的人口成员的三胞胎,这不能是兄弟姐妹,推定的SIBGroups是贪婪的构造,并在没有进一步的融合之前合并它们。简单的算法比运行时的微小分数相当于整数编程方法而比较或更好地执行。此外,具有高概率,这些方法在遗传和交配的直接和标准的概率模型下找到了正确的SIBGroups。因此,原始问题的NP硬度在问题的“典型”实例中得到了改善。这种现象对于各种各样的生物信息学问题是常见的,因此提出了如何响应该观察的讨论。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号