首页> 外文会议>WABI 2011 >Efficiently Solvable Perfect Phytogeny Problems on Binary and fe-State Data with Missing Values
【24h】

Efficiently Solvable Perfect Phytogeny Problems on Binary and fe-State Data with Missing Values

机译:缺失值的二元和FE状态数据有效可解决完美的植物生成问题

获取原文

摘要

Abstract. The perfect phylogeny problem is of central importance to both evolutionary biology and population genetics. Missing values are a common occurrence in both sequence and genotype data. In their presence, the problem of finding a perfect phylogeny is NP-hard, even for binary characters [24]. We extend the utility of the perfect phylogeny by introducing new efficient algorithms for broad classes of binary and multi-state data with missing values.Specifically, we address the rich data hypothesis introduced by Halperin and Karp [11] for the binary perfect phylogeny problem with missing data. We give an efficient algorithm for enumerating phylogenies compatible with characters satisfying the rich data hypothesis. This algorithm is useful for computing the probability of data with missing values under the coalescent model.In addition, we use the partition intersection (PI) graph and chordal graph theory to generalize the rich data hypothesis to multi-state characters with missing values. For a bounded number of states, k, we provide a fixed parameter tractable algorithm for the fc-state perfect phylogeny problem with missing data. Our approach reduces missing data problems to problems on complete data. Finally, we characterize a commonly observed condition, an m-clique in the PI graph, under which a perfect phylogeny can be found efficiently for binary characters with missing values. We evaluate our results with extensive empirical analysis using two biologically motivated generative models of character data.
机译:抽象的。完美的系统发育问题是至关重要既进化生物学和群体遗传学。缺失值是在这两个序列和基因型数据经常发生的。在它们的存在,寻找一个完美的系统发育的问题是NP-hard的,即使对于二进制字符[24]。我们通过与失踪values.Specifically引入了两大类二进制和多态数据的新的高效算法的完美系统发育的实用扩展,我们应对霍尔珀林和卡普[11]二进制完美系统发育问题,推出了丰富的数据假说缺失数据。我们给枚举与满足丰富的数据假设字符兼容的系统发育的高效算法。该算法是用于与聚结剂model.In加成下缺失值计算数据的概率是有用的,我们使用的分区相交点(PI)图和弦图理论概括出丰富的数据假设到多状态字符缺失值。对于有限数量的状态,K,我们提供了一个固定参数可解算法为FC-状态完美的系统发育问题丢失数据。我们的方法可以降低数据丢失的问题就完整的内容问题。最后,我们描述一个通常观察到的条件下,一个m集团在PI图中,在其下一个完美的系统发育可以有效地发现用于与缺失值的二进制字符。我们评估我们使用字符数据的两个生物动力生成模型广泛的实证分析结果。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号