【24h】

Robinson-Foulds Supertrees

机译:鲁滨逊福兹超级树

获取原文
       

摘要

Background Supertree methods synthesize collections of small phylogenetic trees with incomplete taxon overlap into comprehensive trees, or supertrees, that include all taxa found in the input trees. Supertree methods based on the well established Robinson-Foulds (RF) distance have the potential to build supertrees that retain much information from the input trees. Specifically, the RF supertree problem seeks a binary supertree that minimizes the sum of the RF distances from the supertree to the input trees. Thus, an RF supertree is a supertree that is consistent with the largest number of clusters (or clades) from the input trees. Results We introduce efficient, local search based, hill-climbing heuristics for the intrinsically hard RF supertree problem on rooted trees. These heuristics use novel non-trivial algorithms for the SPR and TBR local search problems which improve on the time complexity of the best known (na?ve) solutions by a factor of Θ(n) and Θ(n2) respectively (where n is the number of taxa, or leaves, in the supertree). We use an implementation of our new algorithms to examine the performance of the RF supertree method and compare it to matrix representation with parsimony (MRP) and the triplet supertree method using four supertree data sets. Not only did our RF heuristic provide fast estimates of RF supertrees in all data sets, but the RF supertrees also retained more of the information from the input trees (based on the RF distance) than the other supertree methods. Conclusions Our heuristics for the RF supertree problem, based on our new local search algorithms, make it possible for the first time to estimate large supertrees by directly optimizing the RF distance from rooted input trees to the supertrees. This provides a new and fast method to build accurate supertrees. RF supertrees may also be useful for estimating majority-rule(-) supertrees, which are a generalization of majority-rule consensus trees.
机译:背景超级树方法将具有不完整分类群的小型系统发育树的集合合成为综合树或超级树,其中包括输入树中找到的所有分类单元。基于公认的Robinson-Foulds(RF)距离的Supertree方法有可能构建保留从输入树中获取大量信息的Supertree。具体地说,RF超级树问题寻求一种二进制超级树,该二进制超级树将从超级树到输入树的RF距离之和最小化。因此,RF超级树是与输入树中最大数量的簇(或进化枝)一致的超级树。结果我们针对根上树上固有的硬RF超级树问题,引入了基于局部搜索的高效爬山试探法。这些启发式算法针对SPR和TBR局部搜索问题使用了新颖的非平凡算法,从而将最著名的(幼稚的)解的时间复杂度提高了Θ(n)和Θ(n 2 )(其中n是超树中的分类单元或叶子的数量)。我们使用新算法的实现来检查RF超树方法的性能,并将其与具有简约性(MRP)的矩阵表示形式以及使用四个超树数据集的三重态超树方法进行比较。我们的RF启发式方法不仅可以快速估计所有数据集中的RF超树,而且与其他超树方法相比,RF超树还保留了来自输入树的更多信息(基于RF距离)。结论基于我们新的本地搜索算法,我们对RF超级树问题的启发式方法使得首次通过直接优化从根输入树到超级树的RF距离来估计大型超级树成为可能。这提供了一种新的快速方法来构建准确的超级树。 RF超树也可能用于估计多数规则(-)超树,这是多数规则共识树的概括。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号