首页> 外文期刊>Algorithms for Molecular Biology >Linear-time algorithms for phylogenetic tree completion under Robinson–Foulds distance
【24h】

Linear-time algorithms for phylogenetic tree completion under Robinson–Foulds distance

机译:罗宾逊 - 福尔姆斯距离下线发育树完成的线性时间算法

获取原文
           

摘要

We consider two fundamental computational problems that arise when comparing phylogenetic trees, rooted or unrooted, with non-identical leaf sets. The first problem arises when comparing two trees where the leaf set of one tree is a proper subset of the other. The second problem arises when the two trees to be compared have only partially overlapping leaf sets. The traditional approach to handling these problems is to first restrict the two trees to their common leaf set. An alternative approach that has shown promise is to first complete the trees by adding missing leaves, so that the resulting trees have identical leaf sets. This requires the computation of an optimal completion that minimizes the distance between the two resulting trees over all possible completions. We provide optimal linear-time algorithms for both completion problems under the widely-used Robinson–Foulds (RF) distance measure. Our algorithm for the first problem improves the time complexity of the current fastest algorithm from quadratic (in the size of the two trees) to linear. No algorithms have yet been proposed for the more general second problem where both trees have missing leaves. We advance the study of this general problem by proposing a useful restricted version of the general problem and providing optimal linear-time algorithms for the restricted version. Our experimental results on biological data sets suggest that completion-based RF distances can be very different compared to traditional RF distances.
机译:我们考虑使用非相同叶片叶片,从而在比较系统发育树木,根生根或未引发时出现的两个基本计算问题。当比较一棵树的叶子组是另一棵树的适当子集时,第一个问题是出现的。当要比较两棵树木时,第二个问题仅具有部分重叠的叶片集。处理这些问题的传统方法是首先将两棵树限制在他们的常见叶子集中。一种替代方法,所示的方法是首先通过添加缺失的叶子来完成树木,使得所得到的树具有相同的叶片集。这需要计算最佳完成,从而最大限度地减少所有可能完成的两个所产生的树木之间的距离。我们为在广泛使用的Robinson-Founds(RF)距离测量下的完井问题提供最佳的线性时间算法。我们的第一个问题的算法可以提高当前最快算法的时间复杂性从二次(两棵树的大小)到线性。尚未提出任何算法,因为这两个树都缺少叶子。我们通过提出一般问题的有用限制版本来推进该一般问题的研究,并为受限版本提供最佳线性时间算法。我们对生物数据集的实验结果表明,与传统的RF距离相比,基于完成的RF距离可以非常不同。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号