首页> 外文会议>Workshop on Algorithm Engineering and Experiments >On the Scalability of Computing Triplet and Quartet Distances
【24h】

On the Scalability of Computing Triplet and Quartet Distances

机译:关于计算三联和四重奏距离的可扩展性

获取原文

摘要

In this paper we present an experimental evaluation of the algorithms by Brodal et al. [SODA 2013] for computing the triplet and quartet distance measures between two leaf labelled rooted and unrooted trees of arbitrary degree, respectively. The algorithms count the number of rooted tree topologies over sets of three leaves (triplets) and unrooted tree topologies over four leaves (quartets), respectively, that have different topologies in the two trees. The algorithms by Brodal et al. maintain a long sequence of variables (hundreds for quartets) for counting different cases to be considered by the algorithm, making it unclear if the algorithms would be of theoretical interest only. In our experimental evaluation of the algorithms the typical overhead per node is about 2 KB and 10 KB per node in the input trees for triplet and quartet computations, respectively. This allows us to compute the distance measures for trees with up to millions of nodes. The limiting factor is the amount of memory available. With 31 GB of memory all our input instances can be solved within a few minutes. In the algorithm by Brodal et al. a few choices were made, where alternative solutions possibly could improve the algorithm, in particular for quartet distance computations. For quartet computations we expand the algorithm to also consider alternative computations, and make two observations: First we observe that the running time can be improved from O(max(d_1, d_2)·n·lg n) to O(min(d_1, d_2)·n·lg n), where n is the number of leaves in the two trees, and d_1 and d_2 are the maximum degrees of the nodes in the two trees, respectively. Secondly, by taking a different approach to counting the number of disagreeing quartets we can reduce the number of calculations needed to calculate the quartet distance, improving both the running time and the space requirement by our algorithm by a constant factor.
机译:在本文中,我们通过Brodal等人提出了对算法的实验评价。 [SODA 2013]用于计算三重叶片之间的三重型和四重奏距离,分别标记为植根的叶片和任意程度的巨大树木。该算法分别计算三叶(三叶)和四叶(四重奏)上的三个叶子(三叶)和大型树拓扑结构的根拓粒子的数量,这两棵树中具有不同的拓扑。 Brodal等人的算法。保持长期变量(数百个四重奏),用于计算算法考虑不同的情况,如果算法仅是理论兴趣的情况下不清楚。在我们对算法的实验评估中,每个节点的典型开销分别为三联和四重奏计算的输入树中的每个节点大约2kb和10kb。这允许我们计算最多数百万节点的树木的距离测量。限制因素是可用的内存量。拥有31 GB的内存,所有的输入实例都可以在几分钟内解决。在Brodal等人的算法中。进行了几种选择,其中替代解决方案可能可以改善算法,特别是对于四重距离计算。对于四重奏计算,我们扩展算法还考虑替代计算,并进行两个观察结果:首先,我们观察到运行时间可以从O(最大(d_1,d_2)·n·lg n)到o(min(d_1, D_2)·n·lg n),其中n是两棵树中叶片的数量,d_1和d_2分别是两棵树中节点的最大程度。其次,通过采取不同的方法来计算不同意四重奏的数量,我们可以减少计算四重奏距离所需的计算次数,通过恒定因子改善我们的算法的运行时间和空间要求。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号