Journal: Methods in Ecology and Evolution

Parallel likelihood calculation for phylogenetic comparative models: The SPLITT C++ library



Abstract

Phylogenetic comparative models (PCMs) have been used to study macroevolutionary patterns, to characterize adaptive phenotypic landscapes, to quantify rates of evolution, to measure trait heritability, and to test various evolutionary hypotheses. A major obstacle to applying these models has been the complexity of evaluating their likelihood function. Recent work has shown that for many PCMs, the likelihood can be computed in time proportional to the size of the tree by post-order tree traversal, also known as pruning. Despite this progress, inferring complex multi-trait PCMs on large trees remains a time-intensive task. Here, we study parallelizing the pruning algorithm as a generic technique for speeding up PCM inference. We implement several parallel traversal algorithms in the form of a generic C++ library for Serial and Parallel LIneage Traversal of Trees (SPLITT). Based on SPLITT, we provide examples of parallel likelihood evaluation for several popular PCMs, ranging from a single-trait Brownian motion model to complex multi-trait Ornstein-Uhlenbeck and mixed Gaussian phylogenetic models. Using the phylogenetic Ornstein-Uhlenbeck mixed model (POUMM) as a showcase, we run benchmarks on up to 24 CPU cores, reporting up to an order of magnitude parallel speed-up for the likelihood calculation on simulated balanced and unbalanced trees of up to 100,000 tips with up to 16 traits. Because the parallel speed-up depends on multiple factors, the SPLITT library can automatically select the fastest traversal strategy for a given hardware, tree topology, and data. Combining SPLITT likelihood calculation with adaptive Metropolis sampling on real data, we show that the time for Bayesian POUMM inference on a tree of 10,000 tips can be reduced from several days to less than an hour. We conclude that parallel pruning effectively accelerates the likelihood calculation and, thus, the statistical inference of Gaussian phylogenetic models.
For time-intensive Bayesian inference, we recommend combining this technique with adaptive Metropolis sampling. Beyond Gaussian models, the parallel tree traversal can be applied to numerous other models, including discrete trait and birth-death population dynamics models. Currently, SPLITT supports multi-core shared memory architectures, but it can be extended to distributed memory architectures as well as graphics processing units.
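
To make the pruning idea in the abstract concrete, the following is a minimal C++ sketch of a pruning-style log-likelihood for the simplest case mentioned, a single-trait Brownian motion model. It is not SPLITT's API: the names `Tree`, `Quad`, and `bm_loglik` are hypothetical, and the sketch assumes a known root value `x0`, a single rate `sigma2`, and strictly positive branch lengths. Each node stores the log-likelihood of the tip data below it as a quadratic in its parent's trait value, and nodes are visited level by level from the deepest level up to the root.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical tree layout for illustration only (not SPLITT's data structures).
struct Tree {
  std::vector<int> parent;     // parent index of each node, -1 for the root
  std::vector<double> length;  // branch length to the parent (ignored for the root)
  std::vector<double> value;   // observed trait at tips (ignored for internal nodes)
};

// Coefficients of a quadratic a*x^2 + b*x + c in the parent's trait value x.
struct Quad { double a = 0.0, b = 0.0, c = 0.0; };

// Log-likelihood of the tip values under Brownian motion with rate sigma2,
// conditional on a known root value x0. Assumes positive branch lengths.
double bm_loglik(const Tree& tree, double sigma2, double x0) {
  const std::size_t n = tree.parent.size();
  const double two_pi = 2.0 * std::acos(-1.0);

  std::vector<std::vector<int>> children(n);
  int root = -1;
  for (std::size_t i = 0; i < n; ++i) {
    if (tree.parent[i] < 0) root = static_cast<int>(i);
    else children[tree.parent[i]].push_back(static_cast<int>(i));
  }
  assert(root >= 0);

  // Breadth-first order from the root; level_start marks where each depth begins.
  std::vector<int> order;
  order.push_back(root);
  std::vector<std::size_t> level_start;
  level_start.push_back(0);
  for (std::size_t begin = 0; begin < order.size();) {
    const std::size_t end = order.size();
    for (std::size_t k = begin; k < end; ++k)
      for (int ch : children[order[k]]) order.push_back(ch);
    begin = end;
    if (begin < order.size()) level_start.push_back(begin);
  }
  level_start.push_back(order.size());

  std::vector<Quad> q(n);
  // Visit levels from the deepest to the root, so children are done first.
  for (std::size_t lev = level_start.size() - 1; lev-- > 0;) {
    // Nodes within one level only read their children's results, so the
    // iterations of this loop do not depend on each other.
    for (std::size_t k = level_start[lev]; k < level_start[lev + 1]; ++k) {
      const int i = order[k];
      Quad sum;  // sum of the children's quadratics, as a function of node i's value
      for (int ch : children[i]) {
        sum.a += q[ch].a; sum.b += q[ch].b; sum.c += q[ch].c;
      }
      if (i == root) { q[i] = sum; continue; }
      const double s = sigma2 * tree.length[i];  // variance along the branch
      if (children[i].empty()) {
        // Tip: log of the normal density N(z; x, s) written as a quadratic in x.
        const double z = tree.value[i];
        q[i].a = -1.0 / (2.0 * s);
        q[i].b = z / s;
        q[i].c = -z * z / (2.0 * s) - 0.5 * std::log(two_pi * s);
      } else {
        // Internal node: integrate out this node's value given the parent's value.
        const double d = 1.0 - 2.0 * s * sum.a;
        q[i].a = sum.a / d;
        q[i].b = sum.b / d;
        q[i].c = sum.c + s * sum.b * sum.b / (2.0 * d) - 0.5 * std::log(d);
      }
    }
  }
  return q[root].a * x0 * x0 + q[root].b * x0 + q[root].c;
}
```

Calling `bm_loglik(tree, sigma2, x0)` on a tree filled in as parallel `parent`, `length`, and `value` vectors returns the log-likelihood. Because each node gathers results from its children instead of writing into its parent, the inner loop over one level could be divided among threads without locking. This level-wise scheme is only one possible parallel traversal and is shown here as an assumption about how such a strategy might look.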
