Parallel likelihood calculation for phylogenetic comparative models: The SPLITT C++ library

Venelin Mitov; Tanja Stadler

Abstract

Phylogenetic comparative models (PCMs) have been used to study macroevolutionary patterns, to characterize adaptive phenotypic landscapes, to quantify rates of evolution, to measure trait heritability, and to test various evolutionary hypotheses. A major obstacle to applying these models has been the complexity of evaluating their likelihood function. Recent work has shown that for many PCMs, the likelihood can be obtained in time proportional to the size of the tree using post-order tree traversal, also known as pruning. Despite this progress, inferring complex multi-trait PCMs on large trees remains a time-intensive task.

Here, we study parallelizing the pruning algorithm as a generic technique for speeding up PCM inference. We implement several parallel traversal algorithms in the form of a generic C++ library for Serial and Parallel LIneage Traversal of Trees (SPLITT). Based on SPLITT, we provide examples of parallel likelihood evaluation for several popular PCMs, ranging from a single-trait Brownian motion model to complex multi-trait Ornstein-Uhlenbeck and mixed Gaussian phylogenetic models.

Using the phylogenetic Ornstein-Uhlenbeck mixed model (POUMM) as a showcase, we run benchmarks on up to 24 CPU cores, reporting up to an order of magnitude parallel speed-up for the likelihood calculation on simulated balanced and unbalanced trees of up to 100,000 tips with up to 16 traits. Because the parallel speed-up depends on multiple factors, the SPLITT library can automatically select the fastest traversal strategy for a given hardware, tree topology, and data. Combining SPLITT likelihood calculation with adaptive Metropolis sampling on real data, we show that the time for Bayesian POUMM inference on a tree of 10,000 tips can be reduced from several days to less than an hour.

We conclude that parallel pruning effectively accelerates the likelihood calculation and, thus, the statistical inference of Gaussian phylogenetic models. For time-intensive Bayesian inferences, we recommend combining this technique with adaptive Metropolis sampling. Beyond Gaussian models, the parallel tree traversal can be applied to numerous other models, including discrete trait and birth-death population dynamics models. Currently, SPLITT supports multi-core shared memory architectures, but it can be extended to distributed memory architectures as well as graphics processing units.
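To make the pruning idea concrete, the following is a minimal sketch for the simplest case named in the abstract, a single-trait Brownian motion model. It is not SPLITT's API: the names `Tree`, `Quad`, and `bm_loglik`, the parent-before-child node numbering, and the conditioning on a fixed root value `x0` are all assumptions made for illustration. Each branch passes to its parent a quadratic in the parent's trait value, so one pass from the tips to the root yields the log-likelihood in time proportional to the tree size. Nodes are grouped by level (tips at level 0), and because nodes within a level only read results from lower levels, each level can be processed in parallel; here that is done with an optional OpenMP loop, which is one simple scheme among several possible ones.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical tree layout for this sketch (not SPLITT's data structure).
// Node 0 is the root and every parent has a smaller index than its children.
struct Tree {
  std::vector<int> parent;     // parent[i]; -1 for the root
  std::vector<double> length;  // branch leading to node i (ignored for the root)
  std::vector<double> value;   // observed trait at tips (ignored elsewhere)
};

// exp(a*x^2 + b*x + c): likelihood of the data below a branch,
// written as a function of the trait value x at the branch's parent node.
struct Quad { double a = 0.0, b = 0.0, c = 0.0; };

// Log-likelihood of the tip values under Brownian motion with rate sigma2,
// conditional on the root trait value x0.
double bm_loglik(const Tree& t, double sigma2, double x0) {
  const std::size_t n = t.parent.size();
  assert(n > 1 && t.length.size() == n && t.value.size() == n && sigma2 > 0.0);
  const double pi = std::acos(-1.0);

  // Children lists and node levels: tips are level 0 and a parent sits above
  // all of its children. Descending index order visits children first.
  std::vector<std::vector<int>> children(n);
  std::vector<int> level(n, 0);
  for (std::size_t i = n - 1; i >= 1; --i) {
    const int p = t.parent[i];
    assert(p >= 0 && static_cast<std::size_t>(p) < i);
    children[p].push_back(static_cast<int>(i));
    level[p] = std::max(level[p], level[i] + 1);
  }
  std::vector<std::vector<int>> by_level(level[0] + 1);
  for (std::size_t i = 0; i < n; ++i)
    by_level[level[i]].push_back(static_cast<int>(i));

  std::vector<Quad> msg(n);
  for (const std::vector<int>& nodes : by_level) {
    const long m = static_cast<long>(nodes.size());
    // Nodes in one level are independent: each writes only msg[i] and reads
    // only messages from lower levels, so the loop needs no locking.
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (long k = 0; k < m; ++k) {
      const int i = nodes[static_cast<std::size_t>(k)];
      Quad own;  // sum of the children's messages: a function of x_i
      for (int ch : children[i]) {
        own.a += msg[ch].a;
        own.b += msg[ch].b;
        own.c += msg[ch].c;
      }
      if (i == 0) { msg[i] = own; continue; }  // root: nothing to integrate
      const double v = sigma2 * t.length[i];   // BM variance along the branch
      if (children[i].empty()) {
        // Tip: log N(z; x_parent, v) expanded in powers of x_parent.
        assert(v > 0.0);
        const double z = t.value[i];
        msg[i].a = -0.5 / v;
        msg[i].b = z / v;
        msg[i].c = -0.5 * z * z / v - 0.5 * std::log(2.0 * pi * v);
      } else {
        // Internal node: integrate x_i out against N(x_i; x_parent, v).
        const double d = 1.0 - 2.0 * own.a * v;
        msg[i].a = own.a / d;
        msg[i].b = own.b / d;
        msg[i].c = own.c + 0.5 * own.b * own.b * v / d - 0.5 * std::log(d);
      }
    }
  }
  return msg[0].a * x0 * x0 + msg[0].b * x0 + msg[0].c;
}
```

Built with an OpenMP flag (for example `-fopenmp` with GCC or Clang) the per-level loop runs on multiple threads; without it the pragma is skipped and the same code runs serially. For a tree this small the threading overhead would dominate, which is consistent with the abstract's point that the achievable speed-up depends on the hardware, the tree topology, and the data.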

Bibliographic details

Source
Methods in Ecology and Evolution | 2019, Issue 4 | 14 pages
Authors
Venelin Mitov; Tanja Stadler
Affiliation
Swiss Federal Institute of Technology, Department of Biosystems Science and Engineering, Basel, Switzerland
Indexing information
Original format: PDF
Language: English
CLC classification: Biological sciences
Keywords
continuous-time Markov process; continuous trait; discrete character; pre-order traversal
