Some phylogenetic comparative analyses rely on simulation procedures that use a large number of phylogenetic trees to estimate evolutionary correlations. Because of the computational burden of processing hundreds of thousands of trees, unless this procedure is efficiently implemented, the analyses are of limited applicability. In this paper, we present a highly parallel and efficient implementation for calculating phylogenetic distances. By using the power of GPU computing and a massive number of threads we are able to achieve performance gains up to 243x when compared to a sequential implementation of the same procedures. New data structures and algorithms are also presented so as to efficiently process irregular pointer-based data structures such as trees. In particular, a GPU-based parallel implementation of the lowest common ancestor (LCA) problem is presented. Moreover, the implementation makes intensive use of bitmaps to efficiently encode paths to the tree nodes, and optimize memory transactions by working with data structures that favors coalesced memory accesses. Our results open up the possibility of dealing with large datasets in evolutionary and ecological analyses.
展开▼