Home > Foreign-language conferences > Asia-Pacific Bioinformatics Conference > MrsRF: an efficient MapReduce algorithm for analyzing large collections of evolutionary trees

MrsRF: an efficient MapReduce algorithm for analyzing large collections of evolutionary trees

Abstract

Background: MapReduce is a parallel framework that has been used effectively to design large-scale parallel applications for large computing clusters. In this paper, we evaluate the viability of the MapReduce framework for designing phylogenetic applications. The problem of interest is generating the all-to-all Robinson-Foulds (RF) distance matrix, which has many applications for visualizing and clustering large collections of evolutionary trees. We introduce MrsRF (MapReduce Speeds up RF), a multi-core algorithm that uses the MapReduce paradigm to generate a t × t Robinson-Foulds distance matrix between t trees.

Results: We studied the performance of our MrsRF algorithm on two large biological tree sets consisting of 20,000 trees of 150 taxa each and 33,306 trees of 567 taxa each. Our experiments show that MrsRF is a scalable approach, reaching a speedup of over 18 on 32 total cores. Our results also show that achieving top speedup on a multi-core cluster requires different cluster configurations. Finally, we show how to use an RF matrix to summarize collections of phylogenetic trees visually.

Conclusion: Our results show that MapReduce is a promising paradigm for developing multi-core phylogenetic applications. The results also demonstrate that different multi-core configurations must be tested in order to obtain optimum performance. We conclude that RF matrices play a critical role in developing techniques to summarize large collections of trees.
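To make the t × t matrix concrete, here is a minimal sequential sketch in Python. It is not the MrsRF MapReduce algorithm, whose details the abstract does not give. It assumes trees are written as nested tuples of taxon names over the same taxa, and it takes the RF distance to be the number of nontrivial bipartitions found in exactly one of the two trees (some authors halve this count). The function names `leaf_set`, `bipartitions`, and `rf_matrix` are hypothetical.

```python
from itertools import combinations


def leaf_set(tree):
    """Return the set of taxon names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree, taxa):
    """Return the nontrivial bipartitions of `tree` as canonical frozensets.

    Each internal edge splits the taxa in two; a split is stored as the
    side that does NOT contain a fixed reference taxon, so the same split
    always maps to the same frozenset regardless of rooting.
    """
    ref = min(taxa)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = taxa - clade if ref in clade else clade
        # Skip trivial splits (a single leaf versus everything else).
        if 1 < len(side) < len(taxa) - 1:
            splits.add(side)
        return clade

    visit(tree)
    return splits


def rf_matrix(trees):
    """Build the symmetric all-to-all RF distance matrix for `trees`."""
    taxa = leaf_set(trees[0])  # assumption: every tree has the same taxa
    split_sets = [bipartitions(tree, taxa) for tree in trees]
    t = len(trees)
    matrix = [[0] * t for _ in range(t)]
    for i, j in combinations(range(t), 2):
        # Symmetric difference: splits present in exactly one tree.
        distance = len(split_sets[i] ^ split_sets[j])
        matrix[i][j] = matrix[j][i] = distance
    return matrix
```

The pairwise loop does work proportional to t², which is the cost that motivates a parallel formulation for collections of tens of thousands of trees.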
