Multi-granularity Parallel Computing in a Genome-Scale Molecular Evolution Application

Abstract

Previously [1], we reported a coarse-grained parallel computational approach to identifying rare molecular evolutionary events often referred to as horizontal gene transfers. Very high degrees of parallelism (up to 65x speedup on 4,096 processors) were reported, yet the overall execution time for a realistic problem size was still on the order of 12 days. With the availability of large numbers of compute clusters, as well as genomic sequence from more than 2,000 species containing as many as 35,000 genes each, and trillions of sequence nucleotides in all, we demonstrated the computational feasibility of a method to examine "clusters" of genes using phylogenetic tree similarity as a distance metric. A full serial solution to this problem requires years of CPU time, yet only makes modest IPC and memory demands; thus, it is an ideal candidate for a grid computing approach involving low-cost compute nodes. This paper now describes a multiple granularity parallelism solution that includes exploitation of multi-core shared memory nodes to address fine-grained aspects in the tree-clustering phase of our previous deployment of XenoCluster 1.0. In addition to benchmarking results that show up to 80% speedup efficiency on 8 CPU cores, we report on the biological accuracy and relevance of our results compared to a reported set of known xenologs in yeast.

Bibliographic details

  • Source
    Parallel computing technologies | 2009 | pp. 49-59 | 11 pages
  • Conference location: Novosibirsk (RU)
  • Author affiliations

    Coordinated Laboratory for Computational Genomics, University of Iowa, Iowa City, IA 52242 USA; Department of Electrical and Computer Engineering, University of Iowa, Iowa City, IA 52242 USA

    Coordinated Laboratory for Computational Genomics, University of Iowa, Iowa City, IA 52242 USA

    Center for Bioinformatics and Computational Biology, University of Iowa, Iowa City, IA 52242 USA; Coordinated Laboratory for Computational Genomics, University of Iowa, Iowa City, IA 52242 USA; Department of Electrical and Computer Engineering, University of Iowa, Iowa City, IA 52242 USA; Department of Ophthalmology and Visual Sciences, University of Iowa, Iowa City, IA 52242 USA

    Center for Bioinformatics and Computational Biology, University of Iowa, Iowa City, IA 52242 USA; Coordinated Laboratory for Computational Geno

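  • Conference organizer
  • Original format: PDF
  • Language: eng
  • Chinese Library Classification: Electronic digital computers (discrete-action electronic computers); theory and methods
  • Keywords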
