Published in: IEEE International Symposium on Parallel and Distributed Processing

An Ultrafast Scalable Many-core Motif Discovery Algorithm for Multiple GPUs

Abstract

The identification of genome-wide transcription factor binding sites is a fundamental and crucial problem in fully understanding transcriptional regulatory processes. However, the high computational cost of many motif discovery algorithms heavily constrains their application to large-scale datasets. The rapid growth of genomic sequence and gene transcription data further aggravates the situation and creates a strong need for time-efficient, scalable motif discovery algorithms. The emergence of many-core architectures, typically CUDA-enabled GPUs, provides an opportunity to reduce execution time by an order of magnitude without loss of accuracy. In this paper, we present mCUDA-MEME, an ultrafast scalable many-core motif discovery algorithm for multiple GPUs based on the MEME algorithm. Our algorithm is implemented using a hybrid combination of the CUDA, OpenMP and MPI parallel programming models in order to harness the compute capability of modern GPU clusters. At present, our algorithm supports the OOPS and ZOOPS models, which are sufficient for most motif discovery applications. mCUDA-MEME achieves significant speedups for the starting point search stage (and the overall execution) when benchmarked, using real datasets, against parallel MEME running on 32 CPU cores. Speedups of up to 1.4 (1.1) on a single GPU of a Fermi-based Tesla S2050 quad-GPU computing system and up to 10.8 (8.3) on the eight GPUs of a system with two Tesla S2050 units were observed. Furthermore, our algorithm shows good scalability with respect to dataset size and the number of GPUs (availability: https://sites.google.com/site/yongchaosoftware/mcuda-meme).
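To make the scalability claim concrete, the reported figures can be turned into a rough multi-GPU scaling efficiency. The sketch below is only back-of-the-envelope arithmetic on the four speedups quoted in the abstract. It assumes the "up to" values for one and eight GPUs are comparable, which the abstract does not guarantee since they may come from different datasets. The function name `scaling_efficiency` is hypothetical and not part of mCUDA-MEME.

```python
# Speedups quoted in the abstract, relative to parallel MEME on 32 CPU cores.
# Keys are GPU counts; values are (starting point search, overall execution).
reported = {1: (1.4, 1.1), 8: (10.8, 8.3)}


def scaling_efficiency(speedup_1, speedup_n, n_gpus):
    """Fraction of ideal linear scaling achieved going from 1 to n_gpus GPUs.

    Hypothetical helper for illustration; assumes both speedups were
    measured under comparable conditions.
    """
    return (speedup_n / speedup_1) / n_gpus


search_eff = scaling_efficiency(reported[1][0], reported[8][0], 8)
overall_eff = scaling_efficiency(reported[1][1], reported[8][1], 8)

print(f"starting point search: {search_eff:.0%} of linear scaling")
print(f"overall execution:     {overall_eff:.0%} of linear scaling")
```

Under that assumption, going from one to eight GPUs retains roughly 96% of ideal linear scaling for the starting point search and roughly 94% for the overall execution, which is consistent with the abstract's statement that the algorithm scales well with the number of GPUs.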
