An Ultrafast Scalable Many-Core Motif Discovery Algorithm for Multiple GPUs

Abstract

Identifying genome-wide transcription factor binding sites is fundamental to fully understanding transcriptional regulatory processes. However, the high computational cost of many motif discovery algorithms heavily constrains their application to large-scale datasets. The rapid growth of genomic sequence and gene transcription data worsens the situation and creates a strong need for time-efficient, scalable motif discovery algorithms. The emergence of many-core architectures, typically CUDA-enabled GPUs, provides an opportunity to reduce execution time by an order of magnitude without loss of accuracy. In this paper, we present mCUDA-MEME, an ultrafast scalable many-core motif discovery algorithm for multiple GPUs based on the MEME algorithm. Our algorithm is implemented using a hybrid combination of the CUDA, OpenMP and MPI parallel programming models in order to harness the compute capability of modern GPU clusters. At present, our algorithm supports the OOPS and ZOOPS models, which are sufficient for most motif discovery applications. When benchmarked on real datasets against parallel MEME running on 32 CPU cores, mCUDA-MEME achieves significant speedups for the starting point search stage (and for the overall execution, given in parentheses): up to 1.4 (1.1) on a single GPU of a Fermi-based Tesla S2050 quad-GPU computing system, and up to 10.8 (8.3) on the eight GPUs of a system with two Tesla S2050 units. Furthermore, our algorithm shows good scalability with respect to dataset size and the number of GPUs (availability: https://sites.google.com/site/yongchaosoftware/mcuda-meme).
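The scalability claim can be checked roughly against the speedups quoted in the abstract. The short Python sketch below divides the eight-GPU speedup by the single-GPU speedup and then by the GPU count to get an approximate parallel efficiency. The function and variable names are illustrative and do not come from the paper. Because the reported figures are "up to" values that may stem from different datasets, the result is only a back-of-the-envelope estimate.

```python
def scaling_efficiency(speedup_one_gpu, speedup_n_gpus, n_gpus):
    """Return (relative speedup from 1 to n GPUs, parallel efficiency)."""
    relative = speedup_n_gpus / speedup_one_gpu
    return relative, relative / n_gpus


# Speedups over parallel MEME on 32 CPU cores, as quoted in the abstract:
# (single GPU, eight GPUs)
REPORTED = {
    "starting point search": (1.4, 10.8),
    "overall execution": (1.1, 8.3),
}
N_GPUS = 8

results = {
    stage: scaling_efficiency(one, eight, N_GPUS)
    for stage, (one, eight) in REPORTED.items()
}

for stage, (relative, efficiency) in results.items():
    print(f"{stage}: {relative:.2f}x from 1 to {N_GPUS} GPUs, "
          f"efficiency {efficiency:.0%}")
```

Under these assumptions, going from one to eight GPUs gives roughly a 7.7x gain for the starting point search and roughly 7.5x overall, which is consistent with the abstract's statement about scalability in the number of GPUs.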

Bibliographic record

Source
2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum | 2011 | pp. 428-434 | 7 pages
Authors
Liu Yongchao; Schmidt Bertil; Maskell Douglas L.
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
