HSPp-BLAST: Highly Scalable Parallel PSI-BLAST for Very Large-scale Sequence Searches

Abstract

According to recently published articles, the growth of genomic data has outpaced performance improvements in both storage technologies and processing power, driven by revolutionary advances in next-generation sequencing technologies. Because these technologies have cut costs and increased throughput by many orders of magnitude, genomic data has been doubling every 9 months, growing exponentially in recent years. Data analysis, however, is becoming increasingly difficult and can be prohibitive, because existing bioinformatics tools developed in the past decade mainly target desktops, workstations, and small clusters with limited capabilities. Improving the performance and scalability of such tools is critical to transforming ever-growing raw genomic data into biological knowledge containing invaluable information directly related to human health. This paper describes a new software application that includes optimization techniques improving the scalability of one of the most widely used bioinformatics tools, PSI-BLAST, on advanced parallel architectures, pushing the envelope of biological data analysis. We show that our improvements allow near-linear scaling to tens of thousands of processing cores, up to the maximum non-capability size on current petaflop supercomputers. This new tool increases the amount of genomics data that can be processed per hour by 5 orders of magnitude.

Bibliographic details

Source: International Conference on Bioinformatics and Computational Biology, 2012, 6 pages
Authors: Bhanu Rekepalli; Aaron Vose; Paul Giblock
Original format: PDF
Chinese Library Classification: Q811.4-532
Keywords: Large-scale; Sequence; Searches

Similar literature

1. MODYLAS: A Highly Parallelized General-Purpose Molecular Dynamics Simulation Program for Large-Scale Systems with Long-Range Forces Calculated by Fast Multipole Method (FMM) and Highly Scalable Fine-Grained New Parallel Processing Algorithms [J]. Yoshimichi Andoh, Noriyuki Yoshii, Kazushi Fujimoto. Journal of Chemical Theory and Computation (JCTC). 2013, no. 7
2. Revisiting Myosin Families through Large-scale Sequence Searches Leads to the Discovery of New Myosins [J]. Shaik Naseer Pasha, Iyer Meenakshi, Ramanathan Sowdhamini. Evolutionary Bioinformatics. 2016, no. 15
3. A highly solid model boundary preserving method for large-scale parallel 3D Delaunay meshing on parallel computers [J]. Xiang Chen, Li Chen, Maode Shi. Computer-Aided Design. 2015
4. HSPp-BLAST: Highly Scalable Parallel PSI-BLAST for Very Large-scale Sequence Searches [C]. Bhanu Rekepalli, Aaron Vose, Paul Giblock. International Conference on Bioinformatics and Computational Biology. 2012
5. GPU-Based Parallel Algorithms With Architecture-Aware Optimization for Large-Scale Process Simulation of Biological Pathways and High-Throughput Homologous Sequence Search [D]. Jiang, Hanyu. 2018
6. Revisiting Myosin Families Through Large-scale Sequence Searches Leads to the Discovery of New Myosins [O]. Shaik Naseer Pasha, Iyer Meenakshi, Ramanathan Sowdhamini. 2016
7. Bundle CDN: A Highly Parallelized Approach for Large-scale ℓ1-regularized Logistic Regression [O]. Yatao Bian, Xiong Li, Mingqi Cao. 2014
