Indexing Genomic Sequences on the IBM Blue Gene

Abstract

With advances in sequencing technology and through aggressive sequencing efforts, DNA sequence data sets have been growing at a rapid pace. To gain from these advances, it is important to provide life science researchers with the ability to process and query large sequence data sets. For the past three decades, the suffix tree has served as a fundamental data structure in processing sequential data sets. However, tree construction times on large data sets have been excessive. While parallel suffix tree construction is an obvious solution to reduce execution times, poor locality of reference has limited parallel performance. In this paper, we show that through careful parallel algorithm design, this limitation can be removed, allowing tree construction to scale to massively parallel systems like the IBM Blue Gene. We demonstrate that the entire Human genome can be indexed on 1024 processors in under 15 minutes.
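To make the idea of indexing a sequence for substring queries concrete, here is a minimal sketch in Python. It is not the parallel suffix tree construction the paper describes: it swaps in a much simpler structure, sorted suffix positions grouped by a short leading prefix, only to illustrate how a suffix index can be split into partitions that are built independently. The function names `build_partitioned_suffix_index` and `find_occurrences` and the `prefix_len` parameter are hypothetical, chosen for this illustration.

```python
from collections import defaultdict


def build_partitioned_suffix_index(text, prefix_len=2):
    """Group suffix start positions by their first `prefix_len` characters.

    Illustrative assumption, not the paper's algorithm: each bucket is
    sorted on its own, so in principle a separate worker could own it.
    """
    buckets = defaultdict(list)
    for start in range(len(text)):
        buckets[text[start:start + prefix_len]].append(start)
    for positions in buckets.values():
        # Sort the positions in this bucket by the suffix they begin.
        positions.sort(key=lambda start: text[start:])
    return dict(buckets)


def find_occurrences(text, index, pattern, prefix_len=2):
    """Return the sorted start positions of `pattern` in `text`."""
    if not pattern:
        return []
    if len(pattern) < prefix_len:
        # Short pattern: every bucket whose key starts with it matches fully.
        return sorted(
            start
            for key, positions in index.items()
            if key.startswith(pattern)
            for start in positions
        )
    bucket = index.get(pattern[:prefix_len], [])
    # Binary search for the first suffix whose leading characters
    # are not smaller than the pattern.
    lo, hi = 0, len(bucket)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[bucket[mid]:bucket[mid] + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    # Matching suffixes are contiguous in sorted order; collect them.
    hits = []
    while lo < len(bucket) and text.startswith(pattern, bucket[lo]):
        hits.append(bucket[lo])
        lo += 1
    return sorted(hits)


if __name__ == "__main__":
    genome = "GATTACAGATTACA"
    index = build_partitioned_suffix_index(genome)
    print(find_occurrences(genome, index, "TTA"))
```

Sorting whole suffixes as done here is far too slow and memory-hungry for a genome-scale input; the sketch is only meant to show what a suffix-based index answers and why partitioning by prefix makes the work divisible.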

Bibliographic record

Source: International Conference on High Performance Computing, Networking, Storage and Analysis 2009 | 2009 | pp. 689-699 | 11 pages
Conference location: Portland, OR (US)
Authors: Amol Ghoting; Konstantin Makarychev
Author affiliation: IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, USA
Original format: PDF
Language: English
Chinese Library Classification: Computer networks

Similar literature

1. Comparative Genomic Studies of Influenza A Viruses Performed on BlueGene P Supercomputer: Part 1. Conservative Nucleotide Sequences in Influenza A Virus Genomes Revealed by Multiple Sequence Alignment [J]. Vera Maximova, Kiril Kirilov, Stoyan Markov. Biotechnology & Biotechnological Equipment. 2011, No. 4
2. Comparative Genomic Studies of Influenza A Viruses Performed on BlueGene P Supercomputer: Part 1. Conservative Nucleotide Sequences in Influenza A Virus Genomes Revealed by Multiple Sequence Alignment [J]. Ivan Ivanov, Vera Maximova, Kiril Kirilov. Biotechnology & Biotechnological Equipment. 2011, No. 4
3. Indexing genomic sequences on the IBM Blue Gene [C]. Amol Ghoting, Konstantin Makarychev. Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis. 2009
4. Characterization of a Murine Immediate Early Gene, Egr-1: Its cDNA Sequence, Genomic Organization and Sequence, Gene Product and Regulation of Gene Expression [D]. Cao, Xinmin. 1989
5. Whole genomic sequence analysis of Bacillus infantis: defining the genetic blueprint of strain NRRL B-14911, an emerging cardiopathogenic microbe [O]. Chandirasegaran Massilamany, Akram Mohammed, John Dustin Loy. 2016
6. Indexing genomic sequences on the IBM Blue Gene [O]. Amol Ghoting, Konstantin Makarychev. 2009
