Parallel Computation of the Burrows-Wheeler Transform of Short Reads Using Prefix Parallelism

Kimura Kouichi; Koike Asako

首页> 外文期刊>IEEE/ACM transactions on computational biology and bioinformatics >Parallel Computation of the Burrows-Wheeler Transform of Short Reads Using Prefix Parallelism

【24h】

Parallel Computation of the Burrows-Wheeler Transform of Short Reads Using Prefix Parallelism

机译：使用前缀并行度对短读的Burrows-Wheeler变换进行并行计算

获取原文

获取原文并翻译 | 示例

开具论文收录证明 >>

页面导航

摘要
著录项
引文网络
相似文献
相关主题

摘要

The Burrows-Wheeler transform (BWT) of short-read data has unexplored potential utilities, such as for efficient and sensitive variation analysis against multiple reference genome sequences, because it does not depend on any particular reference genome sequence, unlike conventional mapping-based methods. However, since the amount of read data is generally much larger than the size of the reference sequence, computation of the BWT of reads is not easy, and this hampers development of potential applications. For the alleviation of this problem, a new method of computing the BWT of reads in parallel is proposed. The BWT, corresponding to a sorted list of suffixes of reads, is constructed incrementally by successively including longer and longer suffixes. The working data is divided into more than 10,000 "blocks" corresponding to sublists of suffixes with the same prefixes. Thousands of groups of blocks can be processed in parallel while making exclusive writes and concurrent reads into a shared memory. Reads and writes are basically sequential, and the read concurrency is limited to two. Thus, a fine-grained parallelism, referred to as prefix parallelism, is expected to work efficiently. The time complexity for processing n reads of length l is O(nl(2)). On actual biological DNA sequence data of about 100 Gbp with a read length of 100 bp (base pairs), a tentative implementation of the proposed method took less than an hour on a single-node computer; i.e., it was about three times faster than one of the fastest programs developed so far.

机译：短读取数据的Burrows-Wheeler变换（BWT）具有未开发的潜在效用，例如用于针对多个参考基因组序列进行高效而敏感的变异分析，因为它不依赖于任何特定的参考基因组序列，这与传统的基于映射的方法不同。但是，由于读取的数据量通常远大于参考序列的大小，因此读取的BWT的计算并不容易，并且这阻碍了潜在应用程序的开发。为了缓解这个问题，提出了一种计算并行读取的BWT的新方法。通过依次包含越来越长的后缀来递增地构造与读取后缀的排序列表相对应的BWT。工作数据被分为超过10,000个“块”，对应于具有相同前缀的后缀子列表。数以千计的块组可以并行处理，同时对共享内存进行独占写入和并发读取。读取和写入基本上是顺序的，并且读取并发限制为两个。因此，期望细粒度的并行性（称为前缀并行性）可以有效地工作。处理n个长度为1的读段的时间复杂度为O（nl（2））。在大约100 Gbp的实际生物DNA序列数据和100 bp的读取长度（碱基对）的基础上，该方法的暂定实现在单节点计算机上花费了不到一个小时的时间；也就是说，它比到目前为止开发的最快的程序之一快三倍。

著录项

来源
《IEEE/ACM transactions on computational biology and bioinformatics》 |2019年第1期|3-13|共11页
作者
Kimura Kouichi; Koike Asako;
展开▼
作者单位

Hitachi Ltd, Biosyst Res Dept, Ctr Technol Innovat Healthcare, 1-280 Higashi Koigakubo, Kokubunji, Tokyo 1858601, Japan;

Hitachi Ltd, Future Investment Div, Chiyoda Ku, 1-6-6 Marunouchi, Tokyo 1008280, Japan;

展开▼
收录信息
原文格式 PDF
正文语种 eng
中图分类
关键词
Biology and genetics; nonnumerical algorithms and problems;

机译：生物学和遗传学;非数值算法和问题;

相似文献

外文文献
中文文献
专利

1. Hardware-Acceleration of Short-Read Alignment Based on the Burrows-Wheeler Transform [J] . H. M. Waidyasooriya, M. Hariyama IEEE Transactions on Parallel and Distributed Systems . 2016,第5期

机译：基于Burrows-Wheeler变换的短读比对的硬件加速
2. Analysis of genomic rearrangements by using the Burrows-Wheeler transform of short-read data [J] . Kouichi Kimura, Asako Koike BMC Bioinformatics . 2015,第SUPPLEMENTa18期

机译：利用短读数据的Burrows-Wheeler变换分析基因组重排
3. Ultrafast SNP analysis using the Burrows-Wheeler transform of short-read data [J] . Kimura Kouichi, Koike Asako Bioinformatics . 2015,第10期

机译：使用短读数据的Burrows-Wheeler变换进行超快速SNP分析
4. Exploring parallelism in short sequence mapping using Burrows-Wheeler Transform [C] . Bozdag D., Hatem A., Catalyurek U.V. 2010 IEEE International Symposium on Parallel Distributed Processing, Workshops and Phd Forum . 2010

机译：使用Burrows-Wheeler变换探索短序列映射中的并行性
5. Highly scalable short read alignment with the Burrows-Wheeler transform and cloud computing [D] . Langmead, Benjamin Thomas 2009

机译：通过Burrows-Wheeler转换和云计算实现高度可扩展的短读对齐
6. Analysis of genomic rearrangements by using the Burrows-Wheeler transform of short-read data [O] . Kouichi Kimura, Asako Koike 2015

机译：利用短读数据的Burrows-Wheeler变换分析基因组重排
7. Analysis of genomic rearrangements by using the Burrows-Wheeler transform of short-read data [O] . Kouichi Kimura, Asako Koike 2015

机译：利用短读数据的Burrows-Wheeler变换分析基因组重排

Parallel Computation of the Burrows-Wheeler Transform of Short Reads Using Prefix Parallelism

摘要

著录项

引文网络

相似文献

相关主题

期刊订阅