Conference: 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum

Evaluation of GPU-based Seed Generation for Computational Genomics Using Burrows-Wheeler Transform


Abstract

The unprecedented volume of short reads produced by new high-throughput sequencers makes it challenging to align those reads to reference genomes with both high sensitivity and high speed. Many CPU-based short-read aligners have been developed to address this challenge, and one popular approach among them is the seed-and-extend heuristic. The first and foremost step of this heuristic is to generate seeds between the input reads and the reference genome, and hash tables are the data structure most frequently used for this step. Although hash tables usually offer nearly constant query time, they consume a lot of memory, which makes them poorly suited to memory-constrained many-core architectures such as GPUs. The Burrows-Wheeler transform (BWT) provides a memory-efficient alternative, with the drawback that its query time complexity is a function of query length. In this paper, we investigate GPU-based fixed-length seed generation for computational genomics based on the BWT and the Ferragina-Manzini (FM) index: k-mers from the reads are searched against a reference genome (indexed using the BWT) to find k-mer matches (i.e., seeds). In addition to exact matches, mismatches are allowed at any position within a seed, unlike spaced seeds, which allow mismatches only at predefined positions. By evaluating the performance of our GPU version relative to an equivalent CPU version, we intend to provide useful guidance for the development of GPU-based seed generators for aligners based on the seed-and-extend paradigm.
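To make the k-mer search described in the abstract concrete, the following is a minimal CPU-side sketch of FM-index backward search in Python, with an optional mismatch budget. It is not the paper's GPU implementation. The function names (`build_fm_index`, `backward_search`, `inexact_search`), the toy reference string, the naive suffix-array construction, and the fully stored occurrence table are all assumptions chosen for illustration; practical indexes typically sample the occurrence table and suffix array to save memory.

```python
def build_fm_index(text):
    """Build a toy FM-index (suffix array, C table, Occ table) for `text`."""
    text += "$"  # sentinel; sorts before A, C, G, T in ASCII order
    # Naive suffix array construction: fine for a demo, too slow for genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c]: number of characters in the text strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, C, occ


def backward_search(kmer, sa, C, occ):
    """Return sorted start positions of exact matches of `kmer`."""
    lo, hi = 0, len(sa)  # half-open suffix-array interval [lo, hi)
    for ch in reversed(kmer):  # one step per query character
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


def inexact_search(kmer, sa, C, occ, max_mismatches, alphabet="ACGT"):
    """Return sorted start positions matching `kmer` with at most
    `max_mismatches` substitutions at any position (no indels)."""
    hits = set()

    def recurse(pos, lo, hi, budget):
        if pos < 0:  # whole k-mer consumed: every suffix in [lo, hi) is a hit
            hits.update(sa[lo:hi])
            return
        for c in alphabet:
            if c not in C:
                continue
            cost = 0 if c == kmer[pos] else 1
            if cost > budget:
                continue
            new_lo = C[c] + occ[c][lo]
            new_hi = C[c] + occ[c][hi]
            if new_lo < new_hi:
                recurse(pos - 1, new_lo, new_hi, budget - cost)

    recurse(len(kmer) - 1, 0, len(sa), max_mismatches)
    return sorted(hits)


# Toy usage with a hypothetical reference sequence.
reference = "ACGTACGTTACG"
sa, C, occ = build_fm_index(reference)
exact_hits = backward_search("ACG", sa, C, occ)  # [0, 4, 9]
fuzzy_hits = inexact_search("ACT", sa, C, occ, max_mismatches=1)
```

The exact search performs one interval update per query character, which is why the query cost grows with seed length, as the abstract notes. Allowing mismatches at any position branches over the alphabet at each step, so the search space grows quickly with the mismatch budget.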
