Journal: BMC Bioinformatics

libgapmis: extending short-read alignments

Abstract

Background A wide variety of short-read alignment programmes have been published recently to tackle the problem of mapping millions of short reads to a reference genome, focusing on different aspects of the procedure such as time and memory efficiency, sensitivity, and accuracy. These tools allow for a small number of mismatches in the alignment; however, their ability to allow for gaps varies greatly, with many performing poorly or not allowing them at all. The seed-and-extend strategy is applied in most short-read alignment programmes. After aligning a substring of the reference sequence against the high-quality prefix of a short read (the seed), an important problem is to find the best possible alignment between the substring of the reference sequence that follows it and the remaining low-quality suffix of the read (the extend step). The fact that the reads are rather short and that the gap occurrence frequency observed in various studies is rather low suggests that aligning (parts of) those reads with a single gap is in fact desirable.

Results In this article, we present libgapmis, a library for extending pairwise short-read alignments. Apart from the standard CPU version, it includes ultrafast SSE- and GPU-based implementations. libgapmis is based on an algorithm that computes a modified version of the traditional dynamic-programming matrix for sequence alignment. Extensive experimental results demonstrate that the functions of the CPU version provided in this library accelerate the computations by a factor of 20 compared to other programmes. The analogous SSE- and GPU-based implementations accelerate the computations by a factor of 6 and 11, respectively, compared to the CPU version. The library also gives the user the flexibility to split the read into fragments, based on the observed gap occurrence frequency and the length of the read, thereby allowing for a variable, but bounded, number of gaps in the alignment.

Conclusions We present libgapmis, a library for extending pairwise short-read alignments. We show that libgapmis is better suited to this task, and more efficient, than existing algorithms. The importance of our contribution is underlined by the fact that the provided functions may be seamlessly integrated into any short-read alignment pipeline. The open-source code of libgapmis is available at http://www.exelixis-lab.org/gapmis.
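
The extend step with a single gap, as described in the abstract, can be illustrated with a small brute-force sketch in Python. This is not the libgapmis algorithm (which computes a modified dynamic-programming matrix) and not its API: the names `extend_single_gap` and `SingleGapAlignment`, the anchoring of both sequences at offset 0, and the penalty values are all assumptions chosen for illustration.

```python
from typing import NamedTuple


class SingleGapAlignment(NamedTuple):
    cost: int      # mismatches plus gap penalty (lower is better)
    gap_kind: str  # "none", "read" (gap in the read) or "ref" (gap in the reference)
    gap_pos: int   # number of read bases aligned before the gap opens
    gap_len: int   # length of the single gap


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(ca != cb for ca, cb in zip(a, b))


def extend_single_gap(read: str, ref: str, max_gap: int = 3,
                      gap_open: int = 2, gap_extend: int = 1) -> SingleGapAlignment:
    """Align `read` against the start of `ref` with at most one gap.

    Hypothetical illustration: both sequences are assumed to start right
    after the seed, and every gap position and length is tried explicitly.
    """
    m = len(read)
    if len(ref) < m + max_gap:
        raise ValueError("reference window must cover the read plus max_gap")

    # Baseline: ungapped alignment, scored by mismatches only.
    best = SingleGapAlignment(mismatches(read, ref[:m]), "none", 0, 0)

    for g in range(1, max_gap + 1):
        penalty = gap_open + gap_extend * g
        for k in range(1, m):
            head = mismatches(read[:k], ref[:k])

            # Gap in the read: reference bases ref[k:k+g] have no partner.
            cost = head + mismatches(read[k:], ref[k + g:m + g]) + penalty
            if cost < best.cost:
                best = SingleGapAlignment(cost, "read", k, g)

            # Gap in the reference: read bases read[k:k+g] have no partner.
            if k + g < m:
                cost = head + mismatches(read[k + g:], ref[k:m - g]) + penalty
                if cost < best.cost:
                    best = SingleGapAlignment(cost, "ref", k, g)
    return best


if __name__ == "__main__":
    # The read equals the reference with two bases (offsets 6 and 7) deleted.
    print(extend_single_gap("ACGTACTTAGCA", "ACGTACCGTTAGCATGCA"))
```

In this example the ungapped alignment costs 6 mismatches, while opening one gap of length 2 after six read bases yields a mismatch-free alignment at a total cost of 4 under the assumed penalties. The sketch runs in time proportional to `max_gap` times the square of the read length, which is acceptable only for short inputs.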