Journal: Bioinformatics

Anatomy of a hash-based long read sequence mapping algorithm for next generation DNA sequencing

Abstract

Motivation: Recently, a number of programs have been proposed for mapping short reads to a reference genome. Many of them are heavily optimized for short-read mapping and hence are very efficient for shorter queries, but that makes them inefficient or not applicable for reads longer than 200 bp. However, many sequencers are already generating longer reads and more are expected to follow. For long read sequence mapping, there are limited options; BLAT, SSAHA2, FANGS and BWA-SW are among the popular ones. However, resequencing and personalized medicine need much faster software to map these long sequencing reads to a reference genome to identify SNPs or rare transcripts.

Results: We present AGILE (AliGnIng Long rEads), a hash table based high-throughput sequence mapping algorithm for longer 454 reads that uses diagonal multiple seed-match criteria, customized q-gram filtering and a dynamic incremental search approach among other heuristics to optimize every step of the mapping process. In our experiments, we observe that AGILE is more accurate than BLAT, and comparable to BWA-SW and SSAHA2. For practical error rates (<5%) and read lengths (200-1000 bp), AGILE is significantly faster than BLAT, SSAHA2 and BWA-SW. Even for the other cases, AGILE is comparable to BWA-SW and several times faster than BLAT and SSAHA2.
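
The abstract names the ingredients (a hash table of seeds and a requirement for multiple seed matches on the same diagonal) without giving details. The following is a minimal Python sketch of that general idea only, not AGILE's actual implementation: the function names, the seed length `k`, and the threshold `min_seeds` are all hypothetical choices made for illustration, and the customized q-gram filter and dynamic incremental search are not modeled.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Hash every k-mer of the reference to the list of its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_diagonals(read, index, k, min_seeds):
    """Count seed hits per diagonal (ref_pos - read_pos).

    A diagonal that collects at least `min_seeds` hits is kept as a
    candidate mapping location (hypothetical threshold, for illustration).
    """
    hits = Counter()
    for read_pos in range(len(read) - k + 1):
        seed = read[read_pos:read_pos + k]
        # .get avoids inserting empty entries into the defaultdict
        for ref_pos in index.get(seed, ()):
            hits[ref_pos - read_pos] += 1
    return {diag: n for diag, n in hits.items() if n >= min_seeds}


# Toy data: the read is an exact 12-bp substring starting at offset 8.
reference = "ACGTTGCATGCCGATAGCTTAGGCTAACGT"
read = reference[8:20]
index = build_index(reference, k=4)
candidates = candidate_diagonals(read, index, k=4, min_seeds=3)
```

In this toy case every seed of the read lands on diagonal 8, so that diagonal passes the threshold and would be handed to a later alignment step. Requiring several hits on one diagonal is what lets a mapper of this kind discard isolated, spurious seed matches cheaply.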