Next-generation massively parallel short-read mapping on FPGAs

Abstract

The mapping of DNA sequences to huge genome databases is an essential analysis task in modern molecular biology. With linearized reference genomes available, aligning short DNA reads obtained from sequencing an individual genome against such a database provides a powerful diagnostic and analysis tool. In essence, this task amounts to a simple string search that tolerates a certain number of mismatches to account for the diversity of individuals. The complexity of this process arises from the sheer size of the reference genome. It is further amplified by current next-generation sequencing technologies, which produce a huge number of increasingly short reads. These short reads severely degrade established alignment heuristics such as BLAST. This paper proposes an FPGA-based custom computation that performs the alignment of short DNA reads in a timely manner by exploiting massive concurrency at reasonable cost. The special measures taken to achieve an extremely efficient and compact mapping of the computation to a Xilinx FPGA architecture are described. The presented approach also surpasses all software heuristics in the quality of its results. It is guaranteed to find all alignment locations of a read in the database while also allowing a freely adjustable character mismatch threshold. In contrast, advanced fast alignment heuristics such as Bowtie and Maq can tolerate only small mismatch limits, and their probability of detecting existing valid alignments deteriorates quickly. The performance comparison with these widely used software tools also demonstrates that the proposed FPGA computation achieves its guaranteed exact results in very competitive time.
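The "simple string search tolerating a certain number of mismatches" that the abstract describes can be illustrated with a minimal software sketch. This is not the paper's FPGA design; it is a naive sequential reference version, and the function name `find_alignments` and its parameters are hypothetical names chosen for illustration. It assumes that only character substitutions count as mismatches (no insertions or deletions), which is how the abstract's "character mismatch threshold" reads.

```python
def find_alignments(reference: str, read: str, max_mismatches: int) -> list[int]:
    """Return every 0-based offset at which `read` aligns to `reference`
    with at most `max_mismatches` substituted characters.

    Exhaustive scan: every candidate offset is checked, so no valid
    alignment location is missed (unlike seed-based heuristics).
    """
    hits = []
    read_len = len(read)
    for start in range(len(reference) - read_len + 1):
        mismatches = 0
        for ref_char, read_char in zip(reference[start:start + read_len], read):
            if ref_char != read_char:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # threshold exceeded, reject this offset early
        else:
            # Inner loop finished without exceeding the threshold.
            hits.append(start)
    return hits


if __name__ == "__main__":
    genome = "ACGTACGTTACG"
    print(find_alignments(genome, "ACGA", 1))
    print(find_alignments(genome, "ACG", 0))
```

Each offset is checked independently of all others, which is why this kind of search lends itself to the massive concurrency the abstract mentions: the comparisons at different offsets, and for different reads, have no data dependencies on one another.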
