Venue: IEEE International Parallel and Distributed Processing Symposium
PUNAS: A Parallel Ungapped-Alignment-Featured Seed Verification Algorithm for Next-Generation Sequencing Read Alignment

Abstract

Progress in next-generation sequencing has had a major impact on medical and genomic research. This technology can now produce billions of short DNA fragments (reads) in a single run. One of the most demanding computational tasks in almost every sequencing pipeline is short-read alignment, i.e., determining where in the original genome each fragment originated. Most current solutions are based on a seed-and-extend approach, where promising candidate regions (seeds) are first identified and subsequently extended to verify whether a full high-scoring alignment actually exists in the vicinity of each seed. Seed verification is the main bottleneck in many state-of-the-art aligners, so fast solutions are of high importance. We present a parallel ungapped-alignment-featured seed verification (PUNAS) algorithm, a fast filter that removes the majority of false-positive seeds and thus significantly accelerates the short-read alignment process. PUNAS is based on bit-parallelism and takes advantage of the SIMD vector units of modern microprocessors. Our implementation employs a vectorize-and-scale approach supporting multi-core CPUs and many-core Knights Landing (KNL)-based Xeon Phi processors. Performance evaluation reveals that PUNAS is over three orders of magnitude faster than seed verification with the Smith-Waterman algorithm and around one order of magnitude faster than seed verification with the banded version of Myers' bit-vector algorithm. Using a single thread, it achieves speedups of up to 7.3, 27.1, and 11.6 over the shifted Hamming distance filter on an SSE-, AVX2-, and AVX-512-based CPU/KNL, respectively. The speed of our framework further scales almost linearly with the number of cores. PUNAS is open-source software available at https://github.com/Xu-Kai/PUNASfilter.
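
To make the idea of a bit-parallel, ungapped seed filter more concrete, here is a minimal sketch in Python. It is not the PUNAS algorithm or its code. It only illustrates the general principle the abstract refers to: pack bases into machine words, compare many bases at once with XOR, and reject a candidate location when the mismatch count exceeds a threshold. The 2-bit encoding, the function names (`pack`, `mismatches`, `passes_filter`), and the `max_mismatches` parameter are all assumptions made for this illustration.

```python
# Illustrative sketch only; names and encoding are assumptions, not PUNAS itself.

# Hypothetical 2-bit encoding of the four DNA bases.
ENCODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def pack(seq: str) -> int:
    """Pack a DNA string into one integer, 2 bits per base (first base in the lowest bits)."""
    word = 0
    for i, base in enumerate(seq):
        word |= ENCODE[base] << (2 * i)
    return word


def mismatches(read: str, ref_window: str) -> int:
    """Count mismatching bases between two equal-length sequences, with no gaps allowed."""
    if len(read) != len(ref_window):
        raise ValueError("ungapped comparison needs equal-length sequences")
    n = len(read)
    # XOR leaves a non-zero 2-bit group wherever the two bases differ.
    diff = pack(read) ^ pack(ref_window)
    # Mask selecting the low bit of every 2-bit group: ...010101.
    low_mask = int("01" * n, 2) if n else 0
    # Fold each group's high bit onto its low bit, so each mismatch is exactly one set bit.
    per_base = (diff | (diff >> 1)) & low_mask
    # Population count gives the number of mismatching bases.
    return bin(per_base).count("1")


def passes_filter(read: str, ref_window: str, max_mismatches: int) -> bool:
    """Keep a seed location only if the ungapped comparison stays within the threshold."""
    return mismatches(read, ref_window) <= max_mismatches
```

For example, `passes_filter("ACGT", "ACGA", 1)` keeps the candidate, while `passes_filter("AAAA", "TTTT", 3)` rejects it. A production filter would operate on fixed-width SIMD registers rather than Python integers and would typically also tolerate small shifts between read and reference. This sketch omits both.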