AAlign: A SIMD Framework for Pairwise Sequence Alignment on x86-Based Multi- and Many-Core Processors

Abstract

Pairwise sequence alignment algorithms, e.g., Smith-Waterman and Needleman-Wunsch, with adjustable gap penalty systems are widely used in bioinformatics. The strong data dependencies in these algorithms, however, prevent compilers from effectively auto-vectorizing them. When programmers manually vectorize them on multi- and many-core processors, two vectorizing strategies are usually considered, both of which initially ignore data dependencies and then correct the results in a subsequent stage: (1) iterate, which vectorizes and then compensates the scoring results with multiple rounds of corrections, and (2) scan, which vectorizes and then corrects the scoring results primarily via one round of parallel scan. However, manually writing such vectorized code efficiently is non-trivial, even for experts, and the code may not be portable across ISAs. In addition, even highly vectorized and optimized code may not achieve optimal performance, because the best vectorizing strategy depends on the algorithm, the configuration (gap system), and the input sequences. Therefore, we propose a framework called AAlign to automatically vectorize pairwise sequence alignment algorithms across ISAs. AAlign ingests sequential code (which follows our generalized paradigm for pairwise sequence alignment) and automatically generates efficient vector code for both iterate and scan. To reap the benefits of both vectorization strategies, we propose a hybrid mechanism in which AAlign automatically selects the best vectorizing strategy at runtime, no matter which algorithms, configurations, and input sequences are specified. On Intel Haswell and MIC, the generated codes for Smith-Waterman and Needleman-Wunsch achieve up to a 26-fold speedup over their sequential counterparts. Compared to highly optimized, multi-threaded sequence alignment tools, e.g., SWPS3 and SWAPHI, our codes deliver up to 2.5-fold and 1.6-fold speedups, respectively.
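As a rough illustration of the scan strategy mentioned in the abstract, the Python sketch below computes each row of the Smith-Waterman table in two steps: an element-wise step that ignores the dependency on the left neighbor, followed by a single prefix-max scan that corrects the row. This is not AAlign's generated code. It assumes a linear gap penalty for brevity (the paper covers adjustable gap systems), uses plain lists rather than SIMD registers, and the names `score`, `sw_sequential`, `sw_scan` and the scoring values are hypothetical choices made for this example.

```python
from itertools import accumulate


def score(x, y, match=2, mismatch=-1):
    """Hypothetical substitution score: reward a match, penalize a mismatch."""
    return match if x == y else mismatch


def sw_sequential(a, b, gap=1):
    """Reference Smith-Waterman best local score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # cur[j] depends on cur[j - 1]: this is the dependency that
            # blocks straightforward vectorization along the row.
            cur[j] = max(0,
                         prev[j - 1] + score(x, b[j - 1]),
                         prev[j] - gap,
                         cur[j - 1] - gap)
        best = max(best, max(cur))
        prev = cur
    return best


def sw_scan(a, b, gap=1):
    """Same score, computed row by row with a 'scan'-style correction."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        # Step 1: element-wise pass that ignores the left neighbor.
        # Every entry depends only on the previous row, so a SIMD
        # implementation could compute these entries in parallel.
        tmp = [0] + [max(0, prev[j - 1] + score(x, b[j - 1]), prev[j] - gap)
                     for j in range(1, len(b) + 1)]
        # Step 2: one prefix-max scan restores the horizontal dependency:
        # cur[j] = max over k <= j of (tmp[k] - (j - k) * gap).
        shifted = [t + k * gap for k, t in enumerate(tmp)]
        running = accumulate(shifted, max)
        cur = [r - j * gap for j, r in enumerate(running)]
        best = max(best, max(cur))
        prev = cur
    return best
```

Under these assumptions both functions return the same score for any pair of inputs, for example `sw_scan("ACACACTA", "AGCACACA") == sw_sequential("ACACACTA", "AGCACACA")`. The iterate strategy would instead repeat a correction pass over the row until no entry changes.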

Bibliographic details

Source
IEEE International Parallel and Distributed Processing Symposium, 2016, pp. 780-789 (10 pages)
Authors
Kaixi Hou; Hao Wang; Wu-Chun Feng
Original format: PDF
Keywords
Microwave integrated circuits; Program processors; Runtime; Writing; Central Processing Unit; Bioinformatics; Complexity theory;

Similar literature

1. Parallelizing and optimizing a bioinformatics pairwise sequence alignment algorithm for many-core architecture [J]. David Diaz, Francisco Jose Esteban, Pilar Hernandez. Parallel Computing, 2011, issue 4a5.
2. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments [J]. Jeff Daily. BMC Bioinformatics, 2016, issue 1.
3. Next-generation bioinformatics: using many-core processor architecture to develop a web service for sequence alignment [J]. Sergio Galvez, David Diaz, Pilar Hernandez. Bioinformatics, 2010, issue 5.
4. AAlign: A SIMD Framework for Pairwise Sequence Alignment on x86-Based Multi- and Many-Core Processors [C]. Kaixi Hou, Hao Wang, Wu-Chun Feng. IEEE International Parallel and Distributed Processing Symposium, 2016.
5. Generic C++ implementations of pairwise sequence alignment: Instantiation for global alignment [D]. Yan Zhang. 2003.
6. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments [O]. Jeff Daily. 2016.
7. Parallelizing and optimizing a bioinformatics pairwise sequence alignment algorithm for many-core architecture [O]. David Díaz, Francisco J. Esteban, Pilar Hernández Molina. 2014.
