AAlign: A SIMD Framework for Pairwise Sequence Alignment on x86-Based Multi-and Many-Core Processors


Abstract

Pairwise sequence alignment algorithms, e.g., Smith-Waterman and Needleman-Wunsch, with adjustable gap penalty systems are widely used in bioinformatics. The strong data dependencies in these algorithms, however, prevent compilers from effectively auto-vectorizing them. When programmers manually vectorize them on multi- and many-core processors, two vectorizing strategies are usually considered, both of which initially ignore data dependencies and then correct the results in a subsequent stage: (1) iterate, which vectorizes and then compensates the scoring results with multiple rounds of corrections, and (2) scan, which vectorizes and then corrects the scoring results primarily via one round of parallel scan. However, manually writing such vectorized code efficiently is non-trivial, even for experts, and the code may not be portable across ISAs. In addition, even highly vectorized and optimized code may not achieve optimal performance, because the best vectorizing strategy depends on the algorithm, the configuration (gap system), and the input sequences. Therefore, we propose a framework called AAlign to automatically vectorize pairwise sequence alignment algorithms across ISAs. AAlign takes sequential code (which follows our generalized paradigm for pairwise sequence alignment) and automatically generates efficient vector code for iterate and scan. To reap the benefits of both vectorization strategies, we propose a hybrid mechanism in which AAlign automatically selects the best vectorizing strategy at runtime, no matter which algorithms, configurations, and input sequences are specified. On Intel Haswell and MIC, the generated codes for Smith-Waterman and Needleman-Wunsch achieve up to a 26-fold speedup over their sequential counterparts. Compared to highly optimized and multi-threaded sequence alignment tools, e.g., SWPS3 and SWAPHI, our codes deliver up to 2.5-fold and 1.6-fold speedups, respectively.
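The data dependency mentioned in the abstract, and the idea of ignoring it first and correcting it with a scan, can be illustrated with a small sketch. The Python below is not AAlign's generated code and uses no SIMD instructions; the function names, the scoring values (`match`, `mismatch`, `gap_open`, `gap_extend`), and the prefix-maximum formulation are assumptions chosen for illustration. It computes the Smith-Waterman score with affine gaps twice: once with the usual cell-by-cell loop, and once row by row, where a first pass ignores gaps running along the row and a second pass restores them with a single prefix-maximum scan.

```python
from itertools import accumulate

NEG = float("-inf")

def sw_affine_sequential(query, subject, match=2, mismatch=-1,
                         gap_open=3, gap_extend=1):
    """Local alignment score; a gap of length k costs gap_open + (k-1)*gap_extend."""
    n = len(subject)
    h_prev, f_prev = [0] * (n + 1), [NEG] * (n + 1)
    best = 0
    for qc in query:
        h_cur, f_cur = [0] * (n + 1), [NEG] * (n + 1)
        e = NEG  # gap along the current row: depends on h_cur[j-1]
        for j in range(1, n + 1):
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            s = match if qc == subject[j - 1] else mismatch
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best

def sw_affine_row_scan(query, subject, match=2, mismatch=-1,
                       gap_open=3, gap_extend=1):
    """Same score, computed per row in two passes (assumes gap_open >= gap_extend)."""
    n = len(subject)
    h_prev, f_prev = [0] * (n + 1), [NEG] * (n + 1)
    best = 0
    for qc in query:
        # Pass 1: every column is independent of its left neighbour,
        # so this part is the one that maps naturally onto vector lanes.
        f_cur = [NEG] + [max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
                         for j in range(1, n + 1)]
        t = [0] + [max(0,
                       h_prev[j - 1] + (match if qc == subject[j - 1] else mismatch),
                       f_cur[j])
                   for j in range(1, n + 1)]
        # Pass 2: the row-wise gap term equals
        #   max over k < j of t[k] - gap_open - (j-1-k)*gap_extend,
        # which is a prefix maximum of t[k] + k*gap_extend.
        prefix = list(accumulate((t[k] + k * gap_extend for k in range(n)), max))
        h_cur = [0] + [max(t[j], prefix[j - 1] - gap_open - (j - 1) * gap_extend)
                       for j in range(1, n + 1)]
        best = max(best, max(h_cur))
        h_prev, f_prev = h_cur, f_cur
    return best
```

Calling both functions on the same pair of sequences, for example `sw_affine_sequential("ACGTACGT", "ACGTTACGT")` and `sw_affine_row_scan("ACGTACGT", "ACGTTACGT")`, should give the same score. The prefix maximum is written here with `itertools.accumulate`; on vector hardware the same step would be a parallel scan, which is the part a framework such as the one described would have to generate.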

Bibliographic details

Source: IEEE International Parallel and Distributed Processing Symposium | 2016 | 575p | 10 pages
Authors: Kaixi Hou; Hao Wang; Wu-Chun Feng
Original format: PDF
Chinese Library Classification: TP311.13-53
Keywords: Microwave integrated circuits; Program processors; Runtime; Writing; Central Processing Unit; Bioinformatics; Complexity theory

Similar documents

1. Parallelizing and optimizing a bioinformatics pairwise sequence alignment algorithm for many-core architecture [J]. David Diaz, Francisco Jose Esteban, Pilar Hernandez. Parallel Computing, 2011, No. 4-5.
2. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments [J]. Jeff Daily. BMC Bioinformatics, 2016, No. 1.
3. Next-generation bioinformatics: using many-core processor architecture to develop a web service for sequence alignment [J]. Galvez, Sergio; Diaz, David; Hernandez, Pilar. Bioinformatics, 2010, No. 5.
4. AAlign: A SIMD Framework for Pairwise Sequence Alignment on x86-Based Multi-and Many-Core Processors [C]. Kaixi Hou, Hao Wang, Wu-Chun Feng. IEEE International Parallel and Distributed Processing Symposium, 2016.
5. Generic C++ implementations of pairwise sequence alignment: Instantiation for global alignment [D]. Zhang, Yan. 2003.
6. Parasail: SIMD C library for global semi-global and local pairwise sequence alignments [O]. Jeff Daily. 2016.
7. Parallelizing and optimizing a bioinformatics pairwise sequence alignment algorithm for many-core architecture [O]. Díaz David, Esteban Francisco J., Hernández Molina Pilar. 2014.
