Conference: IEEE International Parallel and Distributed Processing Symposium

AAlign: A SIMD Framework for Pairwise Sequence Alignment on x86-Based Multi- and Many-Core Processors

Abstract

Pairwise sequence alignment algorithms, e.g., Smith-Waterman and Needleman-Wunsch, with adjustable gap penalty systems are widely used in bioinformatics. The strong data dependencies in these algorithms, however, prevent compilers from effectively auto-vectorizing them. When programmers manually vectorize them on multi- and many-core processors, two vectorizing strategies are usually considered, both of which initially ignore data dependencies and then correct for them in a subsequent stage: (1) iterate, which vectorizes and then compensates the scoring results with multiple rounds of corrections, and (2) scan, which vectorizes and then corrects the scoring results primarily via one round of parallel scan. However, writing such vectorized code by hand efficiently is non-trivial, even for experts, and the code may not be portable across ISAs. In addition, even highly vectorized and optimized code may not achieve optimal performance, because the best vectorizing strategy depends on the algorithm, the configuration (gap system), and the input sequences. Therefore, we propose a framework called AAlign to automatically vectorize pairwise sequence alignment algorithms across ISAs. AAlign ingests sequential code (which follows our generalized paradigm for pairwise sequence alignment) and automatically generates efficient vector code for both iterate and scan. To reap the benefits of both vectorization strategies, we propose a hybrid mechanism in which AAlign automatically selects the best vectorizing strategy at runtime, no matter which algorithms, configurations, and input sequences are specified. On Intel Haswell and MIC, the generated codes for Smith-Waterman and Needleman-Wunsch achieve up to a 26-fold speedup over their sequential counterparts. Compared to highly optimized, multi-threaded sequence alignment tools, e.g., SWPS3 and SWAPHI, our codes deliver up to 2.5-fold and 1.6-fold speedups, respectively.
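The two strategies named in the abstract can be illustrated with a small Python sketch. It is not AAlign's generated code, and it uses no SIMD intrinsics: list comprehensions stand in for vector lanes. All function names and the scoring defaults (`match`, `mismatch`, `gap_open`, `gap_ext`) are hypothetical choices for illustration. Here `gap_open` is the penalty for the first gap position and `gap_ext` the penalty for each further one, and the scan variant assumes `gap_open >= gap_ext`. Each row is first computed while ignoring the horizontal-gap dependency, then corrected either by repeated passes (iterate) or by one weighted prefix-max scan (scan).

```python
from itertools import accumulate

NEG = float("-inf")

def sw_reference(a, b, match=2, mismatch=-1, gap_open=3, gap_ext=1):
    """Sequential Smith-Waterman score with affine gaps, one cell at a time."""
    n = len(b)
    H_prev, E_prev = [0] * (n + 1), [NEG] * (n + 1)
    best = 0
    for ch in a:
        H, E = [0] * (n + 1), [NEG] * (n + 1)
        F = NEG  # horizontal gap: depends on the cell to the left in this row
        for j in range(1, n + 1):
            E[j] = max(E_prev[j] - gap_ext, H_prev[j] - gap_open)
            F = max(F - gap_ext, H[j - 1] - gap_open)
            s = match if ch == b[j - 1] else mismatch
            H[j] = max(0, H_prev[j - 1] + s, E[j], F)
            best = max(best, H[j])
        H_prev, E_prev = H, E
    return best

def fix_iterate(T, gap_open, gap_ext):
    """Iterate: repeat data-parallel correction passes until nothing changes."""
    n = len(T)
    H, F = T[:], [NEG] * n
    while True:
        # Every position reads only its left neighbor from the previous pass.
        F_new = [NEG] + [max(F[j - 1] - gap_ext, H[j - 1] - gap_open)
                         for j in range(1, n)]
        H_new = [max(t, f) for t, f in zip(T, F_new)]
        if H_new == H and F_new == F:
            return H
        H, F = H_new, F_new

def fix_scan(T, gap_open, gap_ext):
    """Scan: one weighted prefix-max pass (assumes gap_open >= gap_ext)."""
    n = len(T)
    # pref[k] = max over i <= k of T[i] + i * gap_ext
    pref = list(accumulate((T[k] + k * gap_ext for k in range(n)), max))
    return [T[0]] + [max(T[j], pref[j - 1] - gap_open - (j - 1) * gap_ext)
                     for j in range(1, n)]

def sw_vector_style(a, b, fix, match=2, mismatch=-1, gap_open=3, gap_ext=1):
    """Row-wise scoring: ignore the horizontal dependency, then apply `fix`."""
    n = len(b)
    H_prev, E_prev = [0] * (n + 1), [NEG] * (n + 1)
    best = 0
    for ch in a:
        # Both comprehensions are element-wise: no dependency inside the row.
        E = [NEG] + [max(E_prev[j] - gap_ext, H_prev[j] - gap_open)
                     for j in range(1, n + 1)]
        T = [0] + [max(0, H_prev[j - 1] + (match if ch == b[j - 1] else mismatch), E[j])
                   for j in range(1, n + 1)]
        H = fix(T, gap_open, gap_ext)  # correction stage
        best = max(best, max(H))
        H_prev, E_prev = H, E
    return best
```

Calling `sw_vector_style(a, b, fix_iterate)` or `sw_vector_style(a, b, fix_scan)` should give the same score as `sw_reference(a, b)`. In this sketch the iterate variant needs more passes when long horizontal gaps propagate across a row, while the scan variant always does a fixed amount of work per row; this is one way to see why the better strategy can depend on the gap system and the input sequences.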
