Efficient SIMD Code Generation for Irregular Kernels


Abstract

Array indirection poses several challenges for compilers that target single instruction, multiple data (SIMD) instructions. The main challenges are disjoint memory references, arbitrarily misaligned memory references, and dependence cycles in loops. Because of these challenges, existing SIMD compilers exclude loops with array indirection from their candidates for SIMD vectorization. Addressing them is nonetheless unavoidable, since many important compute-intensive applications use array indirection extensively to reduce memory and computation requirements. In this work, we propose a method to generate efficient SIMD code for loops containing indirect memory references. We extract both inter- and intra-iteration parallelism, taking data reorganization overhead into consideration. We also place data reorganization code optimally so that the reorganization overhead is amortized by the performance gain of SIMD vectorization. Experiments on four array indirection kernels, extracted from real-world scientific applications, show that the proposed method effectively generates SIMD code for irregular kernels with array indirection. Compared to existing SIMD vectorization methods, the proposed method improves the performance of irregular kernels by 91% on average.
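To make the terms in the abstract concrete, the following is a minimal sketch in C of a loop with array indirection and of the kind of data reorganization the abstract refers to. It is an illustration only, not the authors' method or one of their four kernels: the function names, the `LANES` constant, and the gather-then-compute structure are assumptions chosen for the example. The scalar version reads `x` through an index array, so consecutive iterations touch disjoint memory locations. The packed version first collects those elements into a contiguous buffer and then performs unit-stride arithmetic, which is the form a SIMD multiply expects.

```c
#include <stddef.h>

#define LANES 4  /* assumed vector width in floats; hypothetical */

/* Scalar reference: x is read through the index array idx. */
void gather_mul_scalar(float *y, const float *a, const float *x,
                       const int *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a[i] * x[idx[i]];
}

/* Packed variant: reorganize data first, then compute on contiguous data. */
void gather_mul_packed(float *y, const float *a, const float *x,
                       const int *idx, size_t n)
{
    size_t i = 0;
    for (; i + LANES <= n; i += LANES) {
        float packed[LANES];
        /* Data reorganization: collect disjoint elements of x. */
        for (size_t l = 0; l < LANES; l++)
            packed[l] = x[idx[i + l]];
        /* Unit-stride work that can map onto one SIMD multiply. */
        for (size_t l = 0; l < LANES; l++)
            y[i + l] = a[i + l] * packed[l];
    }
    /* Scalar epilogue for the leftover iterations. */
    for (; i < n; i++)
        y[i] = a[i] * x[idx[i]];
}
```

The gather loop is pure overhead relative to the scalar version, which is why the abstract stresses amortizing reorganization cost against the gain from vectorized arithmetic. This sketch avoids indirect writes, so it does not exhibit the dependence cycles the abstract also mentions.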

Bibliographic details

Source
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2012 | pp. 55-64 | 10 pages
Conference location: New Orleans, LA (US)
Authors
Seonggun Kim; Hwansoo Han;
Author affiliations

RP Core Group, Samsung Advanced Institute of Technology, Yongin 446-712, Korea;

School of Information and Communication Engineering, Sungkyunkwan University, Suwon 440-746, Korea;

Original format: PDF
Keywords
DFG-based vectorization; Irregular kernels; SIMD processors;

