Vector Folding: Improving Stencil Performance via Multi-dimensional SIMD-vector Representation

Abstract

Stencil computation is an important class of algorithms used in a large variety of scientific-simulation applications. Modern CPUs are employing increasingly longer SIMD vector registers and operations to improve computational throughput. However, the traditional use of vectors to contain sequential data elements along one dimension is not always the most efficient representation, especially in the multicore and hyper-threaded context where caches are shared among many simultaneous compute streams. This paper presents a general technique for representing data in vectors for 2D and 3D stencils. This method reduces the number of memory accesses required by storing a small multi-dimensional block of data in each vector compared to the single dimension in the traditional approach. Experiments on an Intel Xeon Phi Coprocessor show performance speedups over traditional vectors ranging from 1.2x to 2.7x, depending on the problem size and stencil type. This technique is independent of and complementary to a variety of existing stencil-computation tuning algorithms such as cache blocking, loop tiling, and wavefront parallelization.
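
To make the memory-access argument concrete, here is a minimal Python sketch that counts how many distinct aligned vector blocks must be read to produce one output vector, for different block ("fold") shapes. The function names `star_offsets` and `vectors_touched`, the 16-element vector size, and the radius-4 star stencil are illustrative assumptions, not details taken from the paper, and the block count is only a simplified proxy for memory accesses.

```python
from itertools import product


def star_offsets(radius):
    """Offsets of a 3D star stencil: the center plus `radius` points each way on every axis."""
    offsets = {(0, 0, 0)}
    for axis in range(3):
        for distance in range(1, radius + 1):
            for sign in (-1, 1):
                offset = [0, 0, 0]
                offset[axis] = sign * distance
                offsets.add(tuple(offset))
    return offsets


def vectors_touched(fold, offsets):
    """Count the distinct aligned vector blocks read to compute one output vector block.

    `fold` is the (x, y, z) shape of the block of elements stored in one vector.
    """
    fx, fy, fz = fold
    blocks = set()
    for x, y, z in product(range(fx), range(fy), range(fz)):
        for dx, dy, dz in offsets:
            # Floor division maps an element coordinate to the index of its aligned block.
            blocks.add(((x + dx) // fx, (y + dy) // fy, (z + dz) // fz))
    return len(blocks)


if __name__ == "__main__":
    stencil = star_offsets(4)  # assumed example: 25-point star stencil
    for fold in [(16, 1, 1), (4, 4, 1), (4, 2, 2)]:  # assumed 16 elements per vector
        print(fold, vectors_touched(fold, stencil))
```

Under these assumptions the one-dimensional layout (16, 1, 1) touches 19 vector blocks per output vector, while the folded layouts (4, 4, 1) and (4, 2, 2) touch 13 and 11. This illustrates the direction of the effect the abstract describes; it does not reproduce the paper's measurements.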

Bibliographic record

Source
2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, 2015 IEEE 12th International Conference on Embedded Software and Systems | 2015 | pp. 865-870 | 6 pages
Conference location: New York, NY (US)
Author
Yount, Charles
Original format: PDF
Language: English
Keywords
data structures; multiprocessing systems; parallel processing; CPU; Intel Xeon Phi Coprocessor; hyper-threaded context; memory access; multidimensional SIMD-vector representation; multidimensional block; scientific-simulation application; sequential data element; stencil computation; stencil performance; vector folding; Jacobian matrices; Layout; Memory management; Registers; Shape; Three-dimensional displays; Intel; SIMD; Xeon Phi; high-performance computing; stencil; vector folding; vectorization;
Indexed: 2022-08-26 13:53:55
