Conference: International Conference on Parallel Architectures and Compilation Techniques

Combining analytical and empirical approaches in tuning matrix transposition

Abstract

Matrix transposition is an important kernel used in many applications. Even though its optimization has been the subject of many studies, an optimization procedure that targets the characteristics of current processor architectures has not been developed. In this paper, we develop an integrated optimization framework that addresses a number of issues, including tiling for the memory hierarchy, effective handling of memory misalignment, utilizing memory subsystem characteristics, and the exploitation of the parallelism provided by the vector instruction sets in current processors. A judicious combination of analytical and empirical approaches is used to determine the most appropriate optimizations. The absence of problem information until execution time is handled by generating multiple versions of the code; the best version is chosen at runtime, with assistance from minimal-overhead inspectors. The approach highlights aspects of empirical optimization that are important for similar computations with little temporal reuse. Experimental results on PowerPC G5 and Intel Pentium 4 demonstrate the effectiveness of the developed framework.
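
To make the tiling and multi-version ideas in the abstract concrete, here is a minimal sketch. It is not the paper's implementation. The function names, the default tile size of 32, and the size-based selection rule are all assumptions chosen for illustration. The abstract's vectorization and misalignment handling are not modeled here.

```python
def transpose_naive(a, rows, cols):
    """Transpose a row-major rows x cols matrix stored as a flat list."""
    out = [0] * (rows * cols)
    for i in range(rows):
        for j in range(cols):
            out[j * rows + i] = a[i * cols + j]
    return out


def transpose_tiled(a, rows, cols, tile=32):
    """Same result as transpose_naive, but visits the matrix tile by tile.

    Each tile x tile block is read and written before moving on, so the
    working set stays small. The tile size is an assumed tuning parameter.
    """
    out = [0] * (rows * cols)
    for ii in range(0, rows, tile):
        i_end = min(ii + tile, rows)          # clip the last partial tile
        for jj in range(0, cols, tile):
            j_end = min(jj + tile, cols)
            for i in range(ii, i_end):
                for j in range(jj, j_end):
                    out[j * rows + i] = a[i * cols + j]
    return out


def transpose(a, rows, cols, tile=32):
    """Hypothetical inspector: choose a code version from the problem size,
    which is only known at run time."""
    if rows <= tile and cols <= tile:
        return transpose_naive(a, rows, cols)  # fits in one tile
    return transpose_tiled(a, rows, cols, tile)
```

In a tuned implementation the tile size and the selection rule would come from analytical models and empirical search on the target machine. The fixed threshold above only stands in for that step.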
