Published in: IEEE Transactions on Parallel and Distributed Systems

A cost-effective implementation of multilevel tiling


Abstract

This paper presents a new cost-effective algorithm to compute exact loop bounds when multilevel tiling is applied to a loop nest whose bounds are affine functions (a nonrectangular loop nest). Traditionally, exact loop bounds have not been computed because the complexity of doing so is doubly exponential in the number of loops in the multilevel tiled code and can therefore be extremely time consuming for certain classes of loops (i.e., nonrectangular loop nests). Although computing exact loop bounds is not very important when tiling only for cache levels, it is critical when tiling includes the register level. This paper presents an efficient implementation of multilevel tiling that computes exact loop bounds and has a much lower complexity than conventional techniques. To achieve this lower complexity, our technique deals simultaneously with all levels to be tiled, rather than applying tiling level by level as is usually done. For loop nests having very simple affine functions as bounds, results show that our method is between 15 and 28 times faster than conventional techniques. For loop nests with less simple bounds, we have measured speedups as high as 2,300. Additionally, our technique allows redundant bounds to be eliminated efficiently. Results show that eliminating redundant bounds in our method is between 22 and 11 times faster than in conventional techniques for typical linear algebra programs.
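
To make the term "exact loop bounds" concrete, the following minimal sketch tiles a triangular loop nest (0 <= j <= i < n) at a single level. It is an illustration only and is not the algorithm from the paper: the function names, the tile size `b`, and the hand-derived bounds are all assumptions introduced here. With exact bounds, the tile loops enumerate only tiles that contain iterations and the element loops need no guard. A bounding-box tiling, by contrast, visits empty tiles and needs an `if` inside the innermost loop.

```python
def original(n):
    """Untiled triangular nest: 0 <= j <= i < n."""
    return [(i, j) for i in range(n) for j in range(i + 1)]


def tiled_exact(n, b):
    """One-level tiling with exact bounds (hand-derived for this nest).

    Tile origins ii and jj are multiples of b, so a tile intersects the
    triangle exactly when jj <= ii. No guard is needed in the body.
    """
    visited = []
    for ii in range(0, n, b):
        for jj in range(0, ii + 1, b):            # only non-empty tiles
            for i in range(ii, min(ii + b, n)):
                for j in range(jj, min(jj + b, i + 1)):
                    visited.append((i, j))
    return visited


def tiled_bounding_box(n, b):
    """One-level tiling over the rectangular bounding box.

    Every tile of the n-by-n box is visited, and a guard filters out
    iterations that fall outside the triangle.
    """
    visited = []
    empty_tiles = 0
    for ii in range(0, n, b):
        for jj in range(0, n, b):
            before = len(visited)
            for i in range(ii, min(ii + b, n)):
                for j in range(jj, min(jj + b, n)):
                    if j <= i:                    # guard in the inner loop
                        visited.append((i, j))
            if len(visited) == before:
                empty_tiles += 1
    return visited, empty_tiles


if __name__ == "__main__":
    n, b = 10, 4
    exact = tiled_exact(n, b)
    boxed, empty = tiled_bounding_box(n, b)
    print(sorted(exact) == sorted(original(n)))   # same iteration set
    print(sorted(boxed) == sorted(original(n)))   # same iteration set
    print(empty)                                  # tiles with no work
```

Both versions execute the same set of iterations; the difference lies in the control overhead. A guard in the innermost loop is particularly costly when that loop is meant to be fully unrolled for register reuse, which is consistent with the abstract's remark that exact bounds matter most at the register level.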
