Published in: ACM Transactions on Graphics

Decoupling Algorithms from Schedules for Easy Optimization of Image Processing Pipelines

Abstract

Using existing programming tools, writing high-performance image processing code requires sacrificing readability, portability, and modularity. We argue that this is a consequence of conflating the computations that define the algorithm with decisions about storage and the order of computation. We refer to these latter two concerns as the schedule, which includes choices of tiling, fusion, recomputation vs. storage, vectorization, and parallelism. We propose a representation for feed-forward imaging pipelines that separates the algorithm from its schedule, enabling high performance without sacrificing code clarity. This decoupling simplifies the algorithm specification: images and intermediate buffers become functions over an infinite integer domain, with no explicit storage or boundary conditions. Imaging pipelines are compositions of functions. Programmers separately specify scheduling strategies for the various functions composing the algorithm, which allows them to efficiently explore different optimizations without changing the algorithmic code. We demonstrate the power of this representation by expressing a range of recent image processing applications in an embedded domain-specific language called Halide, and compiling them for ARM, x86, and GPUs. Our compiler targets SIMD units, multiple cores, and complex memory hierarchies. We demonstrate that it can handle algorithms such as a camera raw pipeline, the bilateral grid, fast local Laplacian filtering, and image segmentation. The algorithms expressed in our language are both shorter and faster than state-of-the-art implementations.
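The separation described in the abstract can be illustrated with a small sketch. The code below is plain Python, not Halide and not the authors' implementation; the function `make_pipeline` and the schedule names `"inline"` and `"root"` are hypothetical, chosen only for this example. It assumes a two-stage 3x3 box blur as the algorithm and treats the input as a function over all integer coordinates by clamping to the nearest edge pixel. The only scheduling choice modeled is recomputation vs. storage of the intermediate stage.

```python
from functools import lru_cache


def make_pipeline(image, schedule):
    """Build a two-stage box blur over `image` (a list of rows).

    `schedule` is "inline" (recompute the intermediate stage at every use)
    or "root" (store each intermediate value the first time it is computed).
    Both names are hypothetical labels for this sketch.
    """
    h, w = len(image), len(image[0])
    calls = {"blur_x": 0}  # counts how often the intermediate stage is evaluated

    # The input as a function over the whole integer grid: coordinates
    # outside the image clamp to the nearest edge pixel (an assumption).
    def inp(x, y):
        return image[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    # Algorithm, stage 1: horizontal 3-tap average. No storage is mentioned.
    def blur_x(x, y):
        calls["blur_x"] += 1
        return (inp(x - 1, y) + inp(x, y) + inp(x + 1, y)) / 3

    # Schedule: decide whether stage 1 is recomputed or stored.
    # The algorithm definitions above and below do not change.
    stage = lru_cache(maxsize=None)(blur_x) if schedule == "root" else blur_x

    # Algorithm, stage 2: vertical 3-tap average of stage 1.
    def blur_y(x, y):
        return (stage(x, y - 1) + stage(x, y) + stage(x, y + 1)) / 3

    def realize():
        # Evaluate the output function over the image's own extent.
        return [[blur_y(x, y) for x in range(w)] for y in range(h)]

    return realize, calls


image = [[float(4 * y + x) for x in range(4)] for y in range(4)]

realize_inline, calls_inline = make_pipeline(image, "inline")
realize_root, calls_root = make_pipeline(image, "root")

out_inline = realize_inline()
out_root = realize_root()

print(out_inline == out_root)                       # same result under both schedules
print(calls_inline["blur_x"], calls_root["blur_x"])  # different amounts of work
```

On this 4x4 input, both schedules produce identical output, while the inline schedule evaluates the intermediate stage 48 times and the stored schedule 24 times. A real schedule would also cover tiling, vectorization, and parallelism, which this sketch leaves out.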
