Conference: 2017 IEEE 23rd Symposium on High Performance Computer Architecture

Needle: Leveraging Program Analysis to Analyze and Extract Accelerators from Whole Programs


Abstract

Technology constraints have increasingly led to the adoption of specialized coprocessors, i.e., hardware accelerators. The first challenge that computer architects encounter is identifying "what to specialize in the program". We demonstrate that this requires precise enumeration of program paths based on dynamic program behavior. We hypothesize that path-based [4] accelerator offloading leads to good coverage of dynamic instructions and improves energy efficiency. Unfortunately, hot paths across programs demonstrate diverse control flow behavior. Accelerators (typically based on dataflow execution) often lack energy-efficient, complexity-effective, and high-performance (e.g., branch prediction) support for control flow. We have developed NEEDLE, an LLVM-based compiler framework that leverages dynamic profile information to identify, merge, and offload acceleratable paths from whole applications. NEEDLE derives insight into what code coverage (and consequently energy reduction) an accelerator can achieve. We also develop a novel program abstraction for offload called Braid, which merges common code regions across different paths to improve the coverage of the accelerator while trading off the increase in dataflow size. This enables coarse-grained offloading, reducing interaction with the host CPU core. To prepare the Braids and paths for acceleration, NEEDLE generates software frames. Software frames enable energy-efficient speculative execution on accelerators. They are independent of the accelerator microarchitecture and support speculative execution, including memory operations. NEEDLE is automated and has been used to analyze 225K paths across 29 workloads. It filtered and ranked 154K paths for acceleration across unmodified SPEC, PARSEC and PERFECT workload suites. We target NEEDLE's offload regions toward a CGRA and demonstrate 34% performance and 20% energy improvement.
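
The ranking and merging ideas in the abstract can be illustrated with a small sketch. It is not NEEDLE's implementation: the `Path` record, the per-block instruction counts, and the helper names are hypothetical, and grouping paths by shared entry and exit blocks is an assumed merge criterion that the abstract does not state. The sketch ranks profiled paths by the share of dynamic instructions they cover, then merges paths into Braid-like regions and reports the coverage gained against the static size of the merged region. It assumes a non-empty profile.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """Hypothetical profile record: one acyclic path and how often it ran."""
    blocks: tuple  # basic-block ids in execution order
    count: int     # number of times the profile observed this path


def path_weight(path, block_size):
    """Dynamic instructions attributed to one path."""
    return path.count * sum(block_size[b] for b in path.blocks)


def rank_paths(paths, block_size):
    """Return (path, coverage) pairs, hottest path first."""
    total = sum(path_weight(p, block_size) for p in paths)
    ranked = sorted(paths, key=lambda p: path_weight(p, block_size), reverse=True)
    return [(p, path_weight(p, block_size) / total) for p in ranked]


def merge_into_braids(paths, block_size):
    """Merge paths that share entry and exit blocks (assumed criterion)."""
    groups = defaultdict(list)
    for p in paths:
        groups[(p.blocks[0], p.blocks[-1])].append(p)

    total = sum(path_weight(p, block_size) for p in paths)
    braids = []
    for (entry, exit_block), members in groups.items():
        blocks = set()
        for p in members:
            blocks.update(p.blocks)
        braids.append({
            "entry": entry,
            "exit": exit_block,
            "blocks": blocks,
            # Coverage adds up over member paths ...
            "coverage": sum(path_weight(p, block_size) for p in members) / total,
            # ... while shared blocks are counted once in the merged region.
            "size": sum(block_size[b] for b in blocks),
        })
    return sorted(braids, key=lambda b: b["coverage"], reverse=True)


if __name__ == "__main__":
    # Made-up example data, not taken from the paper.
    block_size = {"A": 4, "B": 6, "C": 2, "D": 3, "E": 5}
    paths = [
        Path(("A", "B", "D"), 70),
        Path(("A", "C", "D"), 20),
        Path(("E",), 10),
    ]
    for path, coverage in rank_paths(paths, block_size):
        print(path.blocks, round(coverage, 3))
    for braid in merge_into_braids(paths, block_size):
        print(sorted(braid["blocks"]), round(braid["coverage"], 3), braid["size"])
```

In this toy profile the two paths from A to D share blocks A and D, so the merged region covers more dynamic instructions than either path alone while growing by only the one extra block, which is the coverage versus dataflow-size trade-off the abstract describes.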
