Published in: ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation (PLDI'02)

A compiler approach to fast hardware design space exploration in FPGA-based systems


Abstract

The current practice of mapping computations to custom hardware implementations requires programmers to assume the role of hardware designers. In tuning the performance of their hardware implementation, designers manually apply loop transformations. For example, loop unrolling is used to expose instruction-level parallelism at the expense of more hardware resources for concurrent operator evaluation. Because unrolling also increases the amount of data a computation requires, too much unrolling can lead to a memory-bound implementation in which compute resources sit idle. To negotiate the inherent hardware space-time trade-offs, designers must engage in an iterative refinement cycle, at each step manually applying transformations and evaluating their impact. This process is not only error-prone and tedious but also prohibitively expensive, given the large search spaces and long synthesis times. This paper describes an automated approach to hardware design space exploration, through a collaboration between parallelizing compiler technology and high-level synthesis tools. We present a compiler algorithm that automatically explores the large design spaces resulting from the application of several program transformations commonly used in application-specific hardware designs. Our approach uses synthesis estimation techniques to quantitatively evaluate alternative designs for a loop nest computation. We have implemented this design space exploration algorithm in the context of a compilation and synthesis system called DEFACTO, and present results of this implementation on five multimedia kernels. Our algorithm derives an implementation that closely matches the performance of the fastest design in the design space and, among implementations with comparable performance, selects the smallest design. We search on average only 0.3% of the design space. This technology thus significantly raises the level of abstraction for hardware design and explores a design space much larger than is feasible for a human designer.
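The selection criterion stated in the abstract (match the fastest design, then prefer the smallest among comparable ones) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the `estimate` cost model, its parameters, and the 5% tolerance are hypothetical and are not taken from DEFACTO or its estimation tools. The sketch also enumerates every candidate unroll factor, whereas the paper's algorithm visits only a small fraction of the design space.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    unroll: int    # unroll factor applied to the loop
    cycles: float  # estimated execution time in cycles
    area: int      # estimated hardware area, arbitrary units


def estimate(unroll, iterations=1024, words_per_iter=2,
             mem_words_per_cycle=4, area_per_copy=100):
    """Hypothetical stand-in for a synthesis estimator (not a real tool)."""
    # Assume one unrolled loop body completes per cycle.
    compute_cycles = iterations / unroll
    # Assume memory bandwidth is fixed, so data transfer time does not shrink.
    memory_cycles = iterations * words_per_iter / mem_words_per_cycle
    # The slower of the two bounds the design: past some unroll factor the
    # design becomes memory bound and extra operators sit idle.
    cycles = max(compute_cycles, memory_cycles)
    return Estimate(unroll, cycles, unroll * area_per_copy)


def select_design(candidates, tolerance=0.05):
    """Pick the smallest design whose time is within `tolerance` of the fastest."""
    fastest = min(e.cycles for e in candidates)
    comparable = [e for e in candidates if e.cycles <= fastest * (1 + tolerance)]
    return min(comparable, key=lambda e: e.area)


designs = [estimate(u) for u in (1, 2, 4, 8, 16)]
best = select_design(designs)
print(best)
```

Under this made-up model, unrolling beyond a factor of 2 adds area without reducing execution time, so the selection settles on the smallest design that already saturates memory bandwidth.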