Efficient Hardware Acceleration on SoC-FPGA with OpenCL

Abstract

Field Programmable Gate Arrays (FPGAs) are increasingly taking over from conventional processors in the field of high-performance computing. With advances in FPGA architectures and high-level synthesis tools, FPGAs can now be easily used to accelerate computationally intensive applications such as AI and cognitive computing. One advantage of raising the level of hardware design abstraction is that multiple configurations with unique properties (i.e., area, performance and power) can be generated automatically without rewriting the input description. This is not possible with traditional low-level hardware description languages like VHDL or Verilog. This thesis addresses this topic: it accelerates multiple computationally intensive applications that are amenable to hardware acceleration, and it proposes a fast heuristic Design Space Exploration method to find dominant design solutions quickly.

In particular, in this work, we developed different computationally intensive applications in OpenCL and mapped them onto a heterogeneous SoC-FPGA. We also developed a Genetic Algorithm (GA) based meta-heuristic that performs automatic Design Space Exploration (DSE) on these applications, since GAs have been shown in the past to lead to good results in multi-objective optimization problems like this one. The developed explorer automatically inserts a set of control knobs into the OpenCL behavioral description, e.g., to control how loops are synthesized (unrolled or not) and to replicate Compute Units (CUs). By assigning different values to these control attributes, thousands of different micro-architecture configurations can be obtained. Thus, an exhaustive search is generally not feasible and heuristics are needed. Each configuration is compiled using the Altera OpenCL SDK and executed on a Terasic DE1-SoC FPGA board to record the corresponding performance and logic utilization.

To measure the quality of the proposed GA-based heuristic, each application is also explored exhaustively (which takes multiple days even for the smaller designs) to find the dominant solutions (Pareto-optimal designs). For larger and more complex designs, exhaustive exploration is not feasible because the design space is too large. The comparison is quantified using metrics such as Dominance, Average Distance from Reference Set (ADRS) and run-time speedup, showing that the proposed heuristic leads to very good results in a fraction of the time of the exhaustive search.
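
The comparison described in the abstract can be illustrated with a small sketch. It assumes two objectives that are both minimized (logic utilization and run time), and it uses one commonly cited formulation of ADRS; the abstract does not give the exact formula used in the thesis, so the definition here is an assumption. All function names and numbers are hypothetical and made up for illustration.

```python
from typing import List, Tuple

# A design point is (logic_utilization, run_time); both are minimized.
# The objective pair and all values below are illustrative assumptions.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b in both objectives and differs from b."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates."""
    unique = sorted(set(designs))
    return [d for d in unique if not any(dominates(o, d) for o in unique)]


def adrs(reference: List[Design], approx: List[Design]) -> float:
    """Average Distance from Reference Set (one common formulation).

    For each reference point, take the smallest relative shortfall of any
    approximate point, then average over the reference set. 0.0 means the
    approximate front matches or covers the reference front.
    """
    def gap(r: Design, a: Design) -> float:
        return max(0.0, (a[0] - r[0]) / r[0], (a[1] - r[1]) / r[1])

    return sum(min(gap(r, a) for a in approx) for r in reference) / len(reference)


# Made-up results: an exhaustive sweep and a heuristic run on one application.
exhaustive = [(10.0, 100.0), (20.0, 60.0), (40.0, 30.0), (30.0, 80.0), (50.0, 30.0)]
heuristic = [(10.0, 100.0), (22.0, 60.0), (40.0, 30.0)]

reference_front = pareto_front(exhaustive)
heuristic_front = pareto_front(heuristic)
score = adrs(reference_front, heuristic_front)
print(reference_front, heuristic_front, score)
```

A lower ADRS means the heuristic's front lies closer to the exhaustive (reference) front. In this toy data the heuristic misses one reference design by 10% in logic utilization and matches the other two.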

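Bibliographic record

  • Author

    Gogineni, Susmitha

  • Author affiliation

    The University of Texas at Dallas

  • Degree-granting institution: The University of Texas at Dallas
  • Subject: Electrical engineering; Computer science
  • Degree: M.S.E.E.
  • Year: 2017
  • Pages: 68 p.
  • Total pages: 68
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Rehabilitation medicine
  • Keywords

  • Date indexed: 2022-08-17 11:38:47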
