
A Memory Profiling Framework for Stencil Computation on an FPGA Accelerator with High Level Synthesis


Abstract

In this paper, we propose a framework to assist memory access optimization for stencil computation on an FPGA accelerator. Since stencil computations such as scientific simulations need large amounts of data, efficient memory access is key to achieving high performance on FPGA accelerators. We therefore implemented a stencil computation framework with a memory performance profiler on MaxCompiler, a high-level synthesis system. The memory profiler enables us to measure the clock cycles spent in each memory controller state: data transfer, stall, and idle. We also implemented simple stencil computations and practical FDTD electromagnetic field simulations on top of the framework, with various parameters, to evaluate and analyze memory performance. Execution experiments with the simple stencil computations on a MAX34245A Data Flow Engine demonstrated that approximately 70% of the peak memory performance could be achieved for various stencil types. On the other hand, the FDTD simulations, which need many data streams, could not reach this memory performance saturation point because of the increasing complexity of the memory controller modules. Analysis of the evaluation results obtained with our memory performance profiling framework suggests a promising memory access optimization approach for stencil computations, in which the complexity of the memory controller is traded off against data access traffic.
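To make the profiler's three states concrete, the following is a minimal sketch of how per-state cycle counts could be turned into utilization figures. It is not the authors' implementation: the `MemoryProfile` class, its field names, and the cycle counts are hypothetical, and it assumes that effective bandwidth scales linearly with the fraction of cycles spent transferring data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryProfile:
    """Cycle counts per memory controller state (hypothetical container)."""

    transfer_cycles: int
    stall_cycles: int
    idle_cycles: int

    @property
    def total_cycles(self) -> int:
        # Assumes the three states are mutually exclusive and cover every cycle.
        return self.transfer_cycles + self.stall_cycles + self.idle_cycles

    def fractions(self) -> dict[str, float]:
        """Share of all profiled cycles spent in each state."""
        total = self.total_cycles
        if total == 0:
            raise ValueError("profile contains no cycles")
        return {
            "transfer": self.transfer_cycles / total,
            "stall": self.stall_cycles / total,
            "idle": self.idle_cycles / total,
        }

    def effective_bandwidth(self, peak_bandwidth: float) -> float:
        """Peak bandwidth scaled by the transfer fraction (simplifying assumption)."""
        return peak_bandwidth * self.fractions()["transfer"]


# Illustrative counts only; these are not measurements from the paper.
profile = MemoryProfile(transfer_cycles=600, stall_cycles=300, idle_cycles=100)
shares = profile.fractions()
```

Under this simplified model, a large stall share would point at contention inside the memory controller, while a large idle share would suggest that the compute kernel is not issuing requests fast enough.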

Bibliographic details

  • Source
    Computer architecture news | 2014, No. 4 | pp. 69-74 | 6 pages
  • Author affiliations

    Graduate School of Engineering Nagasaki University, Japan;

    Graduate School of Engineering Nagasaki University, Japan;

    Graduate School of Engineering Nagasaki University, Japan;

    Graduate School of Engineering Nagasaki University, Japan;

    Graduate School of Engineering Nagasaki University, Japan;

  • Original format: PDF
  • Language: English