Journal of Electronic Imaging

Design of a pseudo-log image transform hardware accelerator in a high-level synthesis-based memory management framework



Abstract

The pseudo-log image transform belongs to a class of image processing kernels that generate memory references which are nonlinear functions of loop indices. Due to the nonlinearity of the memory references, the usual design methodologies do not allow efficient hardware implementation for nonlinear kernels. For optimized hardware implementation, these kernels require the creation of a customized memory hierarchy and efficient data/memory management strategy. We present the design and real-time hardware implementation of a pseudo-log image transform IP (hardware image processing engine) using a memory management framework. The framework generates a controller which efficiently manages input data movement in the form of tiles between off-chip main memory, on-chip memory, and the core processing unit. The framework can jointly optimize the memory hierarchy and the tile computation schedule to reduce on-chip memory requirements, to maximize throughput, and to increase data reuse for reducing off-chip memory bandwidth requirements. The algorithmic C++ description of the pseudo-log kernel is profiled in the framework to generate an enhanced description with a customized memory hierarchy. The enhanced description of the kernel is then used for high-level synthesis (HLS) to perform architectural design space exploration in order to find an optimal implementation under given performance constraints. The optimized register transfer level implementation of the IP generated after HLS is used for performance estimation. The performance estimation is done in a simulation framework to characterize the IP with different external off-chip memory latencies and a variety of data transfer policies. Experimental results show that the designed IP can be used for real-time implementation and that the generated memory hierarchy is capable of feeding the IP with a sufficiently high bandwidth even in the presence of long external memory latencies. (C) 2014 SPIE and IS&T
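To make the phrase "memory references which are nonlinear functions of loop indices" concrete, here is a minimal C++ sketch. It is not the paper's transform or its framework: the quadratic map `nonlinear_index`, the helper names `gather` and `tile_fetch_order`, and the square tile size are all assumptions chosen only for illustration. The sketch shows a gather kernel whose source address is a nonlinear function of the loop indices, and then lists which input tiles that kernel touches, in order of first use, which is the kind of information a tile-based memory controller would need.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a nonlinear address map (NOT the pseudo-log
// formula, which the abstract does not give): index i maps to i*i, clamped
// to the last valid position. Requires extent >= 1.
inline std::size_t nonlinear_index(std::size_t i, std::size_t extent) {
    assert(extent > 0);
    return std::min(i * i, extent - 1);
}

// Gather kernel: every output pixel reads one source pixel whose address is
// a nonlinear function of the loop indices (u, v).
std::vector<std::uint8_t> gather(const std::vector<std::uint8_t>& src,
                                 std::size_t src_w, std::size_t src_h,
                                 std::size_t out_w, std::size_t out_h) {
    std::vector<std::uint8_t> out(out_w * out_h);
    for (std::size_t v = 0; v < out_h; ++v) {
        const std::size_t y = nonlinear_index(v, src_h);
        for (std::size_t u = 0; u < out_w; ++u) {
            const std::size_t x = nonlinear_index(u, src_w);
            out[v * out_w + u] = src[y * src_w + x];
        }
    }
    return out;
}

using Tile = std::pair<std::size_t, std::size_t>;  // (tile row, tile column)

// Profiles the same loop nest and returns the distinct square input tiles it
// touches, in order of first use. A linear search keeps the sketch short.
std::vector<Tile> tile_fetch_order(std::size_t src_w, std::size_t src_h,
                                   std::size_t out_w, std::size_t out_h,
                                   std::size_t tile) {
    std::vector<Tile> order;
    for (std::size_t v = 0; v < out_h; ++v) {
        const std::size_t y = nonlinear_index(v, src_h);
        for (std::size_t u = 0; u < out_w; ++u) {
            const std::size_t x = nonlinear_index(u, src_w);
            const Tile t{y / tile, x / tile};
            if (std::find(order.begin(), order.end(), t) == order.end()) {
                order.push_back(t);
            }
        }
    }
    return order;
}
```

For a 16x16 source, a 4x4 output, and 4x4 tiles, the 16 output pixels reference only 9 distinct tiles, and several consecutive references fall in the same tile. This is the data reuse that an on-chip tile buffer can exploit to cut off-chip traffic.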
