Conference: IEEE Vehicular Technology Conference

Analysis and Implementation of the Semi-Global Matching 3D Vision Algorithm Using Code Transformations and High-Level Synthesis

Abstract

High-level synthesis (HLS) offers several advantages, such as faster simulation run-time and better design re-use, thanks to the higher level of abstraction. This work uses HLS to implement the Semi-Global Matching (SGM) algorithm, which is frequently used in stereo vision systems, e.g. for automotive applications. The hardware implementation is based on a Xilinx Virtex 7 FPGA. The initial algorithmic “golden” model used very large arrays, which had to be mapped to an external DRAM and brought into the on-chip RAM of the FPGA on demand. This required both adding the memory transfer loops and inserting calls to the AXI transactors that access the DRAM through the on-chip DDR slave. Moreover, the initial single-threaded algorithm had to be parallelized, by converting the top-level sweeps of the image in eight directions into as many threads. The access to the DRAM was then managed with a centralized controller. This modified SystemC design proved to be suitable to achieve the target real-time performance. The design space was thus explored by making several fairly different micro-architectural choices. In the end, it was possible to obtain an implementation which is comparable to a very efficient (and hence very inflexible) manual RTL design that had been previously developed, including a very sophisticated fine-grained management of data and computation.
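
To make the "sweeps of the image in eight directions" concrete, here is a minimal software sketch of the standard SGM path-cost recurrence in plain C++. It is not the paper's SystemC design: the `CostVolume` type, the function names, and the penalty values `kP1` and `kP2` are all assumptions chosen for illustration, and the DRAM transfers and AXI accesses mentioned in the abstract are left out entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative smoothness penalties; the abstract does not give real values.
constexpr int kP1 = 10;   // penalty for a disparity change of +/-1
constexpr int kP2 = 120;  // penalty for any larger disparity jump

// Dense cost volume indexed by (x, y, disparity). Hypothetical helper type.
struct CostVolume {
    int width, height, numDisp;
    std::vector<int> data;

    CostVolume(int w, int h, int d)
        : width(w), height(h), numDisp(d),
          data(static_cast<std::size_t>(w) * h * d, 0) {
        assert(w > 0 && h > 0 && d > 0);
    }
    std::size_t index(int x, int y, int d) const {
        return (static_cast<std::size_t>(y) * width + x) * numDisp + d;
    }
    int& at(int x, int y, int d) { return data[index(x, y, d)]; }
    int at(int x, int y, int d) const { return data[index(x, y, d)]; }
};

// Aggregates matching costs along one direction (dx, dy) and adds the
// resulting path costs into `sum`. The predecessor of (x, y) is (x-dx, y-dy).
void aggregatePath(const CostVolume& cost, int dx, int dy, CostVolume& sum) {
    const int W = cost.width, H = cost.height, D = cost.numDisp;
    CostVolume path(W, H, D);
    for (int i = 0; i < H; ++i) {
        // Visit rows and columns so that the predecessor is always done first.
        const int y = (dy >= 0) ? i : H - 1 - i;
        for (int j = 0; j < W; ++j) {
            const int x = (dx >= 0) ? j : W - 1 - j;
            const int px = x - dx, py = y - dy;
            const bool hasPrev = px >= 0 && px < W && py >= 0 && py < H;
            int prevMin = 0;
            if (hasPrev) {
                prevMin = path.at(px, py, 0);
                for (int k = 1; k < D; ++k)
                    prevMin = std::min(prevMin, path.at(px, py, k));
            }
            for (int d = 0; d < D; ++d) {
                int value = cost.at(x, y, d);
                if (hasPrev) {
                    int best = path.at(px, py, d);
                    if (d > 0) best = std::min(best, path.at(px, py, d - 1) + kP1);
                    if (d + 1 < D) best = std::min(best, path.at(px, py, d + 1) + kP1);
                    best = std::min(best, prevMin + kP2);
                    value += best - prevMin;  // subtract prevMin to bound growth
                }
                path.at(x, y, d) = value;
                sum.at(x, y, d) += value;
            }
        }
    }
}

// Sums the path costs of all eight directions. Each call only reads `cost`,
// so the eight sweeps are independent apart from the final accumulation.
CostVolume aggregateAllPaths(const CostVolume& cost) {
    static const int kDirs[8][2] = {{1, 0}, {-1, 0}, {0, 1},  {0, -1},
                                    {1, 1}, {-1, 1}, {1, -1}, {-1, -1}};
    CostVolume sum(cost.width, cost.height, cost.numDisp);
    for (const auto& dir : kDirs) aggregatePath(cost, dir[0], dir[1], sum);
    return sum;
}

// Winner-takes-all: the disparity with the lowest aggregated cost.
int bestDisparity(const CostVolume& sum, int x, int y) {
    int best = 0;
    for (int d = 1; d < sum.numDisp; ++d)
        if (sum.at(x, y, d) < sum.at(x, y, best)) best = d;
    return best;
}
```

Because the eight sweeps share only the read-only cost volume and the final sum, each one is a natural candidate for its own thread, which matches the parallelization the abstract describes. In this sketch the whole volume sits in one in-memory array; the design in the paper instead had to stage such arrays between external DRAM and on-chip RAM.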
