International Conference on Field-Programmable Technology

High-level synthesis of dynamic data structures: A case study using Vivado HLS

Abstract

High-level synthesis promises a significant shortening of the FPGA design cycle compared with design entry using register transfer level (RTL) languages. Recent evaluations report that C-to-RTL flows can produce results with a quality close to that of hand-crafted designs [1]. However, algorithms that use dynamic, pointer-based data structures, which are common in software, remain difficult to implement well. In this paper, we describe a comparative case study using Xilinx Vivado HLS as an exemplary state-of-the-art high-level synthesis tool. Our test cases are two alternative algorithms for the same compute-intensive machine learning technique (clustering) with significantly different computational properties. We compare a data-flow centric implementation with a recursive tree traversal implementation that incorporates complex data-dependent control flow and makes use of pointer-linked data structures and dynamic memory allocation. The outcome of this case study is twofold. For the first test case, we confirm similar performance between the hand-written and the automatically generated RTL designs. The second test case reveals a degradation in latency by a factor greater than 30× if the source code is not altered prior to high-level synthesis. We identify the reasons for this shortcoming and present code transformations that narrow the performance gap to a factor of four. We generalise our source-to-source transformations; automating them motivates research directions for improving high-level synthesis of dynamic data structures in the future.
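
The abstract does not list the individual code transformations, so the following is only an illustrative sketch of one rewrite commonly applied to this kind of code before high-level synthesis, not the authors' method. It assumes a binary tree and replaces heap allocation with a fixed-size node pool, pointers with array indices, and recursion with an explicit bounded stack. All names and sizes (`TreePool`, `tree_sum`, `MAX_NODES`, `MAX_DEPTH`) are hypothetical.

```cpp
#include <cassert>

// Hypothetical capacities, chosen only for illustration.
constexpr int MAX_NODES = 64;  // capacity of the static node pool
constexpr int MAX_DEPTH = 32;  // capacity of the explicit traversal stack
constexpr int NIL = -1;        // index value standing in for a null pointer

// Tree node whose children are pool indices instead of pointers.
struct Node {
    int value;
    int left;
    int right;
};

// Fixed-size pool standing in for dynamic memory allocation.
struct TreePool {
    Node nodes[MAX_NODES];
    int used = 0;

    // Bump allocation: returns the new node's index, or NIL if the pool is full.
    int alloc(int value) {
        if (used >= MAX_NODES) return NIL;
        nodes[used] = Node{value, NIL, NIL};
        return used++;
    }
};

// Depth-first sum of all node values without recursion.
// Returns false if the explicit stack would overflow.
bool tree_sum(const TreePool &pool, int root, int &sum) {
    int stack[MAX_DEPTH];
    int sp = 0;
    sum = 0;
    if (root == NIL) return true;
    // Precondition check, useful in software simulation.
    assert(root >= 0 && root < pool.used);
    stack[sp++] = root;
    while (sp > 0) {
        const Node &n = pool.nodes[stack[--sp]];
        sum += n.value;
        // Push right first so that the left subtree is visited first.
        const int children[2] = {n.right, n.left};
        for (int c : children) {
            if (c == NIL) continue;
            if (sp >= MAX_DEPTH) return false;
            stack[sp++] = c;
        }
    }
    return true;
}
```

A caller would build the tree by linking indices returned from `alloc` (for example `pool.nodes[root].left = pool.alloc(2)`) and then call `tree_sum(pool, root, sum)`. The fixed bounds make the memory footprint and the worst-case stack depth known at compile time, which is the property such rewrites aim for; the sketch makes no claim about the latency figures reported in the abstract.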