TTA-SIMD Soft Core Processors

Abstract

Soft processors are an important tool in the Field Programmable Gate Array (FPGA) designer's toolkit, and Single Instruction Multiple Data (SIMD) organizations of such processors are an efficient means of exploiting the parallelism of FPGAs. However, state-of-the-art SIMD soft processors are hindered by the additional logic complexity that results from dynamic features. By minimizing such constructs, it is possible to design soft processors that are efficient yet still flexible enough to operate within an application domain. To this end, we propose a family of instruction-set-programmable, multi-issue, wide-SIMD soft cores. The template is based on a highly static Transport Triggered Architecture (TTA) and a design-time-customizable shuffle unit, which together minimize inefficient dynamic features while keeping the cores compiler programmable. The cores are evaluated on the PYNQ-Z1 board against the ARM A9 hard processor system with NEON vector extensions. The proposed cores achieve up to a 2.4x performance improvement over the ARM processor and fit SIMD units up to 1024 bits wide onto the relatively small FPGA while still operating above 100 MHz. The scalability of TTA enables state-of-the-art vector widths. The multicore scalability of the template is preliminarily tested with a 14-core design on an XCZU9EG FPGA, customized for real-time convolutional neural network inference.
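The abstract names the Transport Triggered Architecture without describing its programming model. As a rough illustration only, and not the authors' design, the sketch below models the general TTA idea that a program consists of data moves between ports, and that an operation starts as a side effect of writing a function unit's trigger port. The class `VectorAddUnit`, the port names, the `run` helper, and the 32-bit element width are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class VectorAddUnit:
    """Hypothetical SIMD add unit with one operand port and one trigger port."""
    lanes: int
    operand: list = field(default_factory=list)
    result: list = field(default_factory=list)

    def write_operand(self, values):
        # Writing the operand port only latches data; nothing executes yet.
        assert len(values) == self.lanes
        self.operand = list(values)

    def write_trigger(self, values):
        # Writing the trigger port starts the add on all lanes at once.
        assert len(values) == self.lanes
        self.result = [a + b for a, b in zip(self.operand, values)]


def run(moves, registers, unit):
    """Execute (source, destination) moves; the moves are the entire program."""
    for src, dst in moves:
        value = unit.result if src == "add.result" else registers[src]
        if dst == "add.operand":
            unit.write_operand(value)
        elif dst == "add.trigger":
            unit.write_trigger(value)
        else:
            registers[dst] = list(value)
    return registers


# Assumption: 32-bit elements, so a 1024-bit vector holds 32 lanes.
LANES = 1024 // 32
regs = {"v0": list(range(LANES)), "v1": [10] * LANES}

# Three moves perform v2 = v0 + v1; no explicit "add" instruction is issued.
program = [
    ("v0", "add.operand"),
    ("v1", "add.trigger"),
    ("add.result", "v2"),
]
run(program, regs, VectorAddUnit(LANES))
```

Because the schedule of moves is fixed at compile time, a model like this needs no dynamic issue or hazard logic, which is the kind of static organization the abstract credits for the cores' efficiency.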
