Journal: ACM Transactions on Design Automation of Electronic Systems

Synthesis of Networks of Custom Processing Elements for Real-Time Physical System Emulation



Abstract

Emulating a physical system in real time or faster has numerous applications in cyber-physical system design and deployment. For example, testing of a cyber-device's software (e.g., a medical ventilator) can be done via interaction with a real-time digital emulation of the target physical system (e.g., a human's respiratory system). Physical system emulation typically involves iteratively solving thousands of ordinary differential equations (ODEs) that model the physical system. We describe an approach that creates custom processing elements (PEs) specialized to the ODEs of a particular model while maintaining some programmability, targeting implementation on field-programmable gate arrays (FPGAs). We detail the PE micro-architecture and accompanying automated compilation and synthesis techniques. Furthermore, we describe our efforts to use a high-level synthesis approach that incorporates regularity extraction techniques as an alternative FPGA-based solution, and also describe an approach using graphics processing units (GPUs). We perform experiments with five models: a Weibel lung model, a Lutchen lung model, an atrial heart model, a neuron model, and a wave model; each model consists of several thousand ODEs and targets a Xilinx Virtex 6 FPGA. Results of the experiments show that the custom PE approach achieves 4X-9X speedups (average 6.7X) versus our previous general ODE-solver PE approach, and 7X-10X speedups (average 8.7X) versus high-level synthesis, while using approximately the same or fewer FPGA resources. Furthermore, the approach achieves speedups of 18X-32X (average 26X) versus an Nvidia GTX 460 GPU, average speedups of more than 100X compared to a six-core TI DSP processor or a four-core ARM processor, and 24X versus an Intel i7 quad-core processor running at 3.06 GHz. While an FPGA implementation costs about 3X-5X more than the non-FPGA approaches, a speedup/dollar analysis shows a 10X improvement versus the next best approach, with the trend of decreasing FPGA costs improving speedup/dollar in the future.
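To make "iteratively solving ODEs" concrete, here is a minimal sketch of a fixed-step solver loop. It is not the paper's method: the abstract does not name an integration scheme, so forward Euler is assumed here purely for illustration. The single-compartment lung equation R * dV/dt + V / C = P, the parameter values, and the names `euler_step`, `lung_derivs`, and `simulate` are all hypothetical. The models in the paper have several thousand ODEs rather than one.

```python
def euler_step(state, derivs, t, dt):
    """Advance every ODE in the system by one fixed time step (forward Euler)."""
    rates = derivs(t, state)
    return [x + dt * dx for x, dx in zip(state, rates)]


def lung_derivs(t, state, resistance=2.0, compliance=0.05, pressure=10.0):
    """Hypothetical single-compartment lung: R * dV/dt + V / C = P.

    The parameter values are illustrative only and are not taken from the paper.
    """
    volume = state[0]
    return [(pressure - volume / compliance) / resistance]


def simulate(derivs, state, dt, steps):
    """Apply the step function repeatedly; this loop is the iterative solve."""
    t = 0.0
    for _ in range(steps):
        state = euler_step(state, derivs, t, dt)
        t += dt
    return state


# 2000 steps of 1 ms cover 2 s, about 20 time constants (R * C = 0.1 s).
final_state = simulate(lung_derivs, [0.0], dt=0.001, steps=2000)
```

In this sketch the volume settles toward the steady state P * C. A real-time emulator must complete each such step, for every equation in the model, within `dt` of wall-clock time, which is the constraint the reported speedups address.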
