Conference: The 24th IEEE International Symposium on Field-Programmable Custom Computing Machines

Marathon: Statically-Scheduled Conflict-Free Routing on FPGA Overlay NoCs

Abstract

We can improve the performance of deflection-routed FPGA overlay networks-on-chip (NoCs) like Hoplite by as much as 10× (random traffic), at the expense of a modest extra storage cost, by combining static scheduling with packet switching in an efficient, hybrid manner. Deflection-routed bufferless NoCs such as Hoplite allow extremely lightweight packet-switched routers on FPGAs, but suffer from high packet latencies due to deflections under congestion. When the communication workload is known in advance, time-multiplexed routing can offer a faster alternative by eliminating deflections, but it requires expensive storage of routing decisions in context buffers in LUT RAMs. In this paper, we propose a hybrid Marathon NoC that combines the low packet latencies of deflection-free time-multiplexed routing with the low implementation cost of the context-free packet-switched Hoplite NoC. The Marathon NoC requires a deterministic routing function to be implemented in the switch, along with time-stamped packet injection in the PEs, to ensure deflection-free routing in the network. The network also needs a one-time offline static scheduling stage that determines the appropriate time to inject each packet so as to guarantee a conflict-free, deflection-free route on the shared network. For random traffic patterns, Marathon outperforms Hoplite by as much as 10× and time multiplexing by as much as 1.2× when considering total communication time at identical area costs. For other synthetic patterns, Marathon outperforms Hoplite in all cases except the local pattern and is within 2-5× of the best time-multiplexing performance at large system sizes. For communication workloads extracted from real-world sparse matrix-vector multiplication kernels, Marathon outperforms both Hoplite and time multiplexing by 1.3-2.8×.
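The offline scheduling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' scheduler: it assumes a greedy, first-come policy, an n × n unidirectional torus with X-then-Y (east, then south) deterministic routing, one hop per cycle, and one injection and one delivery per router per cycle. The names `route_hops` and `schedule` are hypothetical.

```python
from itertools import count

def route_hops(src, dst, n):
    """Output ports used by X-then-Y routing on an n x n unidirectional torus.

    Assumption: packets travel east until they reach the destination
    column, then south until they reach the destination row.
    """
    (x, y), (dx, dy) = src, dst
    hops = []
    while x != dx:
        hops.append((x, y, "E"))
        x = (x + 1) % n
    while y != dy:
        hops.append((x, y, "S"))
        y = (y + 1) % n
    return hops

def schedule(packets, n):
    """Greedily pick the earliest conflict-free injection cycle per packet.

    A slot is (cycle, x, y, port). A packet injected at cycle t occupies
    its source's injection port at t, its i-th output link at t + i, and
    the destination's exit port on arrival. No slot may be used twice.
    """
    busy = set()
    inject_times = []
    for src, dst in packets:
        hops = route_hops(src, dst, n)
        for t in count():  # try cycles 0, 1, 2, ... until one fits
            slots = [(t, src[0], src[1], "INJ")]
            slots += [(t + i, x, y, port) for i, (x, y, port) in enumerate(hops)]
            slots.append((t + len(hops), dst[0], dst[1], "EXIT"))
            if not any(s in busy for s in slots):
                busy.update(slots)
                inject_times.append(t)
                break
    return inject_times

# Hypothetical workload on a 4 x 4 torus: (source, destination) pairs.
packets = [((0, 0), (2, 0)), ((1, 0), (3, 0)), ((1, 0), (3, 0))]
print(schedule(packets, 4))
```

In this toy workload the third packet cannot be injected at cycle 0 (its source is already injecting the second packet) or at cycle 1 (the first packet occupies the east link of router (1, 0) in that cycle), so the sketch delays it to cycle 2. The resulting injection times play the role of the per-packet time stamps mentioned in the abstract.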
