Conference: IEEE/ACM International Symposium on Networks-on-Chip

Implementing Low-Diameter On-Chip Networks for Manycore Processors Using a Tiled Physical Design Methodology

Abstract

Manycore processors are now integrating up to 1000 simple cores into a single die, yet these processors still rely on high-diameter mesh on-chip networks (OCNs) without complex flow control or custom circuits for three reasons: (1) manycores require simple, low-area routers; (2) manycores usually use standard-cell-based design; and (3) manycores use a tiled physical design methodology. In this paper, we explore mesh and torus topologies with internal concentration and/or ruche channels that require low area overhead and can be implemented using a traditional standard-cell-based tiled physical design methodology. We use a combination of analytical and RTL modeling along with layout-level results for both hard macros and a 3×3 mm, 256-terminal OCN in a 14-nm technology for twelve topologies. Critically, the networks we study use a tiled physical design methodology, meaning they: (1) tile a homogeneous hard macro across the chip; (2) implement chip top-level routing between hard macros via short wires to neighboring macros; and (3) use timing closure for the hard macro to quickly close timing at the chip top-level. Our results suggest that a concentration factor of four and a ruche factor of two in a 2D-mesh topology can reduce latency by over 2× at similar area and bisection bandwidth for both small and large messages compared to a 2D-mesh baseline.
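
To give a rough feel for why concentration and ruche channels lower network diameter, the sketch below counts average router-to-router hops in a square 2D mesh. It is not the paper's analytical model and does not reproduce its latency, area, or bandwidth results. Everything in it is an assumption made for illustration: the function names `hops_1d` and `avg_hops`, uniform random traffic, dimension-ordered minimal routing that never overshoots the destination, a concentration factor `c` meaning `c` terminals share one router, and a ruche factor `r` meaning each router has an extra channel to the router `r` positions away in each dimension. Torus wraparound links are not modeled.

```python
from math import isqrt


def hops_1d(distance: int, ruche: int) -> int:
    """Hops to cover `distance` routers in one dimension.

    Assumes ruche channels of length `ruche` are taken first and the
    remainder is covered with nearest-neighbor channels (no overshoot).
    With ruche == 1 this is just the plain mesh distance.
    """
    return distance // ruche + distance % ruche


def avg_hops(terminals: int, concentration: int = 1, ruche: int = 1) -> float:
    """Average router-to-router hop count over all source/destination pairs."""
    routers = terminals // concentration
    k = isqrt(routers)  # routers per mesh dimension
    if k * k * concentration != terminals:
        raise ValueError("terminals / concentration must be a perfect square")
    # Sum 1D hop counts over every ordered pair of positions in one dimension.
    total_1d = sum(
        hops_1d(abs(i - j), ruche) for i in range(k) for j in range(k)
    )
    # X and Y are independent under dimension-ordered routing, so the
    # 2D average is twice the 1D average.
    return 2 * total_1d / (k * k)


if __name__ == "__main__":
    baseline = avg_hops(256, concentration=1, ruche=1)  # 16x16 mesh
    improved = avg_hops(256, concentration=4, ruche=2)  # 8x8 mesh, ruche 2
    print(f"baseline mesh:      {baseline:.3f} hops on average")
    print(f"c=4, ruche=2 mesh:  {improved:.3f} hops on average")
```

Hop count is only one component of message latency; router pipeline depth, channel delay, and serialization also contribute, so the ratio printed here should not be read as the latency reduction reported in the abstract.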