
Crossbar NoCs Are Scalable Beyond 100 Nodes



Abstract

We describe the design and layout of a radix-128 crossbar in 90 nm CMOS. The data path is 32 bits wide and runs at 750 MHz using a three-stage pipeline, while fitting in a silicon area as small as 6.6 $\mathrm{mm}^{2}$ by filling it at the 90% level. The control path occupies 7 $\mathrm{mm}^{2}$ next to the data path by filling it at the 35% level, and reconfigures the data path once every three clock cycles. Next, we arrange 128 “user tiles” of 1 $\mathrm{mm}^{2}$ each around the crossbar, forming a 150 $\mathrm{mm}^{2}$ die, and we connect all tiles to the crossbar via global links running on top of the tiles. Including the overhead of repeaters and flip-flops on global links, the area cost of the crossbar is 11% of the die. Thus, we prove that crossbar networks-on-chip (NoCs) are small enough for radices far exceeding the few tens of ports that were believed to be the practical limit up to now, and reaching above 100 ports. We also attempt a first-order comparison between our crossbar and a model of a popular mesh NoC, and we find that our crossbar NoC increases performance when traffic is global and stressed, at the cost of worse performance when traffic is local and benign. Finally, we present an experimental cost analysis showing that crossbar area practically grows as $O(N^{2}W)$, as all wiring of the crossbar fits over its standard cells, while crossbar delay grows as $O(N\sqrt{W})$, as wire length increases with the perimeter of the crossbar.
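The area and scaling figures in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the scaling constant is calibrated from the single reported design point (N = 128, W = 32), which is an assumption made here and not the fitted model from the paper's cost analysis. The "link overhead" value is likewise inferred by subtracting the data and control path areas from 11% of the die.

```python
import math

# Figures quoted in the abstract (90 nm CMOS, radix-128, 32-bit data path).
N_REF = 128             # crossbar radix (ports)
W_REF = 32              # data path width in bits
DATAPATH_MM2 = 6.6      # data path area
CONTROL_MM2 = 7.0       # control path area
DIE_MM2 = 150.0         # total die area
CROSSBAR_SHARE = 0.11   # crossbar cost as a fraction of the die
CLOCK_MHZ = 750.0
RECONFIG_CYCLES = 3     # data path is reconfigured every three cycles


def area_budget():
    """Split the crossbar's 11% die share into core and inferred link overhead."""
    core = DATAPATH_MM2 + CONTROL_MM2
    total = CROSSBAR_SHARE * DIE_MM2
    # Inferred, not stated in the abstract: what remains for repeaters
    # and flip-flops on the global links.
    return {"core_mm2": core, "total_mm2": total, "link_overhead_mm2": total - core}


def scaled_datapath_area(n, w):
    """Data path area under area = k * N^2 * W.

    Assumption: k is fixed by the one reported design point.
    """
    k = DATAPATH_MM2 / (N_REF ** 2 * W_REF)
    return k * n ** 2 * w


def relative_delay(n, w):
    """Delay relative to the reported design under delay ~ N * sqrt(W)."""
    return (n * math.sqrt(w)) / (N_REF * math.sqrt(W_REF))


def reconfig_interval_ns():
    """Time between data path reconfigurations, in nanoseconds."""
    return RECONFIG_CYCLES * 1e3 / CLOCK_MHZ


if __name__ == "__main__":
    print(area_budget())
    print(scaled_datapath_area(256, 32), relative_delay(256, 32))
    print(reconfig_interval_ns())
```

Under these assumptions, doubling the radix to 256 at the same width quadruples the data path area and doubles the delay, while quadrupling the width at fixed radix quadruples the area and also doubles the delay.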
