Implementing Low-Diameter On-Chip Networks for Manycore Processors Using a Tiled Physical Design Methodology


Abstract

Manycore processors now integrate up to 1000 simple cores on a single die, yet these processors still rely on high-diameter mesh on-chip networks (OCNs) without complex flow control or custom circuits for three reasons: (1) manycores require simple, low-area routers; (2) manycores usually use standard-cell-based design; and (3) manycores use a tiled physical design methodology. In this paper, we explore mesh and torus topologies with internal concentration and/or ruche channels that require low area overhead and can be implemented using a traditional standard-cell-based tiled physical design methodology. We use a combination of analytical and RTL modeling along with layout-level results for both hard macros and a 3×3 mm, 256-terminal OCN in a 14-nm technology for twelve topologies. Critically, the networks we study use a tiled physical design methodology, meaning they: (1) tile a homogeneous hard macro across the chip; (2) implement chip top-level routing between hard macros via short wires to neighboring macros; and (3) use timing closure for the hard macro to quickly close timing at the chip top level. Our results suggest that a concentration factor of four and a ruche factor of two in a 2D-mesh topology can reduce latency by over 2× at similar area and bisection bandwidth for both small and large messages compared to a 2D-mesh baseline.
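
To make "low-diameter" concrete, the following minimal sketch estimates the worst-case router-to-router hop count of a square 2D mesh under a given concentration factor and ruche factor. It is an illustration under stated assumptions, not the paper's analytical model: the function name `mesh_diameter` is hypothetical, a concentration factor c is assumed to mean c terminals share one router, a ruche factor r is assumed to mean an extra channel to the router r positions away in each dimension, and routing is assumed to be greedy (ruche hops while the remaining distance is at least r, then single hops).

```python
import math


def mesh_diameter(terminals, concentration=1, ruche=0):
    """Estimate worst-case router hops in a square 2D mesh.

    Assumptions (not taken from the paper):
      - `concentration` terminals share one router;
      - `ruche` >= 2 adds a channel to the router `ruche` positions away
        in each dimension; 0 or 1 means no ruche channels;
      - routes are greedy: ruche hops first, then single hops.
    """
    if terminals % concentration != 0:
        raise ValueError("terminals must be divisible by concentration")
    routers = terminals // concentration
    side = math.isqrt(routers)
    if side * side != routers:
        raise ValueError("router count must be a perfect square")

    span = side - 1  # farthest distance within one dimension
    if ruche >= 2:
        hops_per_dim = span // ruche + span % ruche
    else:
        hops_per_dim = span
    return 2 * hops_per_dim  # corner to opposite corner


if __name__ == "__main__":
    for c, r in [(1, 0), (4, 0), (1, 2), (4, 2)]:
        print(f"c={c} r={r} -> {mesh_diameter(256, c, r)} hops")
```

Under these assumptions, a 256-terminal network drops from 30 hops in the plain mesh to 8 hops with a concentration factor of four and a ruche factor of two. Hop count is only one component of latency, so this figure is not the latency reduction reported in the abstract.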

Bibliographic details

Source: IEEE/ACM International Symposium on Networks-on-Chip, 2020, pp. 1-8 (8 pages)
Authors: Yanghui Ou; Shady Agwa; Christopher Batten
Original format: PDF

Similar literature

1. Physical design methodology for on-chip 63-Mb DRAM MPEG-2 encoding with a multimedia processor [J]. Rei Akiyama, Hidehiro Takata, Tadao Yamanaka. 電子情報通信学会技術研究報告. VLSI設計技術. VLSI Design Technologies. 2001, No. 467
2. Physical design methodology for on-chip 63-Mb DRAM MPEG-2 encoding with a multimedia processor [J]. Rei Akiyama, Hidehiro Takata, Tadao Yamanaka. 電子情報通信学会技術研究報告. 集積回路. Integrated Circuits and Devices. 2001, No. 470
3. Physical design methodology for on-chip 63-Mb DRAM MPEG-2 encoding with a multimedia processor [J]. Rei Akiyama, Hidehiro Takata, Tadao Yamanaka. 電子情報通信学会技術研究報告. フォ-ルトトレラントシステム. 2001, No. 476
4. Streaming Tiles: Flexible Implementation of Convolution Neural Networks Inference on Manycore Architectures [C]. Nesma M. Rezk, Madhura Purnaprajna, Zain Ul-Abdin. IEEE International Parallel and Distributed Processing Symposium Workshops. 2018
5. Inter/intra-chip optical network design for manycore processors [D]. Wu, Xiaowen. 2014
6. Design and Implementation of an On-Chip Low-Power and High-Flexibility System for Data Acquisition and Processing of an Inertial Measurement Unit [O]. Zhenyi Gao, Bin Zhou, Yang Li. 2020
7. Design and Implementation of an On-Chip Permutation Network for Multiprocessor System-On-Chip [O]. Phi-hung Pham, Junyoung Song, Jongsun Park. 2010
