Design and Performance Optimization of Asynchronous Networks-on-Chip

Abstract

As digital systems continue to grow in complexity, the design of conventional synchronous systems is facing unprecedented challenges. The number of transistors on individual chips is already in the multi-billion range, and a rapidly growing number of components are being integrated onto a single chip. As a consequence, modern digital designs are under strong time-to-market pressure, and there is a critical need for composable design approaches for large complex systems.

In the past two decades, networks-on-chip (NoCs) have been a highly active research area. In a NoC-based system, functional blocks are first designed individually and may run at different clock rates. These modules are then connected through a structured network for on-chip global communication. However, the rigidity of centrally clocked NoCs has created bottlenecks in system scalability, energy, and performance that cannot be easily solved with synchronous approaches. As a result, there has been significant recent interest in combining the notion of asynchrony with NoC designs. Since the NoC approach inherently separates the communication infrastructure, and its timing, from computational elements, it is a natural match for an asynchronous paradigm. Asynchronous NoCs therefore enable a modular and extensible system composition for an 'object-oriented' design style.

The thesis aims to significantly advance the state of the art and viability of asynchronous and globally-asynchronous locally-synchronous (GALS) networks-on-chip, to enable high-performance and low-energy systems. The proposed asynchronous NoCs are based almost entirely on standard cells, which eases their integration into industrial design flows. The contributions lie in three different directions.

First, practical acceleration techniques are proposed for optimizing system latency, in order to break through the latency bottleneck in the memory interfaces of many on-chip parallel processors. Novel asynchronous network protocols are proposed, along with concrete NoC designs. A new concept, called the 'monitoring network', is introduced. Monitoring networks are lightweight shadow networks used for fast-forwarding anticipated traffic information ahead of the actual packet traffic. The routers can therefore initiate and perform arbitration and channel allocation in advance. The technique is successfully applied to two topologies that belong to two different categories: a variant mesh-of-trees (MoT) structure and a 2D-mesh topology. Considerable and stable latency improvements are observed across a wide range of traffic patterns, along with moderate throughput gains.

Second, for the first time, a high-performance and low-power asynchronous NoC router is compared directly to a leading commercial synchronous counterpart in an advanced industrial technology. The asynchronous router design shows significant performance improvements, as well as area and power savings. The proposed asynchronous router integrates several advanced techniques, including a low-latency circular FIFO for buffer design and a novel end-to-end credit-based virtual channel (VC) flow control. In addition, a semi-automated design flow is created, which uses portions of a standard synchronous tool flow.

Finally, a high-performance multi-resource asynchronous arbiter design is developed. This small but important component can be used directly in existing asynchronous NoCs for performance optimization. In addition, this standalone design shows promise for opening up new NoC directions, as well as for general use in parallel systems. In the proposed arbiter design, the allocation of a resource to a client is divided into several steps. Multiple successive client-resource pairs can be selected rapidly in pipelined sequence, and the completion of the assignments can overlap in parallel.

In sum, the thesis provides a set of advanced design solutions for performance optimization of asynchronous and GALS networks-on-chip. These solutions span different levels, from network protocols down to router- and component-level optimizations, and can be applied directly to existing basic asynchronous NoC designs to provide a leap in performance.
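
The abstract mentions credit-based flow control without describing its mechanics. As a rough illustration only, the following Python sketch models the generic, textbook credit scheme between one sender and one receiver; it is not the thesis's end-to-end VC design, and the class name `CreditLink` and its methods are hypothetical names chosen for this example.

```python
from collections import deque


class CreditLink:
    """Toy model of one sender/receiver pair using credit-based flow control.

    Hypothetical illustration; names and structure are assumptions,
    not taken from the thesis.
    """

    def __init__(self, buffer_slots):
        # The sender starts with one credit per free receiver buffer slot.
        self.credits = buffer_slots
        self.receiver_buffer = deque()

    def send(self, flit):
        """Send a flit if a credit is available; return True on success."""
        if self.credits == 0:
            return False  # no free slot is guaranteed downstream, so stall
        self.credits -= 1
        self.receiver_buffer.append(flit)
        return True

    def drain(self):
        """Receiver consumes one flit and returns a credit to the sender."""
        if not self.receiver_buffer:
            return None
        flit = self.receiver_buffer.popleft()
        self.credits += 1
        return flit


link = CreditLink(buffer_slots=2)
results = [link.send(f) for f in ("a", "b", "c")]  # third send stalls
drained = link.drain()                              # frees one slot
retry = link.send("c")                              # succeeds after the credit returns
```

The point of the scheme is that the sender never transmits unless a downstream buffer slot is known to be free, so no flit is dropped and no separate back-pressure signal is needed.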

Bibliographic record

Author: Jiang, Weiwei
Author affiliation: Columbia University
Degree-granting institution: Columbia University
Subject: Computer science
Degree: Ph.D.
Year: 2018
Pages: 214
Original format: PDF
Language of text: English

