IEEE Transactions on Parallel and Distributed Systems

Codesign of NoC and Cache Organization for Reducing Access Latency in Chip Multiprocessors



Abstract

Reducing data access latency is vital to achieving performance improvements in computing. For chip multiprocessors (CMPs), data access latency depends on the organization of the memory hierarchy, the on-chip interconnect, and the running workload. Several network-on-chip (NoC) designs exploit communication locality to reduce communication latency by configuring special fast paths or circuits on which communication is faster than on the rest of the NoC. However, communication patterns are directly affected by the cache organization, and many cache organizations are designed in isolation from the underlying NoC or assume a simple NoC design, thus possibly missing optimization opportunities. In this work, we take a codesign approach to the NoC and the cache organization. First, we propose a hybrid circuit/packet-switched NoC that exploits communication locality through periodic configuration of the most beneficial circuits. Second, we design a Unique Private (UP) caching scheme targeting the class of interconnects that exploit communication locality to improve communication latency. The Unique Private cache stores the data that are mostly accessed by each processor core in the core's locally accessible cache bank, while leveraging dedicated high-speed circuits in the interconnect to provide remote cores with fast access to shared data. Simulations of a suite of scientific and commercial workloads show that our proposed design achieves speedups of 15.2 and 14 percent on a 16-core and a 64-core CMP, respectively, over the state-of-the-art NoC-cache codesigned system that also exploits communication locality in multithreaded applications.
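The periodic circuit configuration mentioned in the abstract can be illustrated with a minimal sketch: at the end of each epoch, rank core pairs by observed message traffic and dedicate the limited circuit budget to the heaviest pairs. The function name, the greedy top-k policy, and the traffic numbers below are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of a periodic circuit-selection step: every epoch,
# pick the core pairs with the heaviest traffic and give them fast circuits.
from collections import Counter

def select_circuits(traffic, max_circuits):
    """Return up to max_circuits (src, dst) pairs with the highest message
    counts; ties are broken by pair order for determinism."""
    ranked = sorted(traffic.items(), key=lambda kv: (-kv[1], kv[0]))
    return [pair for pair, _count in ranked[:max_circuits]]

# One epoch of observed messages between cores: (src, dst) -> count.
epoch_traffic = Counter({
    (0, 3): 120,  # e.g., core 0 frequently accessing core 3's cache bank
    (1, 3): 95,
    (2, 7): 40,
    (5, 6): 12,
})

circuits = select_circuits(epoch_traffic, max_circuits=2)
print(circuits)  # -> [(0, 3), (1, 3)]
```

In a hybrid circuit/packet-switched NoC, traffic on the selected pairs would traverse the preconfigured circuits, while all other traffic falls back to ordinary packet switching.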
