L2-Cache Hierarchical Organizations for Multi-core Architectures

Abstract

The market is moving toward multiple cores on the same chip (Chip Multiprocessors, CMPs) with a multi-sliced L2 cache in which each slice is shared by 2 cores. CMPs with 8 cores are already available, and future CMPs will have more than 8 cores. Typical CMP implementations share the L2 cache among the processors, with 2 cores sharing the same L2. We investigate the behavior of the pair L2 sharing × L2 cache size. To do so, we construct models of two different CMP organizations: (i) tiles with private L1 and private L2, interconnected through a router; (ii) tiles with private L1 and an L2 shared among processors. Organization (ii) is evaluated with different numbers of cores (2, 4) sharing the same L2 slice, and with different sizes of the shared L2 slice (1 MB, 2 MB, and 4 MB). With a total of 32 cores, the proposed configurations of organization (ii) are evaluated through full-system simulation under SPLASH-2 benchmarks. By applying both techniques, the results show that execution time improves by about 18.9% for Ocean, 88.8% for Raytrace, and 31.8% for Volrend.
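As a rough illustration of the design space described in the abstract, the following Python sketch enumerates the sharing-degree × slice-size grid for the shared-L2 organization with 32 cores. The function name `shared_l2_configs` and the derived quantities (number of slices, aggregate L2 capacity, capacity per core) are assumptions introduced here for illustration; the abstract does not state that every combination was simulated, nor does it report these derived values.

```python
from itertools import product

TOTAL_CORES = 32            # total core count given in the abstract
SHARING_DEGREES = (2, 4)    # cores sharing one L2 slice
SLICE_SIZES_MB = (1, 2, 4)  # size of each shared L2 slice, in MB


def shared_l2_configs(total_cores=TOTAL_CORES,
                      sharing_degrees=SHARING_DEGREES,
                      slice_sizes_mb=SLICE_SIZES_MB):
    """Enumerate the sharing-degree x slice-size grid (hypothetical helper).

    Assumes the full cross product is of interest and that the cores
    divide evenly into groups of `cores_per_slice`.
    """
    configs = []
    for sharers, size_mb in product(sharing_degrees, slice_sizes_mb):
        num_slices = total_cores // sharers  # one L2 slice per group of cores
        configs.append({
            "cores_per_slice": sharers,
            "slice_mb": size_mb,
            "num_slices": num_slices,
            "total_l2_mb": num_slices * size_mb,   # aggregate on-chip L2
            "l2_mb_per_core": size_mb / sharers,   # nominal share per core
        })
    return configs


if __name__ == "__main__":
    for cfg in shared_l2_configs():
        print(cfg)
```

Under these assumptions, the grid contains six configurations. Doubling the sharing degree at a fixed slice size halves both the slice count and the aggregate L2 capacity, which is one reason sharing degree and slice size are worth studying together.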

Bibliographic details

Source: Frontiers of High Performance Computing and Networking - ISPA 2006 Workshops; Lecture Notes in Computer Science 4331 | 2006 | pp. 74-83 | 10 pages
Conference location: Sorrento (IT)
Author: Mario Donato Marino
Affiliation: Computing Engineering Department, Polytechnic School, University of Sao Paulo
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology

