Cache Line Aware Algorithm Design for Cache-Coherent Architectures

Sabela Ramos; Torsten Hoefler




Abstract

The increase in the number of cores per processor and the complexity of memory hierarchies make cache coherence key for programmability of current shared memory systems. However, ignoring its detailed architectural characteristics can harm performance significantly. In order to assist performance-centric programming, we propose a methodology to allow semi-automatic performance tuning with the systematic translation from an algorithm to an analytic performance model for cache line transfers. For this, we design a simple interface for cache line aware optimization, a translation methodology, and a full performance model that exposes the block-based design of caches to middleware designers. We investigate two different architectures to show the applicability of our techniques and methods: the many-core accelerator Intel Xeon Phi and a multi-core processor with a NUMA configuration (Intel Sandy Bridge). We use mathematical optimization techniques to tune synchronization algorithms to the microarchitectures, identifying three techniques to design and optimize data transfers in our model: single-use, single-step broadcast, and private cache lines.
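As a rough illustration of what "cache line aware" means in practice, the sketch below checks whether per-thread flags placed at a fixed stride end up on private cache lines or share one. It is not the paper's interface or model: the 64-byte line size, the flag layout, and the helper names `sharers_per_line` and `has_private_lines` are all assumptions made for this example.

```python
from collections import defaultdict

LINE_SIZE = 64  # assumed cache line size in bytes; not taken from the paper


def sharers_per_line(num_threads, stride, line_size=LINE_SIZE):
    """Map each cache line index to the threads whose flag starts in it.

    Hypothetical layout: thread t's flag lives at byte offset t * stride.
    """
    lines = defaultdict(list)
    for t in range(num_threads):
        lines[(t * stride) // line_size].append(t)
    return dict(lines)


def has_private_lines(num_threads, stride, line_size=LINE_SIZE):
    """Return True if no cache line holds the flags of more than one thread."""
    layout = sharers_per_line(num_threads, stride, line_size)
    return all(len(threads) == 1 for threads in layout.values())


# 8-byte flags stored back to back: all eight land in one assumed 64-byte line.
packed = sharers_per_line(8, stride=8)
# One flag per assumed 64-byte line: each thread gets a line of its own.
padded = sharers_per_line(8, stride=64)

print(len(packed), has_private_lines(8, 8))
print(len(padded), has_private_lines(8, 64))
```

In the packed layout, a write by any one thread invalidates the line that every other thread is reading, so padding to one flag per line trades memory for fewer coherence transfers.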


Bibliographic record

Source
IEEE Transactions on Parallel and Distributed Systems, 2016, No. 10, pp. 2824-2837 (14 pages)
Authors
Sabela Ramos; Torsten Hoefler
Affiliation

Scalable Parallel Computing Lab, Computer Science Department, ETH Zürich, Switzerland

Original format: PDF
Language: English
Keywords
Coherence; Algorithm design and analysis; Sockets; Bridges; Protocols; Mathematical model; Analytical models;


