IEEE Computer Architecture Letters

Inter-Core Locality Aware Memory Scheduling



Abstract

Graphics Processing Units (GPUs) run thousands of parallel threads and achieve high Memory-Level Parallelism (MLP). To support this parallelism, a structure called a Miss-Status Holding Register (MSHR) tracks multiple in-flight miss requests. When multiple cores send requests for the same cache line, the requests are merged into a single last-level cache MSHR entry, and only one memory request is sent to the Dynamic Random-Access Memory (DRAM). We call this inter-core locality. Its main cause is multiple cores accessing shared read-only data within the same cache line. By prioritizing memory requests with high inter-core locality, more threads resume execution sooner. In this paper, we analyze the causes of inter-core locality and show that requests exhibiting it are more critical to performance. We propose a GPU DRAM scheduler that exploits inter-core locality information detected at the last-level cache MSHRs. For benchmarks with high inter-core locality, this yields an average 28 percent reduction in memory request latency and an 11 percent improvement in performance.
