Conference: IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

Partially Separated Page Tables for Efficient Operating System Assisted Hierarchical Memory Management on Heterogeneous Architectures


Abstract

Heterogeneous architectures, in which a multicore processor is accompanied by a large number of simpler but more power-efficient CPU cores optimized for parallel workloads, have recently received considerable attention. At present, these co-processors, such as the Intel Xeon Phi product family, come with limited on-board memory, which requires manually partitioning computational problems into pieces that fit into the device's RAM and efficiently overlapping computation with communication. In this paper we propose an application-transparent, operating system (OS) assisted hierarchical memory management system, in which the OS orchestrates data movement between the host and the device and updates the process's virtual address space accordingly. We identify the main scalability issues of frequent address space changes, such as the growing cost of TLB invalidations as the number of CPU cores increases, and propose partially separated page tables with address-range CPU masks to overcome the problem. With partially separated page tables, each core maintains its own set of mappings of the computation area, which enables the OS to perform address space updates in a scalable manner and to involve a particular CPU core in TLB invalidation only when absolutely necessary. Furthermore, we propose dedicated data movement cores to efficiently overlap computation and communication. We provide experimental results on stencil computation, a common HPC kernel, and show that OS-assisted memory management has the potential for scalable, transparent data movement.
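The idea of involving a core in TLB invalidation only when it actually holds a mapping can be illustrated with a small toy model. The sketch below is not the authors' implementation: the `ComputationArea` class, its `touch` and `evict` methods, the `shootdown_targets_shared` helper, and the core count of 56 are all hypothetical names and values chosen for illustration. It only assumes that the OS records, per page of the computation area, which cores have mapped it.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ComputationArea:
    """Toy model (hypothetical) of an address range with per-core mappings."""
    num_cores: int
    # page number -> set of core ids that currently hold a mapping for it
    mapped_by: Dict[int, Set[int]] = field(default_factory=dict)

    def touch(self, core: int, page: int) -> None:
        """Record that `core` mapped `page` into its own set of mappings."""
        self.mapped_by.setdefault(page, set()).add(core)

    def evict(self, page: int) -> Set[int]:
        """Unmap `page` and return the cores that need a TLB invalidation."""
        return self.mapped_by.pop(page, set())


def shootdown_targets_shared(num_cores: int) -> Set[int]:
    """Baseline assumption: with one shared page table, any core may have
    cached the entry, so every core is interrupted."""
    return set(range(num_cores))


area = ComputationArea(num_cores=56)  # core count is an arbitrary example value
area.touch(core=3, page=0x10)
area.touch(core=7, page=0x10)
area.touch(core=3, page=0x11)

targets = area.evict(0x10)
print(sorted(targets))                    # cores notified with per-core tracking
print(len(shootdown_targets_shared(56)))  # cores notified with a shared table
```

In this model the eviction of a page touches only the cores recorded for it, whereas the shared-table baseline must notify all cores, which is the scalability gap the abstract describes.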
