Conference: IEEE International Conference on Cluster Computing

Exploring Data Migration for Future Deep-Memory Many-Core Systems

Abstract

Upcoming high-performance computing (HPC) platforms will have more complex memory hierarchies, with high-bandwidth on-package memory and, in the future, non-volatile memory as well. How to use such deep memory hierarchies effectively remains an open research question. In this paper we evaluate the performance implications of a scheme based on a software-managed scratchpad, in which coarse-grained memory-copy operations migrate application data structures between levels of the memory hierarchy. We expect that such a scheme can, under specific circumstances, outperform a hardware-managed cache while requiring far less effort than a scheme managed entirely by the application programmers. Because suitable hardware is not yet generally available, we propose and benchmark several existing hardware configurations that can serve as approximations, including non-uniform memory access (NUMA) systems and memory on accelerators. We then evaluate data migration mechanisms currently available on Linux systems, such as move_pages and memcpy. We also design a best-case-scenario HPC benchmark to explore how data migration can improve the memory locality and parallelism of applications. We find that NUMA systems can be a reasonable approximation platform, especially when auxiliary load mechanisms are employed. Memory migration mechanisms inside the Linux kernel turn out to lag significantly behind a plain user-space memory copy, even after we level the playing field as much as possible. Our dedicated application benchmark demonstrates a significant performance benefit from memory migration, approaching the measured difference in memory bandwidth, provided that the ratio of worker threads to migration threads is chosen well.
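To make the scratchpad idea concrete, the following is a minimal C sketch, not the authors' implementation. It stages data from a "slow" buffer through a small "fast" scratchpad using coarse-grained memcpy calls and computes only on the scratchpad copy. The function name staged_sum and its parameters are hypothetical. Both buffers are ordinary memory here, whereas on a real deep-memory system the scratchpad would be allocated in the faster level. The sketch is single-threaded, so it does not show the overlap of worker and migration threads that the abstract discusses, and it does not use move_pages.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: sum n doubles that live in "slow" memory by staging
 * them through a caller-provided "fast" scratchpad, one chunk at a time.
 * Precondition: scratch holds scratch_elems doubles and scratch_elems > 0. */
double staged_sum(const double *slow, size_t n,
                  double *scratch, size_t scratch_elems)
{
    double total = 0.0;

    assert(scratch_elems > 0);

    for (size_t off = 0; off < n; off += scratch_elems) {
        /* The last chunk may be shorter than the scratchpad. */
        size_t len = (n - off < scratch_elems) ? (n - off) : scratch_elems;

        /* Migration step: one coarse-grained copy into the scratchpad. */
        memcpy(scratch, slow + off, len * sizeof *scratch);

        /* Compute step: the worker touches only the scratchpad copy. */
        for (size_t i = 0; i < len; i++)
            total += scratch[i];
    }
    return total;
}
```

A caller would pass the full data set as slow and a much smaller buffer as scratch. In a multithreaded variant, a dedicated migration thread could fill a second scratchpad while the worker computes on the first, which is where the ratio of worker threads to migration threads would come into play.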
