Conference: IEEE International Conference on Parallel and Distributed Systems

Evaluation of Flash-Based Out-of-Core Stencil Computation Algorithms for SSD-Equipped Clusters

Abstract

This paper proposes a new scheme for large-scale stencil computations whose data size exceeds the total main memory of the nodes in a cluster. The scheme uses flash SSDs distributed across the cluster nodes as an extension of main memory, together with a locality-aware algorithm. Three algorithms, each with a different hierarchical blocking scheme for the three memory tiers (flash SSD, DRAM, and cache), are proposed and evaluated on different platforms and flash devices. They exploit highly parallel asynchronous input/output on flash SSDs and select appropriate blocking parameters with an auto-tuning system named Blk-Tune. They also overcome the performance degradation caused by non-uniform memory access (NUMA) architectures. The algorithms optimized for a single node are extended to multiple nodes and evaluated on a cluster with traditional SATA SSDs as well as state-of-the-art flash devices, such as low-power, cost-effective M.2 NVMe flash SSDs. With this scheme and distributed flash devices in a cluster, large-scale stencil problems can be solved with a limited number of nodes and a moderate amount of main memory.
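
To make the idea of out-of-core blocking concrete, the following is a minimal sketch and not the paper's algorithms. It assumes a 1D three-point averaging stencil, a plain binary file standing in for the flash SSD tier, and synchronous reads; the function names and the `block` parameter are hypothetical. Each block is read from the file with a one-point halo on either side, updated in memory, and written back, so only one block resides in memory at a time.

```python
import os
import tempfile
from array import array

ITEM = array("d").itemsize  # bytes per stored double


def stencil_in_core(u):
    """Reference: one 3-point averaging sweep; boundary values stay fixed."""
    out = list(u)
    for i in range(1, len(u) - 1):
        out[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return out


def stencil_out_of_core(src_path, dst_path, n, block):
    """One sweep over n doubles in src_path, holding one block (plus halo) in memory."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for start in range(0, n, block):
            stop = min(start + block, n)
            lo = max(start - 1, 0)   # left halo point, if one exists
            hi = min(stop + 1, n)    # right halo point, if one exists
            src.seek(lo * ITEM)
            buf = array("d")
            buf.frombytes(src.read((hi - lo) * ITEM))
            out = array("d")
            for i in range(start, stop):
                j = i - lo           # index of point i inside the buffer
                if i == 0 or i == n - 1:
                    out.append(buf[j])  # fixed boundary value
                else:
                    out.append((buf[j - 1] + buf[j] + buf[j + 1]) / 3.0)
            dst.write(out.tobytes())


def run_demo(n=1000, block=64):
    """Run both versions on the same synthetic data and return both results."""
    data = array("d", [float((i * 7) % 13) for i in range(n)])
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "in.bin")
        dst = os.path.join(tmp, "out.bin")
        with open(src, "wb") as f:
            f.write(data.tobytes())
        stencil_out_of_core(src, dst, n, block)
        result = array("d")
        with open(dst, "rb") as f:
            result.frombytes(f.read())
    return list(result), stencil_in_core(data)


if __name__ == "__main__":
    blocked, reference = run_demo()
    print("results match:", blocked == reference)
```

In this sketch `block` plays the role of a single blocking parameter. A hierarchical scheme of the kind the abstract describes would add a further, smaller block size for the cache tier and would overlap reads with computation through asynchronous I/O, neither of which is shown here.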
