International Journal of Parallel Programming

Enabling Near-Data Accelerators Adoption by Through Investigation of Datapath Solutions



Abstract

Processing-in-Memory (PIM), or Near-Data Accelerators (NDAs), have recently been revisited to mitigate the memory-wall and power-wall problems, supported mainly by the maturity of 3D-stacking manufacturing technology and by the growing demand for bandwidth and parallel data access in emerging processing-hungry applications. However, because these designs are naturally decoupled from the main processor, at least three open issues must be tackled before PIM can be adopted: how to offload instructions from the host to the NDAs, since many units can be placed throughout the memory; how to keep the caches coherent between the host and the NDAs; and how to handle the internal communication between different NDA units, considering that NDAs can communicate with each other to better exploit the design. In this work, we present an efficient design that solves these challenges. Based on hybrid host-accelerator code, which provides fine-grained control, our design allows transparent offloading of NDA instructions directly from a host processor. Moreover, our design proposes a data coherence protocol that includes an inclusion-policy-agnostic cache coherence mechanism to share data transparently between the host processor and the NDA units, and a protocol that allows communication between different NDA units. The proposed mechanism fully exploits the evaluated state-of-the-art design, achieving a speedup of up to 14.6× over an AVX architecture on the PolyBench suite, while spending, on average, 82% of the total time on processing and only 18% on the cache coherence and communication protocols.

