Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

GraphH: A Processing-in-Memory Architecture for Large-Scale Graph Processing

Abstract

Large-scale graph processing requires high-bandwidth data access. However, as graph computing continues to scale, achieving high bandwidth on generic computing architectures becomes increasingly challenging. The primary reasons include: the random access pattern causing local bandwidth degradation, poor locality leading to unpredictable global data access, heavy conflicts on updating the same vertex, and unbalanced workloads across processing units. Processing-in-memory (PIM) has been explored as a promising solution for providing high bandwidth, yet graph processing on PIM devices leaves open questions: 1) how to design hardware specializations and the interconnection scheme to fully utilize the bandwidth of PIM devices and ensure locality and 2) how to allocate data and schedule the processing flow to avoid conflicts and balance workloads. In this paper, we propose GraphH, a PIM architecture for graph processing on a hybrid memory cube array, to tackle all four problems mentioned above. From the architecture perspective, we integrate SRAM-based on-chip vertex buffers to eliminate local bandwidth degradation. We also introduce a reconfigurable double-mesh connection to provide high global bandwidth. From the algorithm perspective, partitioning and scheduling methods such as index-mapping interval-block and round interval pair are introduced to GraphH, so that workloads are balanced and conflicts are avoided. Two optimization methods are further introduced to reduce synchronization overhead and reuse on-chip data. Experimental results on graphs with billions of edges demonstrate that GraphH outperforms DDR-based graph processing systems by up to two orders of magnitude and achieves a 5.12x speedup over the previous PIM design.
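The partitioning and scheduling ideas named in the abstract (interval-block partitioning and round interval pairs) can be illustrated with a small sketch. The function names, the equal-size interval split, and the round-robin schedule below are assumptions for illustration only, not GraphH's actual implementation as described in the paper.

```python
# Hypothetical sketch: split vertices into p intervals, bucket edges into
# p x p blocks keyed by (source interval, destination interval), then
# schedule p rounds so that no two units write the same destination
# interval in the same round (avoiding update conflicts).

def partition(num_vertices, edges, p):
    """Bucket directed edges into p x p interval blocks."""
    size = (num_vertices + p - 1) // p  # vertices per interval
    blocks = {(i, j): [] for i in range(p) for j in range(p)}
    for src, dst in edges:
        blocks[(src // size, dst // size)].append((src, dst))
    return blocks

def round_interval_pairs(p):
    """In round r, unit u processes block (u, (u + r) % p).

    Within each round the destination intervals are a permutation of
    0..p-1, so no two units update the same vertex interval at once.
    """
    for r in range(p):
        yield [(u, (u + r) % p) for u in range(p)]
```

The round-robin schedule is one simple way to realize conflict-free rounds; any schedule whose destination intervals are distinct within a round would serve the same purpose.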

Bibliographic record

  • Source
  • Author affiliations

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China|Tsinghua Univ Beijing Natl Res Ctr Informat Sci & Technol Beijing 100084 Peoples R China;

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China|Tsinghua Univ Beijing Natl Res Ctr Informat Sci & Technol Beijing 100084 Peoples R China;

    Univ Calif Los Angeles Dept Comp Sci Los Angeles CA 90095 USA;

    Univ Calif San Diego Jacobs Sch Engn Comp Sci & Engn Dept La Jolla CA 92093 USA;

    Peking Univ Sch EECS Ctr Energy Efficient Comp & Applicat Beijing 100871 Peoples R China;

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China|Tsinghua Univ Beijing Natl Res Ctr Informat Sci & Technol Beijing 100084 Peoples R China;

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China|Tsinghua Univ Beijing Natl Res Ctr Informat Sci & Technol Beijing 100084 Peoples R China;

    Univ Calif Santa Barbara Dept Elect & Comp Engn Santa Barbara CA 93106 USA;

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China|Tsinghua Univ Beijing Natl Res Ctr Informat Sci & Technol Beijing 100084 Peoples R China;

  • Indexing information
  • Original format: PDF
  • Language: English (eng)
  • CLC classification
  • Keywords

    Hybrid memory cube (HMC); large-scale graph processing; memory hierarchy; on-chip networks;


