CGraph: A Distributed Storage and Processing System for Concurrent Iterative Graph Analysis Jobs


Abstract

Distributed graph processing platforms often need to run many Concurrent iterative Graph Processing (CGP) jobs for different purposes. However, existing distributed systems incur a high ratio of data access cost to computation for CGP jobs, which results in low throughput. We observe strong spatial and temporal correlations among the data accesses issued by different CGP jobs, because these concurrently running jobs usually need to repeatedly traverse the shared graph structure for the iterative processing of each vertex. Based on this observation, this article proposes CGraph, a distributed storage and processing system that lets CGP jobs efficiently handle the underlying static or evolving graph with high throughput. It uses a data-centric load-trigger-pushing model, together with several optimizations, to enable the CGP jobs to efficiently share the graph structure data in cache/memory, and the accesses to it, by fully exploiting these correlations; the graph structure data is decoupled from the vertex state associated with each job. By reducing the average ratio of data access cost to computation, CGraph delivers much higher throughput for CGP jobs. Experimental results show that CGraph improves the throughput of CGP jobs by up to 3.47x compared with existing solutions on distributed platforms.
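
The sharing idea in the abstract can be illustrated with a small sketch. This is not CGraph's implementation and the abstract does not give the model's details; it is one reading of "load, trigger, push" under simplifying assumptions (a single process, an in-memory list standing in for partitioned storage, and hop-count propagation as the per-job computation). All names here (`PARTITIONS`, `Job`, `run_shared`, `run_separately`, `make_jobs`) are hypothetical. The graph structure is shared and read-only, each job keeps its own vertex state, and each partition load is reused by every active job.

```python
INF = float("inf")

# Shared, read-only graph structure split into two partitions.
# Each partition maps a source vertex to its out-neighbours.
PARTITIONS = [
    {0: [1, 2], 1: [2]},
    {2: [3], 3: []},
]


class Job:
    """One iterative job; its vertex state is private, the graph is not."""

    def __init__(self, name, source):
        self.name = name
        self.dist = {source: 0}   # per-job vertex state (hop counts)
        self.changed = True

    def process(self, partition):
        # Push updates along the edges of the partition currently loaded.
        for u, neighbours in partition.items():
            du = self.dist.get(u, INF)
            for v in neighbours:
                if du + 1 < self.dist.get(v, INF):
                    self.dist[v] = du + 1
                    self.changed = True


def run_shared(partitions, jobs):
    """Load each partition once per round and trigger every active job on it."""
    loads = 0
    active = list(jobs)
    while active:
        for job in active:
            job.changed = False
        for partition in partitions:
            loads += 1                 # one load serves all active jobs
            for job in active:
                job.process(partition)
        # A job stays active only while its state is still changing.
        active = [job for job in active if job.changed]
    return loads


def run_separately(partitions, jobs):
    """Baseline: every job traverses the partitions on its own."""
    return sum(run_shared(partitions, [job]) for job in jobs)


def make_jobs():
    return [Job("A", source=0), Job("B", source=2)]


shared_loads = run_shared(PARTITIONS, make_jobs())
separate_loads = run_separately(PARTITIONS, make_jobs())
print(shared_loads, separate_loads)
```

In this toy setting the shared run performs fewer partition loads than the per-job baseline while producing the same per-job results, which is the effect the abstract describes as reducing the ratio of data access cost to computation.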

Bibliographic details

  • Source
    ACM Transactions on Storage | 2019, Issue 2 | 26 pages
  • Author affiliations

    Huazhong Univ Sci & Technol Sch Comp Sci & Technol Natl Engn Res Ctr Big Data Technol & Syst Serv Comp Technol & Syst Lab Cluster & Grid Comp Wuhan 430074 Hubei Peoples R China;

    Natl Univ Singapore Dept Comp Sci Singapore Singapore;

    Univ Warwick Dept Comp Sci Coventry W Midlands England;

  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Memory (storage devices);
  • Keywords

    Data access correlations; data access cost; throughput;

