FlinkCL: An OpenCL-Based In-Memory Computing Architecture on Heterogeneous CPU-GPU Clusters for Big Data

Abstract

Research on in-memory big data management and processing has been prompted by the increase in main memory capacity and the explosion in big data. By offering an efficient in-memory distributed execution model, existing in-memory cluster computing platforms such as Flink and Spark have been proven to be outstanding for processing big data. This paper proposes FlinkCL, an in-memory computing architecture on heterogeneous CPU-GPU clusters based on OpenCL that enables Flink to utilize the massive parallel processing ability of GPUs. Our proposed architecture utilizes four techniques: a heterogeneous distributed abstract model (HDST), a Just-In-Time (JIT) compiling schema, a hierarchical partial reduction (HPR), and a heterogeneous task management strategy. Using FlinkCL, programmers only need to write Java code with simple interfaces. The Java code can be compiled to OpenCL kernels and executed on CPUs and GPUs automatically. In the HDST, a novel memory mapping scheme is proposed to avoid serialization or deserialization between Java Virtual Machine (JVM) objects and OpenCL structs. We have comprehensively evaluated FlinkCL with a set of representative workloads to show its effectiveness. Our results show that FlinkCL improves the performance by up to $11\times$ for some computationally heavy algorithms and maintains minor performance improvements for an I/O-bound algorithm.

Bibliographic details

  • Source
    Computers, IEEE Transactions on | 2018, Issue 12 | pp. 1765-1779 | 15 pages
  • Author affiliations

    College of Information Science and Engineering, National Supercomputing Center in Changsha, Hunan University, Changsha, Hunan, China;

    College of Information Science and Engineering, National Supercomputing Center in Changsha, Hunan University, Changsha, Hunan, China;

    Department of Information Engineering, Zunyi Normal College, Zunyi, Guizhou, China;

    College of Information Science and Engineering, National Supercomputing Center in Changsha, Hunan University, Changsha, Hunan, China;

  • Indexing information
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification
  • Keywords

    Graphics processing units; Java; Big Data; Memory; Task analysis;