Journal of Supercomputing

Mille Cheval: a GPU-based in-memory high-performance computing framework for accelerated processing of big-data streams


Abstract

Streams are temporally ordered, rapidly changing, large in volume, and unbounded in nature. Storing an entire data stream is practically infeasible because of its volume and velocity. In this work, the principle of parallelism is employed to accelerate stream-data computing. A GPU-based high-performance computing (HPC) framework is proposed for accelerated processing of big-data streams using an in-memory data structure. We implement three parallel algorithms to demonstrate the viability of the framework. The contributions of Mille Cheval are: (1) demonstrating the viability of streaming on accelerators to increase throughput, (2) carefully chosen hash algorithms that achieve a low collision rate and high randomness, and (3) in-memory sketches for approximation. The objective is to leverage the power of a single node through in-memory and hybrid computing; HPC does not always require high-end hardware, but it does require well-designed algorithms. The achievements of Mille Cheval are: (1) the relative error is 1.32 when the error rate and overestimate rate are both set to 0.001, and (2) the host memory requirement is just 63 MB for 1 terabyte of data. The proposed algorithms are pragmatic. Experimental results show that the framework achieves a 10X speed-up over CPU implementations and a 3X speed-up over existing GPU implementations.
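The "memory sketches for approximation" mentioned above, together with an error rate and an overestimate rate, are the usual parameters of a count-min sketch. The short CUDA sketch below illustrates that general idea: a fixed table of counters updated in parallel with atomic adds and queried by taking the minimum across rows. The hash functions, table dimensions, and kernel layout here are illustrative assumptions, not the actual Mille Cheval implementation.

// Minimal count-min-sketch illustration on the GPU (assumed design, for illustration only).
#include <cstdio>
#include <cstdint>
#include <cstdlib>
#include <algorithm>
#include <vector>
#include <cuda_runtime.h>

constexpr int DEPTH = 4;     // number of hash rows (assumed)
constexpr int WIDTH = 4096;  // counters per row (assumed)

// Placeholder multiply-xor hash; the paper's "carefully chosen" hash
// functions are not specified in the abstract.
__host__ __device__ uint32_t hash_row(uint32_t key, uint32_t row)
{
    uint32_t h = key * (2654435761u + 40503u * row) + row;
    h ^= h >> 16;
    return h % WIDTH;
}

// Each thread inserts one stream element into every row of the sketch.
__global__ void cms_update(const uint32_t *stream, int n, uint32_t *table)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    for (uint32_t row = 0; row < DEPTH; ++row)
        atomicAdd(&table[row * WIDTH + hash_row(stream[i], row)], 1u);
}

// Point query on the host copy: the minimum over rows overestimates the
// true count with bounded probability (classic count-min guarantee).
uint32_t cms_query(const std::vector<uint32_t> &table, uint32_t key)
{
    uint32_t est = UINT32_MAX;
    for (uint32_t row = 0; row < DEPTH; ++row)
        est = std::min(est, table[row * WIDTH + hash_row(key, row)]);
    return est;
}

int main()
{
    // Synthetic stream: key 42 appears at least 1000 times among random noise.
    std::vector<uint32_t> stream;
    for (int i = 0; i < 100000; ++i) stream.push_back(rand() % 50000);
    for (int i = 0; i < 1000; ++i) stream.push_back(42);

    uint32_t *d_stream, *d_table;
    cudaMalloc(&d_stream, stream.size() * sizeof(uint32_t));
    cudaMalloc(&d_table, DEPTH * WIDTH * sizeof(uint32_t));
    cudaMemcpy(d_stream, stream.data(), stream.size() * sizeof(uint32_t),
               cudaMemcpyHostToDevice);
    cudaMemset(d_table, 0, DEPTH * WIDTH * sizeof(uint32_t));

    int n = static_cast<int>(stream.size());
    cms_update<<<(n + 255) / 256, 256>>>(d_stream, n, d_table);
    cudaDeviceSynchronize();

    std::vector<uint32_t> table(DEPTH * WIDTH);
    cudaMemcpy(table.data(), d_table, table.size() * sizeof(uint32_t),
               cudaMemcpyDeviceToHost);
    printf("estimated count of key 42: %u (true count >= 1000)\n",
           cms_query(table, 42));

    cudaFree(d_stream);
    cudaFree(d_table);
    return 0;
}

In a standard count-min sketch the width grows as 1/epsilon and the depth as ln(1/delta), so tighter error and overestimate rates cost proportionally more memory; how Mille Cheval arrives at its 63 MB host-memory footprint is detailed in the paper itself, not in this sketch.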
