Conference: IEEE International System on Chip Conference

FPGA Based Co-design of Storage-side Query Filter for Big Data Systems

Abstract

In this paper we aim to accelerate query processing in big data systems. We consider big data architectures in which storage and computing are separated, and we seek to improve query efficiency on the storage side. We propose a Field Programmable Gate Array (FPGA) based co-design of a query filter on the storage nodes, which reduces both the workload of the computing nodes and the communication overhead between storage and computing nodes. The co-design consists of a software layer and an FPGA layer. In the software layer, we use pointers to project the data stored in the RCFile format, which reduces data transmission, and then encode the combined predicate of the SQL conditions as parameters. In the FPGA layer, we design two filtering schemes for data in the RCFile format, namely a parallel sequential filter and a parallel pipeline filter, which allow different columns and different SQL queries to be processed fully in parallel. We conduct extensive experiments on the TPC-H benchmark and a Tencent data set to evaluate our design. The results show that it reduces time overhead by an average of 76.2% compared with Presto and 96.86% compared with Hive.
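The abstract does not give implementation details, so the following is only a minimal software sketch of the general idea it describes: project the needed columns of a columnar (RCFile-like) row group, express the SQL conditions as parameters, evaluate each condition on its own column independently, and combine the results. The names `Predicate`, `column_mask`, and `filter_row_group`, the operator encoding, and the sample values are all assumptions made for illustration. They are not taken from the paper, and the code does not model the FPGA filters themselves.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical operator encoding; the paper's actual parameter format is not given.
OPS: Dict[str, Callable[[object, object], bool]] = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


@dataclass(frozen=True)
class Predicate:
    """One SQL condition reduced to parameters: column, operator, constant."""
    column: str
    op: str
    constant: object


def column_mask(values: Sequence[object], pred: Predicate) -> List[bool]:
    """Evaluate one predicate over one column. Each column is independent of
    the others, which is what makes per-column parallel evaluation possible."""
    compare = OPS[pred.op]
    return [compare(v, pred.constant) for v in values]


def filter_row_group(
    row_group: Dict[str, Sequence[object]],
    predicates: Sequence[Predicate],
    projection: Sequence[str],
    combine: str = "and",
) -> Dict[str, List[object]]:
    """Return only the projected columns of the rows that pass the predicates."""
    n_rows = len(next(iter(row_group.values()))) if row_group else 0
    if predicates:
        masks = [column_mask(row_group[p.column], p) for p in predicates]
        reducer = all if combine == "and" else any
        keep = [reducer(bits) for bits in zip(*masks)]
    else:
        keep = [True] * n_rows  # no condition: every row passes
    return {
        name: [v for v, k in zip(row_group[name], keep) if k]
        for name in projection
    }


# Illustrative row group using TPC-H lineitem column names; values are made up.
row_group = {
    "l_quantity": [10, 30, 45, 5],
    "l_discount": [0.05, 0.01, 0.02, 0.06],
    "l_extendedprice": [100.0, 200.0, 300.0, 400.0],
}
predicates = [
    Predicate("l_quantity", "<", 40),
    Predicate("l_discount", ">=", 0.05),
]
result = filter_row_group(row_group, predicates, ["l_extendedprice"])
print(result)
```

In this sketch only the surviving values of the projected column leave the storage side, which mirrors the stated goal of cutting both data transmission and compute-node work. A hardware design would evaluate the per-column masks concurrently rather than in a Python loop.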
