Conference: International Symposium on Applied Reconfigurable Computing

Fast Approximation of the Top-k Items in Data Streams Using a Reconfigurable Accelerator


Abstract

This paper presents a novel method for finding the top-k items in data streams using a reconfigurable accelerator. The accelerator extracts an approximate list of the most frequently occurring items in an input stream, which is scanned only once and requires no random access. It is based on a hardware architecture that implements the well-known probabilistic sampling algorithm by mapping its main processing stages onto two custom systolic arrays. The proposed architecture is the first hardware implementation of this algorithm, and it scales better than architectures based on other stream algorithms. When implemented on an Intel Arria 10 FPGA (10AX115N2F45E1SG), 50% of the FPGA chip is sufficient for more than 3000 Processing Elements (PEs). Experimental results on both synthetic and real input datasets showed very good accuracy and significant throughput gains compared to existing solutions. With achieved throughputs exceeding 300 million items/s, we report average speedups of 20x compared to typical software implementations, 1.5x compared to GPU-accelerated implementations, and 1.8x compared to the fastest FPGA implementation.
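To make the one-pass, sampling-based idea concrete, here is a minimal software sketch in Python. It is an illustration only, not the accelerator's architecture: the abstract does not say which sampling variant is used, so the admission rule below (a new item is admitted to the counter table with a fixed probability, and admitted items are then counted exactly) is an assumption. The names `approx_top_k`, `sample_prob`, and `seed` are hypothetical.

```python
import random
from collections import Counter


def approx_top_k(stream, k, sample_prob=0.1, seed=None):
    """Approximate the k most frequent items in a single pass.

    Assumed scheme (not taken from the paper): an item that is not yet
    tracked is admitted with probability `sample_prob`; once admitted,
    every later occurrence is counted exactly.
    """
    rng = random.Random(seed)  # private generator so runs are reproducible
    counts = Counter()
    for item in stream:  # the stream is consumed once, in order
        if item in counts:
            counts[item] += 1
        elif rng.random() < sample_prob:
            counts[item] = 1
    # Stored counts can only underestimate true frequencies, because
    # occurrences seen before an item was admitted are not counted.
    return counts.most_common(k)


if __name__ == "__main__":
    data = ["a"] * 500 + ["b"] * 300 + ["c"] * 5
    random.Random(0).shuffle(data)
    print(approx_top_k(data, k=2, sample_prob=0.05, seed=1))
```

Frequent items are very likely to be admitted early, so their counts stay close to the true values, while rare items are often never tracked. This is what keeps the table small. Setting `sample_prob=1.0` degenerates to exact counting.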
