首页> 外文期刊>World Wide Web >FID-sketch: an accurate sketch to store frequencies in data streams
【24h】

FID-sketch: an accurate sketch to store frequencies in data streams

机译:FID草图:将频率存储在数据流中的准确草图

获取原文
获取原文并翻译 | 示例
获取外文期刊封面目录资料

摘要

Sketches are being extensively used in a large number of real world applications to estimate frequencies of data items. Due to the unprecedented increase in the amount of Internet data and a relatively slower increase in the size of on-chip memories, existing sketches are becoming increasingly unable to keep the accuracy of the frequency estimates at an acceptable level. In this paper, we design a new sketch, called FID-sketch, that has a significantly higher accuracy and a much smaller on-chip memory footprint compared to the existing sketches. The key intuition behind the design of the FID-sketch is that before inserting an item, unlike prior sketches, it first estimates the current value of the frequency of that item stored in the sketch, and then increments as few counters as possible instead of incrementing a pre-determined fixed number of counters. We carried out extensive experiments to evaluate and compare the performance of FID-sketch with existing sketches on multi-core CPU and GPU platforms. Our experimental results show that our FID-sketch significantly outperforms the state-of-the-art with 36.7 times lower relative error. We have released the source code of our proposed sketch and other related sketches that we implemented at Github [21].
机译:草图已在许多实际应用中广泛使用,以估计数据项的频率。由于Internet数据量的空前增长和片上存储器大小的相对较慢的增长,现有的草图越来越无法将频率估计的准确性保持在可接受的水平。在本文中,我们设计了一个新的草图,称为FID草图,与现有草图相比,它具有更高的准确性和更小的片上内存占用量。 FID草图设计背后的主要直觉是,在插入项目之前,与以前的草图不同,它首先估计草图中存储的该项目的频率的当前值,然后递增尽可能少的计数器而不是递增预定数量的固定计数器。我们进行了广泛的实验,以评估和比较FID速写的性能与多核CPU和GPU平台上的现有草图。我们的实验结果表明,我们的FID草图明显优于最新技术,相对误差降低了36.7倍。我们已经发布了我们建议的草图以及我们在Github [21]中实现的其他相关草图的源代码。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号