IEEE International Conference on Data Engineering

SF-sketch: A Fast, Accurate, and Memory Efficient Data Structure to Store Frequencies of Data Items

Abstract

A sketch is a probabilistic data structure used to record the frequencies of items in a multi-set. Sketches have been applied in a variety of fields, such as data stream processing, natural language processing, and distributed data sets. In this paper, we propose a new sketch, called the Slim-Fat (SF) sketch, which has a much smaller memory footprint for queries while still supporting updates. The key idea behind SF-sketch is to maintain two separate sketches: a small sketch called the Slim-subsketch and a large sketch called the Fat-subsketch. The Slim-subsketch enables fast and accurate querying. The Fat-subsketch is used to assist insertions into and deletions from the Slim-subsketch. We implemented and evaluated SF-sketch along with several prior sketches and compared them side by side. Our experimental results show that SF-sketch significantly outperforms the most commonly used CM-sketch in terms of accuracy. The full version is available on arXiv.org [12].
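For readers unfamiliar with sketches, the following is a minimal illustration of the CM-sketch (Count-Min sketch) baseline that the abstract compares against. It is not the SF-sketch itself, whose Slim/Fat update rules are not described here. The class name, the `width` and `depth` parameters, and the use of salted BLAKE2b as the per-row hash are assumptions made for this example only.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min sketch: `depth` rows of `width` counters each."""

    def __init__(self, width: int, depth: int) -> None:
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, item: str, row: int) -> int:
        # Assumption: one hash function per row, derived by salting
        # BLAKE2b with the row number.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, item: str, delta: int = 1) -> None:
        # Insertion uses delta = +1, deletion uses delta = -1.
        for r in range(self.depth):
            self.rows[r][self._index(item, r)] += delta

    def query(self, item: str) -> int:
        # Hash collisions can only inflate a counter, so the minimum over
        # the rows is the tightest available estimate.
        return min(
            self.rows[r][self._index(item, r)] for r in range(self.depth)
        )


sketch = CountMinSketch(width=1024, depth=4)
for word in ["apple", "apple", "pear", "apple"]:
    sketch.update(word)
print(sketch.query("apple"))  # an estimate that is never below the true count
```

As long as no item is deleted more often than it was inserted, the estimate never falls below the true frequency. It may exceed it when items collide in every row, and that overestimation is the accuracy gap sketches such as SF-sketch aim to reduce.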
