首页> 外文会议>International computing and combinatorics conference >How to Catch L_2-Heavy-Hitters on Sliding Windows
【24h】

How to Catch L_2-Heavy-Hitters on Sliding Windows

机译:如何在滑动窗口上捕捉L_2重击者

获取原文

摘要

Finding heavy-elements (heavy-hitters) in streaming data is one of the central, and well-understood tasks. Despite the importance of this problem, when considering the sliding windows model of streaming (where elements eventually expire) the problem of finding L_2-heavy elements has remained completely open despite multiple papers and considerable success in finding L_1-heavy elements. Since the L_2-heavy element problem doesn't satisfy certain conditions, existing methods for sliding windows algorithms, such as smooth histograms or exponential histograms are not directly applicable to it. In this paper, we develop the first polylogarithmic-memory algorithm for finding L_2-heavy elements in the sliding window model. Our technique allows us not only to find L_2-heavy elements, but also heavy elements with respect to any L_p with 0<p≤2 on sliding windows. By this we completely "close the gap" and resolve the question of finding L_p-heavy elements in the sliding window model with polylogarithmic memory, since it is well known that for p>2 this task is impossible. We demonstrate a broader applicability of our method on two additional examples: we show how to obtain a sliding window approximation of the similarity of two streams, and of the fraction of elements that appear exactly a specified number of times within the window (the α-rarity problem). In these two illustrative examples of our method, we replace the current expected memory bounds with worst case bounds.
机译:在流数据中查找重元素(繁重的操作)是一项重要且容易理解的任务。尽管存在这个问题的重要性,但是在考虑流的滑动窗口模型(元素最终失效的地方)时,尽管有很多论文并且在找到L_1重元素方面取得了很大的成功,但是找到L_2重元素的问题仍然完全存在。由于L_2重元素问题不满足某些条件,因此滑动窗口算法的现有方法(例如平滑直方图或指数直方图)不能直接应用于该方法。在本文中,我们开发了第一个用于在滑动窗口模型中查找L_2重元素的多对数内存算法。我们的技术不仅使我们能够找到L_2重元素,而且对于滑动窗口上0 <p≤2的任何L_p,也能找到重元素。这样,我们就完全“缩小了差距”,解决了在具有多对数记忆的滑动窗口模型中找到L_p重元素的问题,因为众所周知,对于p> 2而言,此任务是不可能的。我们在另外两个示例中展示了我们的方法的广泛适用性:我们展示了如何获得两个流相似度以及在窗口内恰好出现指定次数的元素比例(α-稀有问题)。在我们方法的这两个说明性示例中,我们用最坏情况下的边界替换了当前的预期内存边界。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号