Conference: IEEE International Symposium on Network Computing and Applications

Efficiently Summarizing Data Streams over Sliding Windows


Abstract

Estimating the frequency of any piece of information in large-scale distributed data streams has become of utmost importance in the last decade (e.g., in the context of network monitoring and big data). Although some elegant solutions have been proposed recently, their approximation is computed from the inception of the stream. In a runtime distributed context, one would prefer to gather information only about the recent past. This may be motivated by the need to save resources or by the fact that recent information is more relevant. In this paper, we consider the sliding window model and propose two different (on-line) algorithms that approximate the frequency of items in the active window. More precisely, we determine an (ε, δ)-additive approximation, meaning that the error exceeds ε with probability at most δ. These solutions use a very small amount of memory with respect to the size N of the window and the number n of distinct items of the stream, namely, {formula} and {formula} bits of space, where τ is a parameter limiting memory usage. We also provide their distributed variants, i.e., variants for the sliding window functional monitoring model. We compare the proposed algorithms to each other and to the state of the art through extensive experiments on synthetic traces and real data sets, which validate the robustness and accuracy of our algorithms.
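To make the notion of windowed frequency estimation concrete, the following is a minimal sketch of a Count-Min sketch restricted to the last N items. It is not one of the two algorithms proposed in the paper: this baseline stores the whole window explicitly (so it needs memory linear in N) in order to decrement expired items, which is exactly the cost the paper's algorithms are designed to avoid. The class name `WindowedCountMin`, its parameters, and the hash family are assumptions made for illustration. With the standard Count-Min sizing used here, an estimate never falls below the true count in the window and overshoots it by more than εN with probability at most δ.

```python
import math
import random
from collections import deque


class WindowedCountMin:
    """Count-Min sketch over the last `window` items.

    Illustrative baseline only (an assumption, not the paper's method):
    the window itself is kept in a deque so expired items can be removed.
    """

    PRIME = (1 << 61) - 1  # Mersenne prime used by the hash family

    def __init__(self, epsilon, delta, window, seed=0):
        # Standard Count-Min sizing: width ~ e/epsilon, depth ~ ln(1/delta).
        self.width = math.ceil(math.e / epsilon)
        self.depth = max(1, math.ceil(math.log(1.0 / delta)))
        self.window = window
        rng = random.Random(seed)
        # One (a, b) pair per row for the hash x -> ((a*x + b) mod p) mod width.
        self.hashes = [
            (rng.randrange(1, self.PRIME), rng.randrange(0, self.PRIME))
            for _ in range(self.depth)
        ]
        self.counts = [[0] * self.width for _ in range(self.depth)]
        self.recent = deque()  # explicit copy of the active window

    def _cells(self, item):
        """Yield the (row, column) cell touched by `item` in every row."""
        x = hash(item) % self.PRIME
        for row, (a, b) in enumerate(self.hashes):
            yield row, ((a * x + b) % self.PRIME) % self.width

    def add(self, item):
        """Insert `item` and evict the oldest item once the window is full."""
        self.recent.append(item)
        for row, col in self._cells(item):
            self.counts[row][col] += 1
        if len(self.recent) > self.window:
            expired = self.recent.popleft()
            for row, col in self._cells(expired):
                self.counts[row][col] -= 1

    def estimate(self, item):
        """Return the estimated frequency of `item` in the active window."""
        return min(self.counts[row][col] for row, col in self._cells(item))


if __name__ == "__main__":
    sketch = WindowedCountMin(epsilon=0.01, delta=0.01, window=100)
    for i in range(1000):
        sketch.add(i % 7)
    print(sketch.estimate(3))
```

Taking the minimum over rows is what bounds the overestimate: a collision in one row inflates only that row's counter, so the smallest counter is the least contaminated one.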
