Finding Hierarchical Heavy Hitters in Data Streams

Abstract

Aggregation along hierarchies is a critical summary technique in a large variety of online applications, including decision support and network management (e.g., IP clustering, denial-of-service attack monitoring). Despite the considerable recent work on online aggregation over sets (e.g., quantiles, hot items), surprisingly little attention has been paid to summarizing hierarchical structure in stream data. The problem we study in this paper is that of finding Hierarchical Heavy Hitters (HHH): given a hierarchy and a fraction φ, we want to find all nodes whose descendants account for at least a fraction φ of the total number of elements in the data stream, after discounting the descendants that are themselves HHH nodes. The resulting summary gives a topological "cartogram" of the hierarchical data. We present deterministic and randomized algorithms for finding HHHs, which build upon existing techniques by incorporating the hierarchy into the algorithms. Our experiments demonstrate improvements in accuracy of several factors over the straightforward approach, which we attribute to making the algorithms hierarchy-aware.
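The HHH definition in the abstract can be made concrete with a small exact, offline computation. The sketch below is not one of the paper's streaming algorithms; it stores the whole input and is meant only to illustrate the discounting rule. It assumes a one-dimensional hierarchy in which every element is a full-depth path given as a tuple (for example, the octets of an IP address), and the name `exact_hhh` is hypothetical.

```python
from collections import Counter


def exact_hhh(stream, phi):
    """Exact, offline HHH computation (illustrative sketch only).

    Assumes each element of `stream` is a tuple of equal length naming a
    leaf of the hierarchy; a node is identified by a prefix of that tuple.
    Returns {prefix: discounted_count} for every HHH node.
    """
    items = list(stream)
    n = len(items)
    if n == 0:
        return {}
    threshold = phi * n
    depth = len(items[0])

    # `residual` holds, for the current level, the count of elements that
    # have not been claimed by an HHH node further down the hierarchy.
    residual = Counter(items)
    hhh = {}
    for level in range(depth, -1, -1):
        rolled_up = Counter()
        for prefix, count in residual.items():
            if count >= threshold:
                # Heavy after discounting: report it, pass nothing upward.
                hhh[prefix] = count
            elif level > 0:
                # Not heavy: its count is charged to the parent node.
                rolled_up[prefix[:-1]] += count
        residual = rolled_up
    return hhh


# 20 elements over a three-level hierarchy; phi = 0.25 gives a threshold of 5.
example = (
    [("a", "x", "1")] * 6 + [("a", "x", "2")] * 2
    + [("a", "y", "1")] * 3 + [("a", "y", "2")] * 1
    + [("b", "x", "1")] * 4 + [("b", "y", "1")] * 4
)
result = exact_hhh(example, 0.25)
# result == {("a", "x", "1"): 6, ("a",): 6, ("b",): 8}
```

In this example the node `("a",)` has 12 descendants in total, but 6 of them already belong to the HHH leaf `("a", "x", "1")`, so its discounted count is 6. The root is not reported because every element has been claimed by an HHH node below it.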
