Conference: International Conference on Data Engineering

What's Different: Distributed, Continuous Monitoring of Duplicate-Resilient Aggregates on Data Streams



Abstract

Emerging applications in sensor systems and network-wide IP traffic analysis present many technical challenges. They need distributed monitoring and continuous tracking of events. They face severe resource constraints, not only at each site in terms of per-update processing time and archival space for high-speed streams of observations but also, crucially, in the communication available for collaborating on the monitoring task. These elements have been addressed in a series of recent works. A fundamental issue that arises is that one cannot make the "uniqueness" assumption on observed events that is present in previous works, since wide-scale monitoring invariably encounters the same events at different points. For example, within the network of an Internet Service Provider, packets of the same flow will be observed at different routers; similarly, when monitoring wild animals, the same individual will be observed by multiple mobile sensors. Aggregates of interest in such distributed environments must be resilient to duplicate observations. We study such duplicate-resilient aggregates that measure the extent of the duplication - how many unique observations are there, how many observations are unique - as well as standard holistic aggregates such as quantiles and heavy hitters over the unique items. We present accuracy-guaranteed, highly communication-efficient algorithms for these aggregates that work within the time and space constraints of high-speed streams. We also present results of a detailed experimental study on both real-life and synthetic data.
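To make the notion of a duplicate-resilient aggregate concrete, the following is a minimal sketch of one well-known duplicate-insensitive summary, a k-minimum-values (KMV) sketch for counting unique items across sites. It is an illustration only and is not taken from the paper: the class `KMVSketch`, the helper `hash01`, the flow identifiers, and the choice of `k` are all assumptions made for this example. The point it shows is that merging per-site summaries gives the same result no matter how many sites observed the same item.

```python
import hashlib
import heapq


def hash01(item):
    """Map an item to a pseudo-uniform value in [0, 1) using SHA-256.

    Every site uses the same hash, so the same item always maps to the
    same value wherever it is observed.
    """
    digest = hashlib.sha256(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


class KMVSketch:
    """Hypothetical k-minimum-values sketch (illustrative, not the paper's method)."""

    def __init__(self, k=256):
        self.k = k
        self.values = set()  # at most k smallest hash values seen so far

    def add(self, item):
        h = hash01(item)
        if h in self.values:
            return  # duplicate observation: the sketch does not change
        self.values.add(h)
        if len(self.values) > self.k:
            self.values.remove(max(self.values))

    def merge(self, other):
        """Combine two sketches built with the same k; duplicates collapse."""
        merged = KMVSketch(self.k)
        merged.values = set(heapq.nsmallest(self.k, self.values | other.values))
        return merged

    def estimate(self):
        """Estimate the number of distinct items summarized by the sketch."""
        if len(self.values) < self.k:
            return float(len(self.values))  # fewer than k distinct hashes: exact
        return (self.k - 1) / max(self.values)


# Two routers observe overlapping sets of flows (1000 distinct flows in total).
router_a, router_b = KMVSketch(), KMVSketch()
for flow in range(0, 600):
    router_a.add(f"flow-{flow}")
for flow in range(400, 1000):
    router_b.add(f"flow-{flow}")

combined = router_a.merge(router_b)
print(round(combined.estimate()))  # an approximation of 1000
```

Each site keeps and ships at most `k` numbers regardless of stream length, which is the kind of space and communication bound the abstract is concerned with. The relative error of this estimator shrinks roughly as one over the square root of `k`.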

