Conference: SIAM International Conference on Data Mining

Time-Decayed Correlated Aggregates over Data Streams

Abstract

Data stream analysis frequently relies on identifying correlations and posing conditional queries on the data after it has been seen. Correlated aggregates are an important example of such queries: they ask for an aggregation over one dimension of the stream elements that satisfy a predicate on another dimension. Since recent events are typically more important than older ones, time decay should also be applied to down-weight less significant values. We present space-efficient algorithms, as well as space lower bounds, for the time-decayed correlated sum, a problem at the heart of many related aggregations. By considering different fundamental classes of decay functions, we separate cases where efficient relative-error or additive-error approximation is possible from other cases where linear space is necessary for approximation. In particular, we show that no efficient algorithms are possible for the popular sliding window and exponential decay models, resolving an open problem. The results are surprising, since efficient approximations are known for other data stream problems under these decay models. This is a step towards better understanding which sophisticated queries can be answered on massive streams using limited memory and computation.
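
To make the quantity concrete, the sketch below computes a time-decayed correlated sum exactly by scanning every stored element. It is not one of the space-efficient algorithms the abstract refers to: it keeps the whole stream, which is the linear-space baseline. The element layout `(x, y, t)`, the threshold predicate `x <= threshold`, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
import math
from typing import Callable, Iterable, NamedTuple


class Element(NamedTuple):
    x: float  # dimension the predicate is evaluated on (assumed layout)
    y: float  # dimension that is aggregated
    t: float  # arrival timestamp


def sliding_window(width: float) -> Callable[[float], float]:
    """Decay function: weight 1 for ages below `width`, 0 otherwise."""
    return lambda age: 1.0 if age < width else 0.0


def exponential(rate: float) -> Callable[[float], float]:
    """Decay function: weight exp(-rate * age)."""
    return lambda age: math.exp(-rate * age)


def decayed_correlated_sum(
    stream: Iterable[Element],
    threshold: float,
    now: float,
    decay: Callable[[float], float],
) -> float:
    """Sum y * decay(now - t) over elements whose x satisfies x <= threshold.

    Exact, linear-space reference computation (hypothetical helper).
    Elements timestamped after `now` are ignored.
    """
    total = 0.0
    for e in stream:
        age = now - e.t
        if age >= 0 and e.x <= threshold:
            total += e.y * decay(age)
    return total
```

For example, with elements `(1, 10, 0)`, `(5, 20, 8)` and `(2, 30, 9)`, a query at time 10 with threshold 2 and a sliding window of width 5 counts only the third element, because the first has aged out of the window and the second fails the predicate.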