Data Engineering, ICDE, 2009 IEEE 25th International Conference on

Forward Decay: A Practical Time Decay Model for Streaming Systems


Abstract

Temporal data analysis in data warehouses and data streaming systems often uses time decay to reduce the importance of older tuples, without eliminating their influence, on the results of the analysis. While exponential time decay is commonly used in practice, other decay functions (e.g., polynomial decay) are not, even though they have been identified as useful. We argue that this is because the usual definitions of time decay are "backwards": the decayed weight of a tuple is based on its age, measured backward from the current time. Since this age is constantly changing, such decay is too complex and unwieldy for scalable implementation. In this paper, we propose a new class of "forward" decay functions based on measuring forward from a fixed point in time. We show that this model captures the more practical models already known, such as exponential decay and landmark windows, but also includes a wide class of other types of time decay. We provide efficient algorithms to compute a variety of aggregates and draw samples under forward decay, and show that these are easy to implement scalably. Further, we provide empirical evidence that these can be executed in a production data stream management system (DSMS) with little or no overhead compared to the undecayed computations. Our implementation required no extensions to the query language or the DSMS, demonstrating that forward decay represents a practical model of time decay for systems that deal with time-based data.
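
To make the idea of measuring forward from a fixed point concrete, the following is a minimal Python sketch of a decayed sum. It is not taken from the paper's implementation. It assumes that a tuple with timestamp t_i is weighted by g(t_i - L) / g(t - L) at query time t, where L is the fixed landmark and g is a non-decreasing function. The class name `ForwardDecaySum` and the helpers `poly_g` and `exp_g` are hypothetical names chosen for illustration.

```python
import math


class ForwardDecaySum:
    """Illustrative decayed sum under an assumed forward decay weighting.

    Assumption (not stated in the abstract): the weight of an item with
    timestamp t_i at query time t is g(t_i - L) / g(t - L), where L is a
    fixed landmark and g is non-decreasing.
    """

    def __init__(self, g, landmark):
        self.g = g
        self.landmark = landmark
        self.total = 0.0  # running sum of g(t_i - L) * v_i

    def add(self, timestamp, value):
        # The stored contribution depends only on the item's own timestamp,
        # so it never has to be revised as the current time advances.
        self.total += self.g(timestamp - self.landmark) * value

    def query(self, now):
        # Normalize once, at query time. Requires g(now - L) > 0.
        return self.total / self.g(now - self.landmark)


def poly_g(n):
    # Hypothetical polynomial forward decay function, g(n) = n^2.
    return n ** 2


def exp_g(n, alpha=0.1):
    # Hypothetical exponential forward decay function, g(n) = exp(alpha * n).
    return math.exp(alpha * n)


if __name__ == "__main__":
    s = ForwardDecaySum(poly_g, landmark=0)
    s.add(5, 1.0)
    s.add(10, 1.0)
    print(s.query(10))  # (25 + 100) / 100 = 1.25
```

Under this assumed weighting, the exponential case reduces to the familiar backward form, since exp(alpha * (t_i - L)) / exp(alpha * (t - L)) = exp(-alpha * (t - t_i)). That is consistent with the abstract's statement that the forward model captures exponential decay.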
