IEEE Transactions on Parallel and Distributed Systems

Constant-Time Sliding Window Framework with Reduced Memory Footprint and Efficient Bulk Evictions



Abstract

The fast evolution of data analytics platforms has resulted in an increasing demand for real-time data stream processing. From Internet of Things applications to the monitoring of telemetry generated in large data centers, a common requirement of currently emerging scenarios is to process vast amounts of data with low latency, generally performing the analysis as close to the data source as possible. Stream processing platforms are required to be malleable and to absorb spikes caused by fluctuations in data generation rates. Data is usually produced as time series that have to be aggregated using multiple operators, with sliding windows being one of the most common abstractions used to process data in real time. To satisfy these demands, efficient stream processing techniques that aggregate data with minimal computational cost need to be developed. In this paper we present the Monoid Tree Aggregator, a general sliding window aggregation framework that seamlessly combines the following features: amortized O(1) time complexity and a worst case of O(log n) between insertions; a window aggregation mechanism and a window slide policy that are both user programmable; enforcement of the window slide policy with amortized O(1) computational cost for single evictions and support for bulk evictions with cost O(log n); and a local memory requirement of O(log n). The framework can compute aggregations over multiple data dimensions, and has been designed to support decoupling computation and data storage through the use of distributed Key-Value Stores to keep window elements and partial aggregations.
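To make the idea of monoid-based sliding window aggregation concrete, here is a minimal sketch of a much simpler, well-known technique: the two-stack window. It is not the Monoid Tree Aggregator described in the abstract. It only illustrates how a user-supplied associative operator with an identity element (a monoid) lets a window answer aggregate queries with amortized O(1) insertions and single evictions. Unlike the framework in the paper, this sketch keeps O(n) partial aggregates in local memory and has no efficient bulk eviction. The class name `TwoStackWindow` and its methods are hypothetical names chosen for illustration.

```python
from typing import Callable, Generic, List, Tuple, TypeVar

T = TypeVar("T")


class TwoStackWindow(Generic[T]):
    """FIFO window aggregated under a user-supplied monoid (illustrative only)."""

    def __init__(self, combine: Callable[[T, T], T], identity: T) -> None:
        self.combine = combine      # associative binary operator
        self.identity = identity    # neutral element of `combine`
        # Each entry is (value, partial aggregate).
        # _back: newest element on top; aggregate covers bottom..this entry.
        # _front: oldest element on top; aggregate covers this entry..bottom.
        self._front: List[Tuple[T, T]] = []
        self._back: List[Tuple[T, T]] = []

    def insert(self, value: T) -> None:
        """Append the newest element: O(1)."""
        agg = self.combine(self._back[-1][1], value) if self._back else value
        self._back.append((value, agg))

    def evict(self) -> T:
        """Remove and return the oldest element: amortized O(1)."""
        if not self._front:
            # Move everything from back to front, rebuilding the aggregates
            # so that arrival order is preserved (matters when `combine`
            # is not commutative).
            while self._back:
                value, _ = self._back.pop()
                agg = (self.combine(value, self._front[-1][1])
                       if self._front else value)
                self._front.append((value, agg))
        if not self._front:
            raise IndexError("evict from an empty window")
        return self._front.pop()[0]

    def query(self) -> T:
        """Aggregate of the whole window, oldest to newest: O(1)."""
        front = self._front[-1][1] if self._front else self.identity
        back = self._back[-1][1] if self._back else self.identity
        return self.combine(front, back)

    def __len__(self) -> int:
        return len(self._front) + len(self._back)


# Usage: a count-based slide policy (keep at most 3 elements) with max.
window = TwoStackWindow(max, float("-inf"))
maxima = []
for x in [3, 1, 4, 1, 5]:
    window.insert(x)
    while len(window) > 3:      # the slide policy is applied by the caller
        window.evict()
    maxima.append(window.query())
```

Because the operator and the eviction condition are both passed in or applied by the caller, the same structure works for sums, minima, or string concatenation, which loosely mirrors the user-programmable aggregation and slide policy mentioned in the abstract.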
