Journal: Knowledge and Data Engineering, IEEE Transactions on

Scalable Scheduling of Updates in Streaming Data Warehouses

Abstract

We discuss update scheduling in streaming data warehouses, which combine the features of traditional data warehouses and data stream systems. In our setting, external sources push append-only data streams into the warehouse with a wide range of interarrival times. While traditional data warehouses are typically refreshed during downtimes, streaming warehouses are updated as new data arrive. We model the streaming warehouse update problem as a scheduling problem, where jobs correspond to processes that load new data into tables, and whose objective is to minimize data staleness over time (at time t, if a table has been updated with information up to some earlier time r, its staleness is t minus r). We then propose a scheduling framework that handles the complications encountered by a stream warehouse: view hierarchies and priorities, data consistency, inability to preempt updates, heterogeneity of update jobs caused by different interarrival times and data volumes among different sources, and transient overload. A novel feature of our framework is that scheduling decisions do not depend on properties of update jobs (such as deadlines), but rather on the effect of update jobs on data staleness. Finally, we present a suite of update scheduling algorithms and extensive simulation experiments to map out factors which affect their performance.
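The staleness definition in the abstract (at time t, a table updated with information up to time r has staleness t minus r) can be illustrated with a minimal Python sketch. The class names, the `priority` weight, the `run_time` field, and the greedy selection rule in `pick_next` are all assumptions made for illustration; the abstract does not describe the paper's algorithms in enough detail to reproduce them, so this is not the authors' method. The sketch only shows what it could look like for a scheduling decision to depend on a job's effect on staleness rather than on a deadline.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    freshness: float       # r: time up to which the table's data is current
    priority: float = 1.0  # hypothetical weight; the abstract only says priorities exist


@dataclass
class UpdateJob:
    table: Table
    data_until: float  # table would be current up to this time once the load finishes
    run_time: float    # assumed processing time; updates are not preempted


def staleness(table: Table, now: float) -> float:
    """Staleness at time `now` is now - r, as defined in the abstract."""
    return now - table.freshness


def staleness_reduction(job: UpdateJob) -> float:
    """How much the job lowers its table's staleness when it completes."""
    return max(0.0, job.data_until - job.table.freshness)


def pick_next(jobs: list[UpdateJob]) -> UpdateJob:
    """Hypothetical greedy rule (an assumption, not the paper's algorithm):
    choose the job with the largest priority-weighted staleness reduction
    per unit of processing time."""
    return max(
        jobs,
        key=lambda j: j.table.priority * staleness_reduction(j) / j.run_time,
    )


sales = Table("sales", freshness=90.0)
clicks = Table("clicks", freshness=40.0, priority=2.0)
jobs = [
    UpdateJob(sales, data_until=100.0, run_time=5.0),
    UpdateJob(clicks, data_until=100.0, run_time=20.0),
]

print(staleness(sales, now=100.0))  # 10.0
print(pick_next(jobs).table.name)   # clicks
```

In this example the `clicks` job is chosen because its weighted reduction per unit of run time (2.0 × 60 / 20 = 6.0) exceeds that of the `sales` job (1.0 × 10 / 5 = 2.0), even though it takes longer to run.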
