Conference: ACM SIGMOD International Conference on Management of Data

An Adaptive Query Execution System for Data Integration

Abstract

Query processing in data integration occurs over network-bound, autonomous data sources. This requires extensions to traditional optimization and execution techniques for three reasons: there is an absence of quality statistics about the data, data transfer rates are unpredictable and bursty, and slow or unavailable data sources can often be replaced by overlapping or mirrored sources. This paper presents the Tukwila data integration system, designed to support adaptivity at its core using a two-pronged approach. Interleaved planning and execution with partial optimization allows Tukwila to quickly recover from decisions based on inaccurate estimates. During execution, Tukwila uses adaptive query operators such as the double pipelined hash join, which produces answers quickly, and the dynamic collector, which robustly and efficiently computes unions across overlapping data sources. We demonstrate that the Tukwila architecture extends previous innovations in adaptive execution (such as query scrambling, mid-execution re-optimization, and choose nodes), and we present experimental evidence that our techniques result in behavior desirable for a data integration system.
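
The abstract names the double pipelined hash join as an operator that produces answers quickly. As a rough illustration only, the idea can be sketched as a symmetric hash join: each input gets its own hash table, and every arriving tuple is inserted into its side's table and immediately probed against the other side's, so results appear as soon as both matching tuples have arrived. The function name, the `("left" | "right", row)` arrival format, and the dictionary rows below are assumptions made for this sketch, not Tukwila's actual interfaces, and the sketch assumes both hash tables fit in memory.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator, Tuple

Row = Dict[str, Any]


def double_pipelined_hash_join(
    arrivals: Iterable[Tuple[str, Row]],
    left_key: str,
    right_key: str,
) -> Iterator[Tuple[Row, Row]]:
    """Yield (left_row, right_row) pairs as soon as both halves have arrived.

    `arrivals` is a hypothetical interleaved stream of ("left" | "right", row)
    items, standing in for tuples trickling in from two network sources.
    """
    left_table = defaultdict(list)   # join key -> left rows seen so far
    right_table = defaultdict(list)  # join key -> right rows seen so far

    for side, row in arrivals:
        if side == "left":
            key = row[left_key]
            left_table[key].append(row)
            # Probe the opposite table; emit matches immediately.
            for match in right_table.get(key, ()):
                yield (row, match)
        elif side == "right":
            key = row[right_key]
            right_table[key].append(row)
            for match in left_table.get(key, ()):
                yield (match, row)
        else:
            raise ValueError(f"unknown side: {side!r}")


# Illustrative data: tuples from the two sources arrive interleaved.
arrivals = [
    ("left", {"id": 1, "name": "a"}),
    ("right", {"id": 2, "score": 20}),
    ("right", {"id": 1, "score": 10}),
    ("left", {"id": 2, "name": "b"}),
]
results = list(double_pipelined_hash_join(arrivals, "id", "id"))
```

Because neither input has to be read to completion before output begins, a slow or bursty source delays only the results that depend on its missing tuples. A conventional hash join would instead wait for the whole build input before emitting anything.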
