International Conference on Management of Data

MaSM: Efficient Online Updates in Data Warehouses

Abstract

Data warehouses have been traditionally optimized for read-only query performance, allowing only offline updates at night, essentially trading off data freshness for performance. The need for 24x7 operations in global markets and the rise of online and other quickly-reacting businesses make concurrent online updates increasingly desirable. Unfortunately, state-of-the-art approaches fall short of supporting fast analysis queries over fresh data. The conventional approach of performing updates in place can dramatically slow down query performance, while prior proposals using differential updates either require large in-memory buffers or may incur significant update migration cost. This paper presents a novel approach for supporting online updates in data warehouses that overcomes the limitations of prior approaches, by making judicious use of available SSDs to cache incoming updates. We model the problem of query processing with differential updates as a type of outer join between the data residing on disks and the updates residing on SSDs. We present MaSM algorithms for performing such joins and periodic migrations, with small memory footprints, low query overhead, low SSD writes, efficient in-place migration of updates, and correct ACID support. Our experiments show that MaSM incurs only up to 7% overhead both on synthetic range scans (varying range size from 100GB to 4KB) and in a TPC-H query replay study, while also increasing the update throughput by orders of magnitude.
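
The abstract frames query processing with differential updates as a kind of outer join between the main data and the cached updates. The following is a minimal sketch of that idea only, not the MaSM algorithms themselves: it assumes both inputs are already sorted by key, that there is at most one cached update per key, and that updates are one of "insert", "delete", or "modify". The names `merge_scan`, `Record`, and `Update` are hypothetical and chosen for illustration.

```python
from typing import Iterable, Iterator, Optional, Tuple

Record = Tuple[int, str]                  # (key, payload) from the main data
Update = Tuple[int, str, Optional[str]]   # (key, op, payload) from the update cache


def merge_scan(base: Iterable[Record], updates: Iterable[Update]) -> Iterator[Record]:
    """Merge a key-sorted scan with key-sorted cached updates (illustrative only).

    Assumes at most one update per key; op is "insert", "delete", or "modify".
    """
    base_it, upd_it = iter(base), iter(updates)
    rec = next(base_it, None)
    upd = next(upd_it, None)
    while rec is not None or upd is not None:
        if upd is None or (rec is not None and rec[0] < upd[0]):
            # Base record with no pending update: pass it through unchanged.
            yield rec
            rec = next(base_it, None)
        elif rec is None or upd[0] < rec[0]:
            # Update with no matching base record: only an insert produces output.
            if upd[1] == "insert":
                yield (upd[0], upd[2])
            upd = next(upd_it, None)
        else:
            # Keys match: a delete suppresses the record, anything else replaces it.
            if upd[1] != "delete":
                yield (upd[0], upd[2])
            rec = next(base_it, None)
            upd = next(upd_it, None)


base = [(1, "a"), (2, "b"), (4, "d")]
updates = [(2, "modify", "B"), (3, "insert", "c"), (4, "delete", None)]
fresh = list(merge_scan(base, updates))
```

Because both inputs are consumed in key order, the merge needs only one record and one update in memory at a time, which is the property that makes a join-style formulation attractive for range scans.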
