
Transform Merging of ETL Data Flow Plan



Abstract

ETL (Extract-Transform-Load) is the process of loading a data mart or data warehouse. It is often modeled as a data flow plan in which individual transforms perform various operations to convert, cleanse, and integrate dissimilar source data before it is loaded into the target system. Because such plans are complex, high performance is often hard to achieve. In this paper, we present a transform merging technique that improves ETL performance. The technique reshapes the transformation plan from the existing transform-based process into a new column-based process in which independent column threads can be produced and executed in parallel. The reshaping is achieved through plan analysis and transform merging. Our implementation shows that this technique delivers a significant performance improvement over the existing execution mechanisms.
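
The abstract gives no implementation details, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes a hypothetical plan format in which each transform maps a column name to a per-value function, and it assumes that no column depends on another, so the "plan analysis" step is trivial. The names `PLAN`, `merge_plan`, `run_transform_based`, and `run_column_based` are all invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical plan: a list of transforms, each mapping a column name
# to a function applied to every value of that column.
PLAN = [
    {"name": str.strip, "price": float},
    {"name": str.upper, "price": lambda p: round(p * 1.1, 2)},
]


def run_transform_based(plan, columns):
    """Baseline: run each transform over the whole data set, one after another."""
    data = {col: list(values) for col, values in columns.items()}
    for transform in plan:
        for col, fn in transform.items():
            data[col] = [fn(v) for v in data[col]]
    return data


def merge_plan(plan):
    """Merge the steps that touch the same column into one composed function."""
    chains = {}
    for transform in plan:
        for col, fn in transform.items():
            chains.setdefault(col, []).append(fn)
    return {
        col: (lambda v, fns=tuple(fns): reduce(lambda acc, f: f(acc), fns, v))
        for col, fns in chains.items()
    }


def run_column_based(plan, columns):
    """Reshaped plan: one independent task per column, submitted to a thread pool."""
    merged = merge_plan(plan)

    def run_column(col):
        fn = merged.get(col, lambda v: v)  # untouched columns pass through
        return col, [fn(v) for v in columns[col]]

    with ThreadPoolExecutor() as pool:
        return dict(pool.map(run_column, columns))


if __name__ == "__main__":
    source = {"name": [" ab ", "cd"], "price": ["10", "2.5"]}
    print(run_column_based(PLAN, source))
```

Both execution paths produce the same result; only the shape of the work differs. In CPython the thread pool mainly illustrates the structure of independent column tasks, since CPU-bound functions will not run truly in parallel under the global interpreter lock.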
