A Practical Approach To Merging Multidimensional Data Models.

Abstract

Schema merging is the process of incorporating data models into an integrated, consistent schema from which query solutions satisfying all incorporated models can be derived. The efficiency of such a process relies on an effective semantic representation of the chosen data models, as well as on the mapping relationships between the elements of the source data models.

Consider a scenario where, as a result of company mergers or acquisitions, a number of related but possibly disparate data marts need to be integrated into a global data warehouse. The ability to retrieve data across these disparate, but related, data marts poses an important challenge. Intuitively, forming an all-inclusive data warehouse involves the tedious tasks of identifying related fact and dimension table attributes, as well as designing a schema merge algorithm for the integration. Additionally, evaluating the combined set of correct answers to queries that are likely to be posed independently to such data marts becomes difficult.

Model management refers to a high-level, abstract programming language designed to efficiently manipulate schemas and mappings. In particular, model management operations such as match, compose mappings, apply functions, and merge offer a way to handle the above-mentioned data integration problem within the domain of data warehousing.

In this research, we introduce a methodology, based on model management, for the integration of star schema source data marts into a single consolidated data warehouse. The methodology consists of three (3) main streamlined steps to facilitate the generation of a global data warehouse. We adopt techniques for deriving attribute correspondences and for schema mapping discovery. Finally, we formulate and design a merge algorithm based on multidimensional star schemas, which is the core contribution of this research. Our approach focuses on delivering a polynomial time solution needed for the expected volume of data and its associated large-scale query processing.

The experimental evaluation shows that an integrated schema, alongside instance data, can be derived based on the type of mappings adopted in the mapping discovery step. The adoption of Global-And-Local-As-View (GLAV) mapping models delivered a maximally-contained or exact representation of all fact and dimensional instance data tuples needed in query processing on the integrated data warehouse. Additionally, different forms of conflicts, such as semantic conflicts for related or unrelated dimension entities and descriptive conflicts for differing attribute data types, were encountered and resolved in the developed solution. Finally, this research has highlighted some critical and inherent issues regarding functional dependencies in mapping models, integrity constraints at the source data marts, and multi-valued dimension attributes. These issues were encountered during the integration of the source data marts, as was also the case when evaluating the queries processed on the merged data warehouse against those on the independent data marts.
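
The merge step described in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm: it is a toy schema-level merge in Python, assuming the attribute correspondences are already known and supplied by hand. The names `Table`, `StarSchema`, `rename`, `merge_tables`, `merge_star_schemas`, and the `TYPE_RANK` widening order are all hypothetical, as are the example data marts. It shows only how corresponding fact and dimension tables might be unified and how a data type conflict could be settled by widening.

```python
from dataclasses import dataclass

# Hypothetical type-widening order used to settle data type conflicts.
TYPE_RANK = {"int": 0, "decimal": 1, "varchar": 2}


@dataclass
class Table:
    name: str
    attributes: dict  # attribute name -> data type


@dataclass
class StarSchema:
    fact: Table
    dimensions: dict  # dimension name -> Table


def rename(table, table_map, attr_map):
    """Rewrite a table into the target vocabulary using the correspondences."""
    new_name = table_map.get(table.name, table.name)
    attrs = {
        attr_map.get((table.name, attr), attr): dtype
        for attr, dtype in table.attributes.items()
    }
    return Table(new_name, attrs)


def merge_tables(left, right):
    """Union the attributes of two corresponding tables, widening on conflict."""
    merged = dict(left.attributes)
    for attr, dtype in right.attributes.items():
        if attr in merged and merged[attr] != dtype:
            # Differing data types: keep the wider of the two.
            merged[attr] = max(merged[attr], dtype, key=TYPE_RANK.__getitem__)
        else:
            merged.setdefault(attr, dtype)
    return Table(left.name, merged)


def merge_star_schemas(a, b, table_map, attr_map):
    """Merge schema b into schema a; each table of b is visited once."""
    fact = merge_tables(a.fact, rename(b.fact, table_map, attr_map))
    dims = dict(a.dimensions)
    for dim in b.dimensions.values():
        renamed = rename(dim, table_map, attr_map)
        if renamed.name in dims:
            dims[renamed.name] = merge_tables(dims[renamed.name], renamed)
        else:
            dims[renamed.name] = renamed
    return StarSchema(fact, dims)


# Two made-up data marts that describe the same business with different names.
mart_a = StarSchema(
    fact=Table("Sales", {"customer_id": "int", "amount": "decimal"}),
    dimensions={
        "Customer": Table("Customer", {"customer_id": "int", "name": "varchar"}),
    },
)
mart_b = StarSchema(
    fact=Table("Orders", {"client_no": "varchar", "amount": "decimal", "quantity": "int"}),
    dimensions={
        "Client": Table("Client", {"client_no": "varchar", "region": "varchar"}),
        "Product": Table("Product", {"product_id": "int", "label": "varchar"}),
    },
)

# Hand-written correspondences from mart_b's vocabulary to mart_a's.
table_map = {"Orders": "Sales", "Client": "Customer"}
attr_map = {
    ("Orders", "client_no"): "customer_id",
    ("Client", "client_no"): "customer_id",
}

merged = merge_star_schemas(mart_a, mart_b, table_map, attr_map)
print(merged.fact.attributes)
print(sorted(merged.dimensions))
```

In this sketch the `Client` dimension is folded into `Customer`, the unmatched `Product` dimension is carried over unchanged, and the conflicting `customer_id` types are widened to `varchar`. Instance data, GLAV mappings, and integrity constraints, which the abstract identifies as the harder parts of the problem, are left out entirely.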

Bibliographic record

  • Author

    Mireku Kwakye, Michael

  • Author affiliation

    University of Ottawa (Canada)

  • Degree-granting institution: University of Ottawa (Canada)
  • Discipline: Computer Science
  • Degree: M.Sc.
  • Year: 2011
  • Pages: 162 p.
  • Total pages: 162
  • Original format: PDF
  • Language of text: English