International Journal of Data Warehousing and Mining

Referential Horizontal Partitioning Selection Problem in Data Warehouses: Hardness Study and Selection Algorithms



Abstract

Horizontal partitioning has been widely adopted by the database community and plays a significant part in the physical design process. It is supported by most commercial database management systems (DBMS), which offer a native Data Definition Language for decomposing tables and materialized views using various modes. In traditional databases, horizontal partitioning has been studied extensively, and several fragmentation algorithms have been proposed to partition tables in isolation. In a relational data warehouse, horizontal partitioning consists of decomposing the whole warehouse schema into sub-schemas, each containing fragments of the dimension and fact tables. Dimension tables are fragmented using the primary partitioning mode, whereas the fact table is divided using the referential mode. In this article, the authors first review the evolution of horizontal partitioning in commercial DBMS, motivated by decision support applications. Second, they formalize the referential fragmentation schema selection problem in the data warehouse and study the hardness of selecting an optimal solution. Because of this high complexity, they develop two algorithms, hill climbing and simulated annealing, with several variants, to select a near-optimal partitioning schema. Finally, extensive experiments are conducted on the APB1 benchmark data set to compare the quality of the proposed algorithms using a mathematical cost model. Based on these experiments, recommendations are given to help database administrators use horizontal partitioning effectively.
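The distinction between the primary and referential modes can be illustrated with a minimal Python sketch. It is not the authors' algorithm or any DBMS's implementation: the toy tables `customers` and `sales`, the column names, and the helper functions `primary_partition` and `referential_partition` are all hypothetical, chosen only to show a dimension table being split on one of its own attributes and the fact table inheriting that split through its foreign key.

```python
from collections import defaultdict

# Hypothetical toy star schema: one dimension table and one fact table.
customers = [
    {"cust_id": 1, "city": "Paris"},
    {"cust_id": 2, "city": "Lyon"},
    {"cust_id": 3, "city": "Paris"},
]
sales = [
    {"sale_id": 10, "cust_id": 1, "amount": 50},
    {"sale_id": 11, "cust_id": 2, "amount": 20},
    {"sale_id": 12, "cust_id": 3, "amount": 70},
    {"sale_id": 13, "cust_id": 1, "amount": 5},
]


def primary_partition(rows, attribute):
    """Primary mode: split a table by the value of one of its own attributes."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[attribute]].append(row)
    return dict(fragments)


def referential_partition(fact_rows, dim_fragments, key):
    """Referential mode: each fact row goes to the fragment of the
    dimension row it references through the foreign key `key`."""
    # Map every dimension key value to the name of the fragment holding it.
    owner = {
        row[key]: name
        for name, rows in dim_fragments.items()
        for row in rows
    }
    fragments = {name: [] for name in dim_fragments}
    for row in fact_rows:
        fragments[owner[row[key]]].append(row)
    return fragments


dim_fragments = primary_partition(customers, "city")
fact_fragments = referential_partition(sales, dim_fragments, "cust_id")

for name in sorted(fact_fragments):
    print(name, [r["sale_id"] for r in fact_fragments[name]])
```

In this sketch each resulting sub-schema pairs one dimension fragment with the fact rows that reference it, so a query restricted to one city only needs to touch the matching fact fragment. Choosing which dimension attributes to split on, and into how many fragments, is the selection problem the abstract refers to.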
