Journal: International Journal of Engineering Science and Technology

A COMBINED ALGORITHM FOR DATA WAREHOUSE FRAGMENTATION SELECTION

Abstract

Data warehouses are designed to handle the queries required to discover trends and critical factors in Online Analytical Processing (OLAP) systems. Such systems are composed of multiple dimension tables and fact tables, typically organized as a star schema. Queries running on such systems contain a large number of costly joins, selections, and aggregations, so advanced optimization techniques are needed to optimize them. Data partitioning, which has been studied in the context of data warehouses, aims to reduce query execution time and to facilitate the parallel execution of these queries. Horizontal partitioning is one important form of data partitioning. It is a divide-and-conquer approach that improves query performance, operational scalability, and the management of ever-increasing amounts of data. It improves query performance through a pruning mechanism that reduces the amount of data retrieved from disk. Because the horizontal partitioning approach considers several dimension tables involved in the queries, the number of fact fragments it generates can be very large, and it becomes difficult for the data warehouse administrator to maintain all of them. Hence it is necessary to select an optimal set of fragments that remains manageable in the underlying database. In this paper we propose a combined hill climbing and genetic algorithm to improve fragmentation selection for the horizontal partitioning approach. Our experimental results show that our method provides a significantly better solution than existing fragmentation selection techniques in terms of minimizing query processing time.
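
The abstract does not say how hill climbing and the genetic algorithm are combined, so the following is only a minimal sketch of one common arrangement, in which each offspring of the genetic algorithm is refined by hill climbing. Everything in it is an assumption made for illustration and is not taken from the paper: the toy workload in `QUERIES`, the per-attribute limits in `MAX_PARTS`, the fragment threshold `MAX_FRAGMENTS`, and the cost model, which assumes uniformly distributed data and that each predicate touches exactly one sub-domain of an attribute.

```python
import math
import random

# Hypothetical toy setting (not from the paper).
MAX_PARTS = [4, 6, 3, 5]      # max sub-domains allowed per dimension attribute
QUERIES = [((0, 1), 10), ((1, 2), 5), ((0, 3), 8), ((2,), 2), ((1, 3), 6)]
MAX_FRAGMENTS = 40            # fragment limit chosen by the administrator
FACT_ROWS = 1_000_000

def fragments(scheme):
    """Number of fact fragments: product of sub-domain counts per attribute."""
    return math.prod(scheme)

def cost(scheme):
    """Toy cost: fact rows scanned over the workload after pruning.
    Schemes that exceed the fragment limit are treated as infeasible."""
    if fragments(scheme) > MAX_FRAGMENTS:
        return math.inf
    total = 0.0
    for attrs, freq in QUERIES:
        fraction = 1.0
        for a in attrs:
            fraction /= scheme[a]   # one sub-domain out of scheme[a] is read
        total += freq * fraction * FACT_ROWS
    return total

def neighbours(scheme):
    """Schemes that differ by +/-1 sub-domain on a single attribute."""
    for i, k in enumerate(scheme):
        for d in (-1, 1):
            if 1 <= k + d <= MAX_PARTS[i]:
                yield scheme[:i] + (k + d,) + scheme[i + 1:]

def hill_climb(scheme):
    """Move to the best neighbour until no neighbour lowers the cost."""
    current = scheme
    while True:
        best = min(neighbours(current), key=cost)
        if cost(best) >= cost(current):
            return current
        current = best

def crossover(a, b, rng):
    """Single-point crossover of two schemes."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(scheme, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return tuple(rng.randint(1, m) if rng.random() < rate else k
                 for k, m in zip(scheme, MAX_PARTS))

def combined_search(generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    # The unpartitioned scheme is always feasible, so seed the population with it.
    population = [tuple(1 for _ in MAX_PARTS)]
    population += [tuple(rng.randint(1, m) for m in MAX_PARTS)
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=cost)
        parents = population[: pop_size // 2]          # elitist selection
        children = [mutate(crossover(rng.choice(parents),
                                     rng.choice(parents), rng), rng)
                    for _ in range(pop_size - len(parents))]
        # Hill climbing refines every offspring before it joins the population.
        population = parents + [hill_climb(c) for c in children]
    return min(population, key=cost)
```

Calling `best = combined_search()` returns a tuple giving the number of sub-domains per attribute; `fragments(best)` and `cost(best)` report the resulting fragment count and the toy workload cost. A real implementation would replace `cost` with an I/O cost model calibrated for the target database.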
