Conference: International Conference on Information and Knowledge Engineering
Designing Genetic Algorithms for Materialized View and Index Selection

Abstract

Organizational decision-making involves accessing and integrating data that resides in various autonomous, localized databases. Our approach to providing integrated access to multiple databases is the data warehousing approach: we assume data is extracted from different sources in advance, integrated, and stored at a centralized location to answer queries posed to support decision-making. The stored data sets are materialized relational views, and auxiliary data structures such as indexes can be built on these materialized views to speed up data retrieval. Since the amount of data available in source databases can be much larger than the available storage space in the warehouse, not all source data can be materialized. Materialized views are built on frequently posed queries and any other data is communicated from the sources when required. Identifying the data to be materialized as views and the indexes to be built on these views is a leading research issue in data warehousing. Due to the large search space for this problem, we explore the use of genetic algorithms (GAs) to select materialized views and indexes in a data warehouse. We minimize query response time for a given workload while also considering a limit on storage space. In this paper, we discuss the design of a genetic algorithm including creating the initial solution space, encoding the problem as a chromosome, generating new populations, and evaluating the fitness of a chromosome. We illustrate the approach for a relational data warehouse since it is well-known and widely used.
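
The four design steps named in the abstract (chromosome encoding, initial population, generating new populations, fitness evaluation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the candidate views and indexes, their sizes, the workload costs, the penalty weight, and the specific operators (tournament selection, single-point crossover, bit-flip mutation, elitism). A chromosome is taken to be a bit list with one bit per candidate structure, and an index is only allowed when its parent view is also selected.

```python
import random

# Hypothetical candidates: (name, size, parent view position or None for a view).
CANDIDATES = [
    ("v_sales_by_region", 40, None),
    ("v_sales_by_month", 30, None),
    ("idx_region_on_v0", 10, 0),
    ("idx_month_on_v1", 8, 1),
]
# Hypothetical workload: (frequency, cost when answered from the sources,
# {candidate position: cost when that structure is used}).
WORKLOAD = [
    (50, 100.0, {0: 20.0, 2: 5.0}),
    (30, 80.0, {1: 25.0, 3: 6.0}),
    (20, 60.0, {0: 30.0}),
]
STORAGE_LIMIT = 60  # assumed warehouse storage budget


def repair(chrom):
    """Clear any index bit whose parent view is not selected."""
    return [bit if parent is None or chrom[parent] else 0
            for bit, (_, _, parent) in zip(chrom, CANDIDATES)]


def storage(chrom):
    """Total size of the selected views and indexes."""
    return sum(size for bit, (_, size, _) in zip(chrom, CANDIDATES) if bit)


def workload_cost(chrom):
    """Frequency-weighted cost, using the cheapest selected structure per query."""
    total = 0.0
    for freq, base, costs in WORKLOAD:
        best = min([base] + [c for i, c in costs.items() if chrom[i]])
        total += freq * best
    return total


def fitness(chrom):
    """Lower is better: workload cost plus a penalty for exceeding storage."""
    overflow = max(0, storage(chrom) - STORAGE_LIMIT)
    return workload_cost(chrom) + 1000.0 * overflow


def tournament(pop, rng, k=2):
    """Pick the fitter of k randomly sampled chromosomes."""
    return min(rng.sample(pop, k), key=fitness)


def evolve(pop_size=20, generations=40, p_mut=0.1, seed=0):
    rng = random.Random(seed)
    n = len(CANDIDATES)
    # Initial population: the empty selection plus random repaired chromosomes.
    pop = [[0] * n] + [repair([rng.randint(0, 1) for _ in range(n)])
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        nxt = [min(pop, key=fitness)]  # elitism: carry the best one forward
        while len(nxt) < pop_size:
            a, b = tournament(pop, rng), tournament(pop, rng)
            cut = rng.randint(1, n - 1)          # single-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(repair(child))
        pop = nxt
    return min(pop, key=fitness)


if __name__ == "__main__":
    best = evolve()
    chosen = [name for bit, (name, _, _) in zip(best, CANDIDATES) if bit]
    print(chosen, storage(best), workload_cost(best))
```

Handling the storage limit as a penalty term keeps infeasible chromosomes in the population but ranks them below every feasible one in this toy instance. Seeding the population with the empty selection, combined with elitism, means the returned chromosome is never worse than materializing nothing. Other constraint-handling strategies, such as rejecting or repairing oversized selections, are equally plausible choices.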