Journal: 《计算机应用与软件》 (Computer Applications and Software)

Distributed Parallel Multi-Join Query Optimization Based on the Information Network Model

Abstract

In a distributed cluster system, data is stored across the cluster's nodes according to a partitioning algorithm, which imposes expensive network overhead on complex queries that involve many join operations. To address this problem, this paper proposes two algorithms based on the Information Network Model (INM): the Minimum Traffic Query Split algorithm (MTQS) and the Multi-Objective Query Optimization algorithm (MOQO). MTQS splits a complex query into multiple PWOC (parallelizable without communication) sub-queries, all of which can be executed in parallel with approximately no communication. MOQO treats the sub-queries as the basic operations of a query plan, takes both parallelism and communication cost as its driving objectives, and generates the query plan tree using a traditional multi-objective weighted algorithm combined with a greedy strategy as the evaluation criterion. Finally, the system generates test data from the TPC-H benchmark and compares the original algorithm with the optimized one experimentally; the results show that the optimized algorithm greatly improves the efficiency of complex queries.
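The abstract does not give the details of MOQO, so the following is only a minimal sketch of the general idea it names: a greedy plan builder that scores each candidate join by a weighted combination of communication cost and parallelism. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Plan` class, the cost model (a join ships the smaller input over the network), the use of tree height as a stand-in for parallelism, the weighted-sum form of the score, and the `reduction` factor used to estimate join result sizes.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Plan:
    """A query plan tree node; leaves stand for PWOC sub-queries."""
    name: str
    rows: int       # estimated result size (hypothetical estimate)
    depth: int = 0  # tree height; a lower height is used as a proxy for parallelism
    comm: int = 0   # accumulated communication cost of the subtree


def join(a, b, reduction):
    """Join two plans under an assumed cost model (not the paper's)."""
    shipped = min(a.rows, b.rows)                  # ship the smaller input
    rows = max(1, a.rows * b.rows // reduction)    # crude size estimate
    return Plan(f"({a.name} JOIN {b.name})", rows,
                max(a.depth, b.depth) + 1,
                a.comm + b.comm + shipped)


def score(plan, w_comm, w_depth):
    """Weighted sum of the two objectives; lower is better."""
    return w_comm * plan.comm + w_depth * plan.depth


def greedy_plan(leaves, reduction=1000, w_comm=1.0, w_depth=1.0):
    """Repeatedly merge the pair of subtrees whose join scores lowest."""
    forest = list(leaves)
    while len(forest) > 1:
        a, b = min(combinations(forest, 2),
                   key=lambda pair: score(join(*pair, reduction), w_comm, w_depth))
        forest.remove(a)
        forest.remove(b)
        forest.append(join(a, b, reduction))
    return forest[0]


def demo_leaves():
    """Four hypothetical sub-queries: one small, three large."""
    return [Plan("A", 10), Plan("B", 1_000_000),
            Plan("C", 1_000_000), Plan("D", 1_000_000)]
```

In this sketch, setting `w_depth=0` optimizes communication alone and yields a left-deep tree that keeps joining the small intermediate result, while setting `w_comm=0` optimizes height alone and yields a bushy tree whose two halves could run in parallel. Intermediate weights trade one objective against the other, and in practice the two terms would need to be normalized to comparable scales.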
