Adaptive Parallel Query Processing

Abstract

The wide availability of clusters of low-cost personal computers (PCs) makes it possible to harness their raw computing power for computationally intensive tasks. In particular, we are interested in using PC clusters to parallelize query processing for data integration systems. In the database literature, most parallel query processing techniques take a coarse-grained approach to processing queries on multiple processors. Data is typically partitioned across the processors, and the operators on each processor work on a subset of the data. Adaptiveness has been achieved primarily through static and dynamic load-balancing algorithms applied at run time. In addition, depending on the partitioning technique used, data skew may occur, in which all the data is placed in one partition, or execution skew may occur, in which all processing takes place on a single processor. In a wide-area environment, these traditional parallel query processing techniques are not effective, because they often do not account for the fluctuations typical of such an environment. Our main contribution is a Java implementation of a fine-grained adaptive parallel query processing mechanism that adapts to these fluctuations in the query environment. We further propose a new scheduling technique, called Tuple RTT scheduling, which adapts to these run-time fluctuations and balances load among the participating processors. Our initial implementation and performance study of the proposed scheduling technique indicate promising results.
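The abstract does not spell out how Tuple RTT scheduling works, so the following is only a minimal sketch of the general idea of routing tuples according to observed round-trip times, not the authors' algorithm. It assumes that each worker's round-trip time is tracked as an exponentially weighted moving average and that the next tuple goes to the worker with the lowest current estimate. The class name `RttSchedulerSketch`, its methods, and the smoothing factor `alpha` are all hypothetical.

```java
import java.util.Arrays;

/**
 * Illustrative sketch only (hypothetical names, not the paper's algorithm):
 * sends each tuple to the worker whose smoothed round-trip time is lowest.
 */
final class RttSchedulerSketch {
    private final double[] smoothedRttMillis; // one running estimate per worker
    private final double alpha;               // weight given to the newest sample

    RttSchedulerSketch(int workers, double initialRttMillis, double alpha) {
        if (workers <= 0) {
            throw new IllegalArgumentException("need at least one worker");
        }
        if (alpha <= 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.smoothedRttMillis = new double[workers];
        Arrays.fill(this.smoothedRttMillis, initialRttMillis);
        this.alpha = alpha;
    }

    /** Folds a measured round trip into that worker's moving average. */
    void recordRoundTrip(int worker, double rttMillis) {
        smoothedRttMillis[worker] =
                (1.0 - alpha) * smoothedRttMillis[worker] + alpha * rttMillis;
    }

    /** Returns the worker with the lowest estimate; ties go to the lowest index. */
    int pickWorker() {
        int best = 0;
        for (int i = 1; i < smoothedRttMillis.length; i++) {
            if (smoothedRttMillis[i] < smoothedRttMillis[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Current smoothed estimate for one worker, in milliseconds. */
    double estimate(int worker) {
        return smoothedRttMillis[worker];
    }

    public static void main(String[] args) {
        // Assumed, fixed per-worker latencies standing in for real measurements.
        double[] simulatedRttMillis = {40.0, 10.0, 25.0};
        RttSchedulerSketch scheduler = new RttSchedulerSketch(3, 20.0, 0.5);
        int[] tuplesSent = new int[3];

        for (int tuple = 0; tuple < 12; tuple++) {
            int worker = scheduler.pickWorker();
            tuplesSent[worker]++;
            // In a real system this sample would come from timing the tuple's round trip.
            scheduler.recordRoundTrip(worker, simulatedRttMillis[worker]);
        }
        System.out.println("tuples per worker: " + Arrays.toString(tuplesSent));
    }
}
```

Because every tuple dispatch yields a fresh measurement, a worker that slows down (through network congestion or local load) sees its estimate rise and receives fewer tuples, which is one way a fine-grained, per-tuple scheduler could balance load without a fixed partitioning of the data.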
