Published in: IEEE Transactions on Knowledge and Data Engineering

New algorithms for parallelizing relational database joins in the presence of data skew

Abstract

Parallel processing is an attractive option for relational database systems. As in any parallel environment, however, load balancing is a critical issue that affects overall performance. Load balancing for one common database operation in particular, the join of two relations, can be severely hampered in conventional parallel algorithms by a natural phenomenon known as data skew. In a pair of recent papers (J. Wolf et al., 1993; 1993), we described two new join algorithms designed to address the data skew problem. We propose significant improvements to both algorithms, increasing their effectiveness while simultaneously decreasing their execution times. The paper then focuses on the comparative performance of the improved algorithms and their more conventional counterparts. The new algorithms outperform their more conventional counterparts in the presence of just about any skew at all, and dramatically so in cases of high skew.
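To make the load-balancing problem concrete, the following is a minimal sketch of how data skew can unbalance a conventional key-partitioned parallel join. It is not the algorithm from the paper. The function names, the `key % num_procs` placement rule, and the cost model (work per key approximated by the number of output tuples that key produces) are all assumptions chosen for illustration.

```python
from collections import Counter


def partition_loads(r_keys, s_keys, num_procs):
    """Estimate per-processor join work under simple key-based partitioning.

    Assumption: each integer key is sent to processor `key % num_procs`, and
    the work for a key is the number of output tuples it generates, i.e.
    (occurrences in R) * (occurrences in S).
    """
    r_counts = Counter(r_keys)
    s_counts = Counter(s_keys)
    loads = [0] * num_procs
    for key, r_n in r_counts.items():
        loads[key % num_procs] += r_n * s_counts.get(key, 0)
    return loads


def imbalance(loads):
    """Ratio of the busiest processor's load to the average load (1.0 is ideal)."""
    return max(loads) / (sum(loads) / len(loads))


# Uniform case: keys 0..7 each appear 4 times in both relations (32 tuples).
uniform = [k for k in range(8) for _ in range(4)]

# Skewed case: same relation size (32 tuples), but key 0 appears 25 times.
skewed = [0] * 25 + list(range(1, 8))

uniform_loads = partition_loads(uniform, uniform, 4)
skewed_loads = partition_loads(skewed, skewed, 4)

print(uniform_loads, imbalance(uniform_loads))
print(skewed_loads, imbalance(skewed_loads))
```

In the skewed case a single heavy key lands on one processor, so that processor does almost all of the work while the others sit nearly idle. This is the kind of imbalance that skew-aware join algorithms aim to avoid.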
