New algorithms for parallelizing relational database joins in the presence of data skew

Wolf J.L.; Dias D.M.


Abstract

Parallel processing is an attractive option for relational database systems. As in any parallel environment, however, load balancing is a critical issue that affects overall performance. For one common database operation in particular, the join of two relations, load balancing under conventional parallel algorithms can be severely hampered by a natural phenomenon known as data skew. In a pair of recent papers (J. Wolf et al., 1993; 1993), we described two new join algorithms designed to address the data skew problem. We propose significant improvements to both algorithms, increasing their effectiveness while simultaneously decreasing their execution times. The paper then focuses on the comparative performance of the improved algorithms and their more conventional counterparts. The new algorithms outperform their more conventional counterparts in the presence of just about any skew at all, dramatically so in cases of high skew.
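
To make the load-balancing problem concrete, the following minimal sketch shows how a single heavily repeated join key can unbalance a conventional hash-partitioned parallel join. It is not the authors' algorithm. The function names, the integer keys, the worker count of 4, and the use of modulo as a stand-in for a hash function are all assumptions chosen for illustration.

```python
def partition_loads(keys, num_workers):
    """Count how many tuples each worker receives under hash partitioning.

    Keys are assumed to be non-negative integers; `key % num_workers`
    stands in for a real hash function (illustrative assumption).
    """
    loads = [0] * num_workers
    for key in keys:
        loads[key % num_workers] += 1
    return loads


def imbalance(loads):
    """Ratio of the busiest worker's load to the mean load (1.0 = balanced)."""
    return max(loads) / (sum(loads) / len(loads))


# Uniform case: 100 distinct keys, 10 tuples each (1000 tuples).
uniform_keys = [k for k in range(100) for _ in range(10)]

# Skewed case: key 7 carries 505 tuples, the other 99 keys 5 each (1000 tuples).
skewed_keys = [7] * 505 + [k for k in range(100) if k != 7 for _ in range(5)]

uniform_loads = partition_loads(uniform_keys, 4)
skewed_loads = partition_loads(skewed_keys, 4)

print(uniform_loads, imbalance(uniform_loads))
print(skewed_loads, imbalance(skewed_loads))
```

Because a parallel join finishes only when its slowest worker finishes, the imbalance ratio gives a rough sense of how much longer the skewed case would run than a perfectly balanced one on the same number of tuples.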


Bibliographic record

Source: IEEE Transactions on Knowledge and Data Engineering, 1994, No. 6, pp. 990-997 (8 pages)
Authors: Wolf J.L.; Dias D.M.
Original format: PDF
Language: English
CLC classification: Computing technology, computer technology

Similar literature

1. Parallel algorithms for the execution of relational database operations revisited on grids [J]. Werner Mach, Erich Schikuta. International Journal of High Performance Computing Applications, 2009, No. 2.
2. A parallel sort merge join algorithm for managing data skew [J]. Wolf J.L., Dias D.M. IEEE Transactions on Parallel and Distributed Systems, 1993, No. 1.
3. A parallel hash join algorithm for managing data skew [J]. Wolf J.L., Yu P.S. IEEE Transactions on Parallel and Distributed Systems, 1993, No. 12.
4. An effective algorithm for parallelizing hash joins in the presence of data skew [C]. Wolf, J.L., Dias, . 1991.
5. Fast parallel algorithms on a class of graph structures with applications in relational databases and computer networks [D]. Radhakrishnan, Sridhar. 1990.
6. Efficient serial and parallel algorithms for selection of unique oligos in EST databases [O]. Manrique Mata-Montero, Nabil Shalaby, Bradley Sheppard. 2013.
7. Skew-insensitive parallel algorithms for relational join [O]. AlSabti, Khaled (author); Ranka, Sanjay (UF author). Department of Computer Science, College of Computer & Information Sciences, King Saud University, P.O. Box 51178, Riyadh 11543, Saudi Arabia (host institution). 2001.
8. Multiprocessor sort-merge join algorithm for relational databases [R]. Thompson, W. C., Ries, D. R. 1981.
