ADAPTIVE HANDLING OF SKEW FOR DISTRIBUTED JOINS IN A CLUSTER

Abstract

Techniques are disclosed for detecting data skew while performing a distributed join operation on tables in a cluster of nodes managed by a database management system (cDBMS). In an embodiment, heavy hitter values in a join column of a table are determined at runtime, while the table is being joined with another table in a distributed join operation. The cDBMS keeps, in a datastore, a count for each unique value read from the join column of the table. The datastore may be a hash table with the unique values serving as keys, and may additionally include a heap or a sorted array for efficient count-based traversal. When the count for a particular value in the datastore exceeds a threshold, that value is identified as a heavy hitter value. The tuples of the table that include the heavy hitter value are kept local at the node to which they were originally distributed, while the tuples of the other joined table are broadcast to one or more nodes of the cDBMS, including at least the nodes to which the heavy hitter tuples were originally distributed.
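The counting and routing steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `HeavyHitterDetector`, `partition`, `route_skewed_tuple`, and `route_other_tuple` are hypothetical, the threshold is assumed to be a fixed count, the broadcast is assumed to target every node, and the optional heap or sorted array is left out.

```python
import zlib
from collections import Counter


class HeavyHitterDetector:
    """Counts join-column values as they are read and flags frequent ones."""

    def __init__(self, threshold):
        self.threshold = threshold   # assumed: a fixed count cutoff
        self.counts = Counter()      # hash table: join value -> count
        self.heavy = set()           # values identified as heavy hitters

    def observe(self, value):
        """Record one occurrence; return True if value is a heavy hitter."""
        self.counts[value] += 1
        if self.counts[value] > self.threshold:
            self.heavy.add(value)
        return value in self.heavy


def partition(key, num_nodes):
    """Deterministic hash partitioning (illustrative choice of hash)."""
    return zlib.crc32(repr(key).encode()) % num_nodes


def route_skewed_tuple(detector, key, local_node, num_nodes):
    """Target node for a tuple of the table being scanned for skew."""
    if detector.observe(key):
        return local_node            # heavy hitter: keep the tuple local
    return partition(key, num_nodes)


def route_other_tuple(heavy, key, num_nodes):
    """Target nodes for a tuple of the other joined table."""
    if key in heavy:
        # Assumed here: broadcast to all nodes, which covers every node
        # that may hold local heavy hitter tuples.
        return list(range(num_nodes))
    return [partition(key, num_nodes)]
```

In this sketch a value is hash-partitioned until its count first exceeds the threshold, so its earliest tuples may already have been sent elsewhere; how a real system handles those tuples is outside what the abstract states.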

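Bibliographic data

  • Publication number: US2016267135A1

  • Publication date: 2016-09-15

  • Original format: PDF

  • Applicant/Assignee: ORACLE INTERNATIONAL CORPORATION

  • Application number: US201514871490

  • Inventors: WOLF ROEDIGER; SAM IDICULA

  • Filing date: 2015-09-30

  • Classification: G06F17/30

  • Country: US

  • Date added to database: 2022-08-21 14:38:54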
