TRAVERSING A LARGE CONNECTED COMPONENT ON A DISTRIBUTED FILE-BASED DATA STRUCTURE

Abstract

A distributed system includes multiple processing nodes and can perform certain acts. The acts can include receiving a set of input nodes and a set of criteria. The acts can include obtaining an adjacency list representing a large connected component. The large connected component can include nodes, edges, and edge metadata. The number of nodes in the large connected component can exceed 1 billion. The adjacency list can be distributed across the multiple processing nodes. The nodes of the large connected component can include the input nodes. The acts also can include performing one or more iterations of traversing the large connected component until a stopping condition is satisfied. Each iteration can include: processing the set of input nodes at the multiple processing nodes using the set of criteria to generate first data at the multiple processing nodes; determining a set of output nodes such that each output node is one hop from a respective input node of the set of input nodes; consolidating the first data from the multiple processing nodes to a first processing node of the multiple processing nodes; processing the first data at the first processing node; and, when the stopping condition is not satisfied, assigning the set of input nodes for the subsequent iteration based on the set of output nodes. The acts further can include outputting second data based on the first data received and processed at the first processing node during the one or more iterations. Other embodiments are disclosed.
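The iteration described in the abstract can be illustrated with a small single-process sketch. This is not the patented implementation: the "processing nodes" are simulated as in-memory shards, and every name here (`shard_of`, `build_shards`, `process_shard`, `traverse`, the `weight` metadata field, and the `max_hops` stopping condition) is a hypothetical choice made for illustration only.

```python
from collections import defaultdict


def shard_of(node, num_shards):
    # Hypothetical partitioning rule: a stable hash of the node name.
    return sum(node.encode("utf-8")) % num_shards


def build_shards(edges, num_shards):
    # Distribute an undirected adjacency list across simulated processing nodes.
    shards = [defaultdict(list) for _ in range(num_shards)]
    for src, dst, meta in edges:
        shards[shard_of(src, num_shards)][src].append((dst, meta))
        shards[shard_of(dst, num_shards)][dst].append((src, meta))
    return shards


def process_shard(shard, input_nodes, criteria):
    # Apply the criteria to edge metadata; emit "first data" records and
    # the output nodes that are exactly one hop from an input node.
    first_data, output_nodes = [], set()
    for node in input_nodes:
        for neighbor, meta in shard.get(node, ()):
            if criteria(meta):
                first_data.append((node, neighbor, meta))
                output_nodes.add(neighbor)
    return first_data, output_nodes


def traverse(shards, input_nodes, criteria, max_hops):
    # Assumed stopping condition: no new nodes to visit, or max_hops reached.
    visited = set(input_nodes)
    frontier = set(input_nodes)
    collected = []  # first data consolidated at one (simulated) coordinator
    hops = 0
    while frontier and hops < max_hops:
        outputs = set()
        for shard in shards:
            local_inputs = {n for n in frontier if n in shard}
            data, out = process_shard(shard, local_inputs, criteria)
            collected.extend(data)  # consolidation step
            outputs |= out
        # Output nodes not yet visited become the next iteration's inputs.
        frontier = outputs - visited
        visited |= frontier
        hops += 1
    return visited, collected


edges = [
    ("a", "b", {"weight": 1}),
    ("b", "c", {"weight": 1}),
    ("c", "d", {"weight": 0}),
]
shards = build_shards(edges, num_shards=2)
reached, records = traverse(shards, {"a"}, lambda m: m["weight"] > 0, max_hops=10)
```

In this toy run the edge from `c` to `d` fails the criteria, so `d` is never reached. In a real deployment each shard would live on a separate machine and the consolidation step would involve network transfer; the loop structure is the only part this sketch tries to mirror.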
