Venue: IEEE International Parallel and Distributed Processing Symposium Workshops

On Large-Scale Graph Generation with Validation of Diverse Triangle Statistics at Edges and Vertices


Abstract

Researchers developing implementations of distributed graph analytic algorithms require graph generators that yield graphs sharing the challenging characteristics of real-world graphs (small-world, scale-free, heavy-tailed degree distribution) with efficiently calculable ground-truth solutions to the desired output. Reproducibility for current generators [1] used in benchmarking is somewhat lacking in this respect due to their randomness: the output of a desired graph analytic can only be compared to expected values and not exact ground truth. Nonstochastic Kronecker product graphs [2] meet these design criteria for several graph analytics. Here we show that many flavors of triangle participation can be cheaply calculated while generating a Kronecker product graph. Given two medium-sized scale-free graphs with adjacency matrices $A$ and $B$, their Kronecker product graph has adjacency matrix $C = A \otimes B$. Such graphs are highly compressible: $|E|$ edges are represented in $O(|E|^{1/2})$ memory and can be built in a distributed setting from small data structures, making them easy to share in compressed form. Many interesting graph calculations have worst-case complexity bounds $O(|E|^{p})$, and often these are reduced to $O(|E|^{p/2})$ for Kronecker product graphs when a Kronecker formula can be derived yielding the sought calculation on $C$ in terms of related calculations on $A$ and $B$. We focus on deriving formulas for triangle participation at vertices, $t_C$, a vector storing the number of triangles that every vertex is involved in, and triangle participation at edges, $\Delta_C$, a sparse matrix storing the number of triangles at every edge. When factors $A$ and $B$ are undirected, $C$ is also undirected. In the case when both factors have no self loops we show $t_C = 2\, t_A \otimes t_B$ and $\Delta_C = \Delta_A \otimes \Delta_B$. Moreover, we derive the respective formulas when $A$ and $B$ have self loops, which boosts the triangle counts for the associated vertices/edges in $C$.
We additionally demonstrate strong assumptions on B that allow the truss decomposition of C to be derived cheaply from the truss decomposition of A. We extend these results and show Kronecker formulas for triangle participation in both directed graphs and undirected, vertex-labeled graphs. In these classes of graphs each vertex / edge can participate in many different types of triangles.
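
The two no-self-loop formulas quoted in the abstract can be checked numerically on a small example. The following is a minimal sketch, not the authors' implementation: it uses dense list-of-lists matrices and brute-force matrix products, and the helper names (`kron`, `matmul`, `edge_triangles`, `vertex_triangles`) and the two factor graphs are assumptions chosen for illustration. It relies on the standard identities that, for an undirected graph without self loops, the edge triangle counts are the elementwise product of $A$ and $A^2$, and the vertex counts are half the row sums of that matrix.

```python
from itertools import product


def kron(A, B):
    """Kronecker product of two square matrices stored as lists of lists."""
    nA, nB = len(A), len(B)
    C = [[0] * (nA * nB) for _ in range(nA * nB)]
    for i, j in product(range(nA), repeat=2):
        for k, l in product(range(nB), repeat=2):
            # Vertex (i, k) of the product graph gets index i * nB + k.
            C[i * nB + k][j * nB + l] = A[i][j] * B[k][l]
    return C


def matmul(X, Y):
    """Dense product of two square matrices of equal size."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def edge_triangles(A):
    """Delta = A (elementwise *) A^2: number of triangles through each edge."""
    A2 = matmul(A, A)
    n = len(A)
    return [[A[i][j] * A2[i][j] for j in range(n)] for i in range(n)]


def vertex_triangles(A):
    """t = half the row sums of Delta: number of triangles at each vertex."""
    return [sum(row) // 2 for row in edge_triangles(A)]


# Hypothetical factor A: a triangle on vertices 0, 1, 2 plus a pendant vertex 3.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
# Hypothetical factor B: the complete graph on 4 vertices.
B = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

C = kron(A, B)

tA, tB, tC = vertex_triangles(A), vertex_triangles(B), vertex_triangles(C)
dA, dB, dC = edge_triangles(A), edge_triangles(B), edge_triangles(C)

# Right-hand sides of the formulas, computed from the factors only.
tC_formula = [2 * a * b for a in tA for b in tB]  # 2 * (t_A kron t_B)
dC_formula = kron(dA, dB)                          # Delta_A kron Delta_B

print(tC == tC_formula, dC == dC_formula)
```

Here the right-hand sides are computed from the 4-vertex factors alone, while the left-hand sides require a matrix product on the 16-vertex graph. The sketch only illustrates the identities; it does not reproduce the sparse, distributed generation or the $O(|E|^{1/2})$ storage described in the abstract.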
