首页> 外文期刊>Information Sciences: An International Journal >Summarized bit batch-based triangle listing in massive graphs
【24h】

Summarized bit batch-based triangle listing in massive graphs

机译:概括基于比特批处理的大规模图中的三角形列表

获取原文
获取原文并翻译 | 示例
           

摘要

The presence of triangles in massive graphs provides many important uses in different graph algorithms, such as finding highly relevant vertices for dense subgraph mining, measuring the clustering coefficient, and computing the transitivity for network analysis. In memory algorithms cannot be used for triangle listing in massive graphs because the graphs are too large to fit into memory. External memory-based techniques address this problem by focusing on the I/O efficiency to improve performance. However, triangulation is a CPU intensive process that iteratively joins lists of neighbors to determine the adjacent vertices in each triangle. Therefore, the cost of a triangle listing algorithm on a massive graph is dominated by the join operations among the lists of neighbors. In this paper, we propose a disk-based triangle listing approach that uses an efficient technique to join the lists of neighbors by exploiting CPU parallelism through bitwise operations. We represent the lists of neighbors using bit vectors and compress them using our proposed summarized bit batch, which allows the bitwise operations to be performed directly on the compressed data. Our proposed technique slices a bit vector into a series of word length bit batches that it summarizes by pruning the bit batches that contain only 0-bits. Then our proposed approach for listing the triangles asynchronously accesses the summarized bit batches and joins them using bitwise operations. Our proposed technique achieves 40% higher compression for some real world datasets compared to the classic compression technique. The triangulation technique using the summarized bit batches also significantly outperforms the existing solutions in terms of wall clock time. (C) 2018 Elsevier Inc. All rights reserved.
机译:大规模图中的三角形的存在在不同的图形算法中提供了许多重要用途,例如查找高度相关的顶点以用于密集的子图挖掘,测量聚类系数,并计算网络分析的传递。在内存算法中,不能用于大规模图中的三角形列表,因为图形太大而无法适合内存。基于外部内存的技术通过关注I / O效率来提高性能来解决此问题。但是,三角测量是一个CPU强化进程,可迭代地加入邻居列表以确定每个三角形中的相邻顶点。因此,在大规模图形上的三角形清单算法的成本由邻居列表中的加入操作主导。在本文中,我们提出了一种基于磁盘的三角形列表方法,它通过通过按位操作利用CPU并行性来加入邻居列表。我们代表使用位向量的邻居列表,并使用我们提出的总结比特批量压缩它们,这允许直接在压缩数据上执行按位操作。我们所提出的技术将比特向量切片成一系列字长位批次,它通过修剪仅包含0位的位批次来总结。然后我们提出的方法是列出三角形的方法异步地访问总结位批处理,并使用按位操作加入它们。与经典压缩技术相比,我们所提出的技术对某些真实世界数据集进行了40%的压缩。使用总结比特批次的三角测量技术也显着优于壁钟时间现有的解决方案。 (c)2018年Elsevier Inc.保留所有权利。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号