Conference: Proceedings of the 7th International Conference on Emerging Databases: Technologies, Applications, and Theory

An Efficient Subgraph Compression-Based Technique for Reducing the I/O Cost of Join-Based Graph Mining Algorithms


Abstract

Many join-based graph mining algorithms, such as triangle listing and clique enumeration, output large volumes of intermediate or final data that sometimes dominate the mining cost. A few studies have addressed the size of the output data. However, those techniques are limited in that they are highly specific to their corresponding graph mining algorithms. In this paper, through careful observation of the output patterns, we propose a general compression solution that can be applied to any join-based graph algorithm. It first categorizes the vertices in the resultant subgraph set of a join-based graph mining algorithm as overlapping or non-overlapping. It then compresses the output data by removing the redundancy among the overlapping vertices and by encoding the non-overlapping vertices with a non-aligned hybrid bit vector compression technique. Our proposed technique performs the compression on the fly and can easily be adopted by join-based graph mining algorithms. Experiments on real datasets show that our proposed technique, when adopted in a triangle listing algorithm, reduces the size of the output data by a factor of three and the running time by a factor of more than two. The proposed technique also reduces the I/O cost of a maximal clique listing algorithm.
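The two-step idea in the abstract (store overlapping vertices once, bit-encode the non-overlapping ones) can be illustrated with a minimal sketch for triangle output. This is not the authors' implementation: the function names are hypothetical, vertex IDs are assumed to be non-negative integers, and a plain uncompressed bitmap held in a Python `int` stands in for the non-aligned hybrid bit vector encoding, whose details the abstract does not give.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Triangle = Tuple[int, int, int]
Edge = Tuple[int, int]


def compress_triangles(triangles: Iterable[Triangle]) -> Dict[Edge, int]:
    """Group triangles (u, v, w) by their shared edge (u, v).

    Assumption: (u, v) plays the role of the overlapping vertices and is
    stored once per group; w is the non-overlapping vertex. Bit w of the
    stored integer is set when triangle (u, v, w) is present. A plain
    bitmap is used here, not the paper's hybrid bit vector scheme.
    """
    groups: Dict[Edge, int] = defaultdict(int)
    for u, v, w in triangles:
        groups[(u, v)] |= 1 << w  # consume results one at a time
    return dict(groups)


def decompress_triangles(groups: Dict[Edge, int]) -> List[Triangle]:
    """Expand each (edge, bitmap) pair back into explicit triangles."""
    result: List[Triangle] = []
    for (u, v), bits in groups.items():
        w = 0
        while bits:
            if bits & 1:
                result.append((u, v, w))
            bits >>= 1
            w += 1
    return result


if __name__ == "__main__":
    listed = [(1, 2, 3), (1, 2, 4), (1, 2, 7), (2, 3, 4)]
    packed = compress_triangles(listed)
    # Three triangles sharing edge (1, 2) collapse into one entry.
    print(packed)
    print(sorted(decompress_triangles(packed)) == sorted(listed))
```

In this sketch the savings come from writing the shared edge once per group instead of once per triangle; a real encoder would additionally compress long runs of zero bits in each bitmap.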
