Conference: IEEE International Conference on High Performance Computing

Distributed Algorithm for High-Utility Subgraph Pattern Mining Over Big Data Platforms

Abstract

Frequent subgraph pattern mining (FSM) finds subgraph patterns whose frequency in a graph database exceeds a given threshold. In FSM, the notion of occurrence captures the presence or absence of a node or an edge in a binary fashion and treats every edge and node as equally relevant. However, edges and nodes may have different relevance scores, so the utility of a pattern should be defined using the relevance scores of its participating edges or nodes. This paper defines the utility of a pattern using this idea and presents algorithms to mine high-utility patterns from a given graph database. A significant issue in high-utility pattern mining is that, unlike in FSM, the anti-monotonic property no longer holds, so pruning the search space becomes a daunting task. To address this issue, we incorporate a function that estimates an upper bound on the utility of a pattern object and that satisfies the anti-monotonic property. This paper presents three optimization heuristics for the solution on a distributed platform: a novel use of a Bloom filter to avoid exploring non-candidates, avoiding sending database information with each pattern, and avoiding sending pattern embeddings with each pattern. An experimental study on Apache Spark shows the effectiveness of the proposed optimization strategies.
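The pruning issue described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the utility or upper-bound definitions, so the sketch assumes a simplified model in which each graph is a set of uniquely labeled edges with a relevance score, a pattern is a set of edge labels (no subgraph isomorphism test), and the upper bound is the summed total score of every graph containing the pattern, borrowed from transaction-weighted utility in high-utility itemset mining. The names `DB`, `utility`, `upper_bound`, and `mine` are hypothetical.

```python
# Toy database (hypothetical data): each graph maps an edge label to a relevance score.
DB = [
    {"ab": 5, "bc": 1, "cd": 1},
    {"ab": 1, "bc": 1},
    {"ab": 4, "cd": 2},
]


def utility(pattern, db=DB):
    """Sum of the pattern's edge scores over every graph that contains the pattern."""
    return sum(sum(g[e] for e in pattern) for g in db if pattern.issubset(g))


def upper_bound(pattern, db=DB):
    """Sum of the total scores of every graph that contains the pattern.

    Extending a pattern can only shrink the set of graphs that contain it,
    so this value never increases as the pattern grows (anti-monotonic),
    and it is always at least the pattern's utility.
    """
    return sum(sum(g.values()) for g in db if pattern.issubset(g))


def mine(min_utility, db=DB):
    """Level-wise search: prune by upper_bound, then report by actual utility."""
    edges = sorted({e for g in db for e in g})
    level = [frozenset([e]) for e in edges]
    result = {}
    while level:
        # A pattern whose upper bound is below the threshold cannot have
        # a high-utility extension, so it is dropped together with its supersets.
        survivors = [p for p in level if upper_bound(p, db) >= min_utility]
        for p in survivors:
            u = utility(p, db)
            if u >= min_utility:
                result[p] = u
        # Grow each surviving pattern by one edge; the set removes duplicates.
        level = sorted(
            {p | {e} for p in survivors for e in edges if e not in p},
            key=sorted,
        )
    return result


if __name__ == "__main__":
    # Utility itself is not anti-monotonic: {cd} scores 3, but {ab, cd} scores 12.
    print(utility(frozenset({"cd"})), utility(frozenset({"ab", "cd"})))
    print(mine(10))
```

With a threshold of 10, the pattern `{cd}` has utility 3 yet must be kept as a candidate because its upper bound is 13, and its extension `{ab, cd}` turns out to be high-utility. The pattern `{bc}` has an upper bound of 9, so it and all of its extensions are pruned without computing their utilities.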
