Conference: International Conference on Data Mining

gApprox: Mining Frequent Approximate Patterns from a Massive Network

Abstract

Recently, a large number of graphs with massive sizes and complex structures have arisen in many new applications, such as biological networks, social networks, and the Web, demanding powerful data mining methods. Because of inherent noise or data diversity, it is crucial to address the issue of approximation if one wants to mine patterns that are potentially interesting with tolerable variations. In this paper, we investigate the problem of mining frequent approximate patterns from a massive network and propose a method called gApprox. gApprox not only finds approximate network patterns, which is the key for many knowledge discovery applications on structural data, but also enriches the library of graph mining methodologies by introducing several novel techniques, such as (1) a complete and redundancy-free strategy to explore the new pattern space faced by gApprox; and (2) a transformation of "frequent in an approximate sense" into an anti-monotonic constraint so that it can be pushed deep into the mining process. Systematic empirical studies on both real and synthetic data sets show that frequent approximate patterns mined from the worm protein-protein interaction network are biologically interesting and that gApprox is both effective and efficient.
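
To illustrate why an anti-monotonic constraint can be "pushed deep into the mining process", here is a minimal Python sketch of level-wise mining with anti-monotone pruning. It is not the gApprox algorithm: it uses item sets as a toy stand-in for graph patterns and exact support counting instead of approximate support, and the names `frequent_itemsets` and `min_support` are hypothetical. The idea it shows is general: if a pattern fails the frequency test, no larger pattern containing it can pass, so such candidates can be discarded before their support is ever counted.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise mining with anti-monotone pruning.

    Toy stand-in: patterns are item sets, not graphs, and support is
    counted exactly rather than in an approximate sense.
    """
    transactions = [frozenset(t) for t in transactions]

    def support(pattern):
        # Number of transactions that contain every item of the pattern.
        return sum(1 for t in transactions if pattern <= t)

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support]
    result = {p: support(p) for p in current}

    while current:
        size = len(current[0]) + 1
        frequent = set(current)
        # Grow candidates by joining frequent patterns of the previous level.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == size}
        # Anti-monotone pruning: a candidate is kept only if every
        # sub-pattern one item smaller is already known to be frequent.
        candidates = [c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, size - 1))]
        current = [c for c in candidates if support(c) >= min_support]
        result.update({c: support(c) for c in current})

    return result
```

Calling `frequent_itemsets(transactions, 3)` on a small list of sets returns a dictionary that maps each frequent item set to its support. The pruning step is what the constraint buys: a candidate with an infrequent sub-pattern is never counted against the data.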