Conference: IEEE International Conference on Machine Learning and Applications

Modeling Over-Dispersion for Network Data Clustering

Abstract

Over-dispersed network data mining has emerged as a central theme in data science, as evidenced by a sharp increase in the volume of real-world network data with imbalanced clusters. While most existing clustering methods are designed to discover the number of clusters and class-specific connectivity patterns, few can uncover imbalanced clusters, which are common in network communities and image segments, from network data with an over-dispersed cluster size distribution. Such over-dispersion is considered an intrinsic structural property of the network data. In this paper, we propose a generalized probabilistic modeling framework, SizeConnectivity, to estimate the over-dispersed cluster size distribution together with class-specific connectivity patterns from network data. Our method can accurately capture a wide range of cluster size distributions revealed by real-world network data. We performed extensive synthetic and real-world experiments on clustering social network data and image data to detect network communities and image segments. Our results demonstrate the superior performance of our SizeConnectivity clustering method in recovering the hidden structure of network data via modeling over-dispersion.
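
To make the notion of an over-dispersed cluster size distribution concrete, the sketch below computes a simple variance-to-mean ratio (index of dispersion) over cluster sizes. This is only an illustrative descriptive statistic, not the SizeConnectivity model, whose details the abstract does not give. The function names `cluster_sizes` and `dispersion_index` and the two toy label assignments are assumptions made for this example.

```python
from collections import Counter
from statistics import mean, pvariance


def cluster_sizes(labels):
    """Return the number of nodes assigned to each cluster, largest first."""
    return sorted(Counter(labels).values(), reverse=True)


def dispersion_index(sizes):
    """Variance-to-mean ratio of cluster sizes.

    A Poisson reference has a ratio of 1; values well above 1 suggest
    over-dispersion, i.e. strongly imbalanced cluster sizes.
    """
    return pvariance(sizes) / mean(sizes)


# Hypothetical cluster assignments for 100 nodes in 4 clusters.
balanced = [0] * 25 + [1] * 25 + [2] * 25 + [3] * 25
imbalanced = [0] * 85 + [1] * 9 + [2] * 4 + [3] * 2

print(cluster_sizes(balanced), dispersion_index(cluster_sizes(balanced)))
print(cluster_sizes(imbalanced), dispersion_index(cluster_sizes(imbalanced)))
```

With the same number of nodes and clusters, the imbalanced assignment yields a far larger ratio than the balanced one, which is the kind of structure a size-aware clustering model would need to account for.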
