IEEE International Conference on Machine Learning and Applications

Modeling Over-Dispersion for Network Data Clustering


Abstract

Mining over-dispersed network data has emerged as a central theme in data science, as evidenced by a sharp increase in the volume of real-world network data with imbalanced clusters. While most existing clustering methods are designed to discover the number of clusters and class-specific connectivity patterns, few can uncover imbalanced clusters, which commonly occur in network communities and image segments, from network data with an over-dispersed cluster size distribution. Such over-dispersion is considered an intrinsic structural property of network data. In this paper, we propose a generalized probabilistic modeling framework, SizeConnectivity, to estimate the over-dispersed cluster size distribution together with class-specific connectivity patterns from network data. Our method accurately captures a wide range of cluster size distributions found in real-world network data. We performed extensive synthetic and real-world experiments on clustering social network data and image data to detect network communities and image segments. Our results demonstrate the superior performance of the SizeConnectivity clustering method in recovering the hidden structure of network data by modeling over-dispersion.
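The abstract does not give a formal definition of over-dispersion. As a rough illustration only, the sketch below assumes one common convention: the variance-to-mean ratio of cluster sizes, where values well above 1 suggest over-dispersion relative to a Poisson baseline. The function names `cluster_sizes` and `dispersion_index` are hypothetical, and this is not the SizeConnectivity model itself.

```python
from collections import Counter
from statistics import mean, pvariance


def cluster_sizes(labels):
    """Return cluster sizes, largest first, given one cluster label per node."""
    return sorted(Counter(labels).values(), reverse=True)


def dispersion_index(sizes):
    """Variance-to-mean ratio of cluster sizes (an assumed measure, not the paper's)."""
    return pvariance(sizes) / mean(sizes)


# Two toy labelings of 30 nodes into 3 clusters (made-up data for illustration).
balanced = [0] * 10 + [1] * 10 + [2] * 10
imbalanced = [0] * 25 + [1] * 4 + [2] * 1

for name, labels in [("balanced", balanced), ("imbalanced", imbalanced)]:
    sizes = cluster_sizes(labels)
    print(name, sizes, dispersion_index(sizes))
```

Under this assumed measure, equal-sized clusters give an index of zero, while a partition dominated by one large cluster gives a much larger value, which is the kind of imbalance the abstract describes.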
