Modeling Over-Dispersion for Network Data Clustering

Abstract

Mining over-dispersed network data has emerged as a central theme in data science, as evidenced by the sharp increase in the volume of real-world network data with imbalanced clusters. While most existing clustering methods are designed to discover the number of clusters and class-specific connectivity patterns, few can uncover imbalanced clusters, which are common in network communities and image segments, from network data with an over-dispersed cluster size distribution. Such over-dispersion is considered an intrinsic structural property of network data. In this paper, we propose a generalized probabilistic modeling framework, SizeConnectivity, that estimates the over-dispersed cluster size distribution together with class-specific connectivity patterns from network data. Our method accurately captures a wide range of cluster size distributions found in real-world network data. We performed extensive experiments on synthetic and real-world data, clustering social network data and image data to detect network communities and image segments. Our results demonstrate the superior performance of the SizeConnectivity clustering method in recovering the hidden structure of network data by modeling over-dispersion.

Bibliographic record

Source
IEEE International Conference on Machine Learning and Applications | 2017 | 579 | 8 pages
Authors
Lu Wang; Dongxiao Zhu; Ming Dong; Yan Li;
Original format: PDF
Chinese Library Classification: TP181-53;
Keywords
Data models; Clustering methods; Image segmentation; Probabilistic logic; Clustering algorithms; Mixture models; Image edge detection;

Similar literature

1. Testing homogeneity in clustered (longitudinal) count data regression model with over-dispersion [J] . Paul S., Azad K. Journal of Statistical Planning and Inference . 2012, No. 6
2. A data transformation to deal with constant under/over-dispersion in Poisson and binomial regression models [J] . Vanegas Luis Hernando, Rondon Luz Marina. Journal of Statistical Computation and Simulation . 2020, No. 10-12
3. Road Fatality Model Based on Over-Dispersion Data Along Federal Route F0050 [J] . Wan Zahidah Musa, Joewono Prasetijo, Zaffan Farhana Zainal. MATEC Web of Conferences . 2017, No. 1
4. Modeling Over-Dispersion for Network Data Clustering [C] . Lu Wang, Dongxiao Zhu, Ming Dong, Yan Li. IEEE International Conference on Machine Learning and Applications . 2017
5. Optimal clustering in wireless sensor networks employing different propagation models and data aggregation techniques [D] . Comeau, Frank 2008
6. Improved Inference of Gene Regulatory Networks through Integrated Bayesian Clustering and Dynamic Modeling of Time-Course Expression Data [O] . Brian Godsey
7. Some Inference Problems in Clustered (Longitudinal) Count Data with Over-dispersion [O] . Azad Kazi 2011
8. Estimation and Testing in Poisson Regression Models with Over-Dispersion [R] . Poortema, K. 1992
