Modeling Over-Dispersion for Network Data Clustering

Abstract

Mining over-dispersed network data has emerged as a central theme in data science, as evidenced by a sharp increase in the volume of real-world network data with imbalanced clusters. While most existing clustering methods are designed to discover the number of clusters and class-specific connectivity patterns, few can uncover imbalanced clusters, which are common in network communities and image segments, from network data with an over-dispersed cluster size distribution. This over-dispersion is considered an intrinsic structural property of network data. In this paper, we propose a generalized probabilistic modeling framework, SizeConnectivity, to estimate the over-dispersed cluster size distribution together with class-specific connectivity patterns from network data. Our method accurately captures a wide range of cluster size distributions revealed by real-world network data. We performed extensive synthetic and real-world experiments on clustering social network data and image data to detect network communities and image segments. Our results demonstrate the superior performance of our SizeConnectivity clustering method in recovering the hidden structure of network data by modeling over-dispersion.

Bibliographic details

Source
IEEE International Conference on Machine Learning and Applications | 2017 | pp. 42-49 | 8 pages
Authors
Lu Wang; Dongxiao Zhu; Ming Dong; Yan Li
Original format: PDF
Keywords
Data models; Clustering methods; Image segmentation; Probabilistic logic; Clustering algorithms; Mixture models; Image edge detection
