Density-based clustering of big probabilistic graphs

Zahid Halim; Jamal Hussain Khattak

Abstract

Clustering is a machine learning task that groups similar objects into coherent sets. Objects within the same cluster exhibit similar behavior. With the exponential increase in data volume, robust approaches are required to process the data and extract clusters. In addition to their large volume, datasets may contain uncertainties due to the heterogeneity of the data sources, resulting in Big Data. Modern approaches and algorithms in machine learning widely use probability theory to determine data uncertainty. Such huge uncertain data can be transformed into a probabilistic graph-based representation. This work presents an approach for density-based clustering of big probabilistic graphs. The proposed approach clusters large probabilistic graphs using the graph's density, where the clustering process is guided by the nodes' degree and the neighborhood information. The proposed approach is evaluated using seven real-world benchmark datasets, namely protein-to-protein interaction, yahoo, movie-lens, core, last.fm, delicious social bookmarking system, and epinions. These datasets are first transformed into a graph-based representation before the proposed clustering algorithm is applied. The obtained results are evaluated using three cluster validation indices, namely the Davies–Bouldin index, the Dunn index, and the Silhouette coefficient. The proposal is also compared with four state-of-the-art approaches for clustering large probabilistic graphs. Across the seven datasets and three cluster validity indices, the results suggest better performance of the proposed approach.
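To make the idea of degree- and neighborhood-guided clustering on a probabilistic graph more concrete, here is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes the graph is given as a dictionary of undirected edges with existence probabilities, uses the expected degree (the sum of incident edge probabilities) as the density signal, and grows each cluster from a high-degree seed over edges whose probability meets a threshold. The names `expected_degrees`, `density_clusters`, and `min_prob`, as well as the toy edge list, are hypothetical.

```python
from collections import defaultdict


def expected_degrees(edges):
    """Expected degree of each node: the sum of its incident edge probabilities."""
    deg = defaultdict(float)
    for (u, v), p in edges.items():
        deg[u] += p
        deg[v] += p
    return dict(deg)


def density_clusters(edges, min_prob=0.5):
    """Greedy illustration (an assumed rule, not the paper's method):
    seed at the unassigned node with the highest expected degree, then
    absorb unassigned nodes reachable over edges with probability >= min_prob."""
    # Build an undirected adjacency map: node -> {neighbor: probability}.
    adj = defaultdict(dict)
    for (u, v), p in edges.items():
        adj[u][v] = p
        adj[v][u] = p

    deg = expected_degrees(edges)
    assigned = {}
    clusters = []
    # Visit candidate seeds from highest to lowest expected degree.
    for seed in sorted(deg, key=lambda n: (-deg[n], n)):
        if seed in assigned:
            continue
        cid = len(clusters)
        assigned[seed] = cid
        members = [seed]
        frontier = [seed]
        while frontier:
            node = frontier.pop()
            for nbr, p in adj[node].items():
                if nbr not in assigned and p >= min_prob:
                    assigned[nbr] = cid
                    members.append(nbr)
                    frontier.append(nbr)
        clusters.append(sorted(members))
    return clusters


# Toy probabilistic graph (made-up values): two dense groups joined by a weak edge.
edges = {
    ("a", "b"): 0.9, ("a", "c"): 0.8, ("b", "c"): 0.7,
    ("c", "d"): 0.1,
    ("d", "e"): 0.9, ("e", "f"): 0.6,
}
clusters = density_clusters(edges)
print(clusters)
```

On this toy input the weak edge between `c` and `d` falls below `min_prob`, so the two groups stay separate. A real evaluation would score such clusters with validity indices like those named in the abstract.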


Bibliographic details

Source: Evolving Systems, 2019, Issue 3, 18 pages
Authors: Zahid Halim; Jamal Hussain Khattak
Affiliation: Faculty of Computer Science and Engineering, Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi, Pakistan (both authors)
Original format: PDF
Language: English
Chinese Library Classification: General natural sciences
Keywords: Clustering graphs; Machine learning; Big graphs; Clustering; Community detection

Similar documents

1. Density-based clustering of big probabilistic graphs [J]. Zahid Halim, Jamal Hussain Khattak. Evolving Systems. 2019, Issue 3
2. Finding density-based subspace clusters in graphs with feature vectors [J]. Stephan Günnemann, Brigitte Boden, Thomas Seidl. Data Mining and Knowledge Discovery. 2012, Issue 2
3. Finding density-based subspace clusters in graphs with feature vectors [J]. Günnemann S., Boden B., Seidl T. Data Mining and Knowledge Discovery. 2012, Issue 2
4. Density-based probabilistic clustering of uncertain moving objects [C]. Huajie Xu, Xiaoming Hu, Bing Yang. IEEE International Conference on Intelligent Computing and Intelligent Systems; ICIS 2009. 2009
5. Image reconstruction of muon tomographic data using a density-based clustering method [D]. Perry, Kimberly B. 2015
6. Stroke atlas of the brain: Voxel-wise density-based clustering of infarct lesions topographic distribution [O]. Yanlu Wang, Julia M. Juliano, Sook-Lei Liew. 2019
7. A Fast Algorithm for Identifying Density-Based Clustering Structures Using a Constraint Graph [O]. Jeong-Hun Kim, Jong-Hyeok Choi, Kwan-Hee Yoo. 2019
