Conference: 2017 International Conference on Machine Learning and Data Science

INGC: Graph Clustering Outlier Detection Algorithm Using Label Propagation


Abstract

In the last decade, the volume of data has grown at a tremendous rate. Extracting useful insights from this data requires data mining, and the connections between data items are often of particular interest. These connections can be represented efficiently as graphs, which offer an effective way to model many applications, ranging from biological and social networks to web networks. Graph mining techniques such as clustering and outlier detection help extract useful information from such graphs. This paper proposes INGC, an efficient influence-based graph clustering and outlier detection algorithm built on label propagation. The proposed algorithm improves on the traditional Label Propagation algorithm by making it more robust. INGC saves time by initially labeling only the highly influential vertices of the network; the labels are then propagated to the remaining nodes. Nodes with the same label are grouped into a cluster, and vertices that receive no label during clustering are identified as outliers. Experiments were carried out on three real-life graph datasets. The results show that INGC outperforms state-of-the-art clustering algorithms in terms of F-Measure and Modularity. INGC also proved efficient in terms of outlier detection rate.
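The pipeline the abstract describes (label the influential vertices, propagate labels, group by label, report unlabeled vertices as outliers) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how influence is measured, so the sketch assumes vertex degree as a stand-in. The function name `ingc_sketch`, the `n_seeds` parameter, the rule that seeds must not be adjacent to one another, and the tie-breaking rule are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def ingc_sketch(adj, n_seeds):
    """Cluster an undirected graph given as {vertex: set_of_neighbours}."""
    # Assumption: degree stands in for "influence" (not specified in the abstract).
    ranked = sorted(adj, key=lambda v: (-len(adj[v]), v))

    # Step 1: label only the most influential vertices. Isolated vertices and
    # vertices adjacent to an existing seed are skipped (assumption of this sketch).
    labels = {}
    for v in ranked:
        if len(labels) == n_seeds:
            break
        if adj[v] and not any(u in labels for u in adj[v]):
            labels[v] = v  # each seed starts its own cluster

    # Step 2: propagate labels to the remaining vertices, one round at a time.
    changed = True
    while changed:
        changed = False
        snapshot = dict(labels)  # labels as they stood at the start of the round
        for v in adj:
            if v in snapshot:
                continue
            votes = Counter(snapshot[u] for u in adj[v] if u in snapshot)
            if votes:
                # The most common neighbouring label wins; ties go to the
                # smallest label (an arbitrary but deterministic choice).
                best = max(votes.values())
                labels[v] = min(l for l, c in votes.items() if c == best)
                changed = True

    # Step 3: vertices sharing a label form a cluster.
    clusters = defaultdict(set)
    for v, label in labels.items():
        clusters[label].add(v)

    # Step 4: vertices that never received a label are reported as outliers.
    outliers = set(adj) - set(labels)
    return dict(clusters), outliers


# Toy graph: two hub-centred communities joined by one edge, plus isolated vertex 10.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (3, 4),
         (5, 6), (5, 7), (5, 8), (5, 9), (6, 7), (8, 9),
         (4, 6)]
adj = {v: set() for v in range(11)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

clusters, outliers = ingc_sketch(adj, n_seeds=2)
```

On this toy graph the two hubs (vertices 0 and 5) become seeds, each community inherits its hub's label, and the isolated vertex 10 is never labeled, so it is reported as an outlier. In this simplified version only vertices unreachable from any seed stay unlabeled; the paper's actual outlier criterion may be stricter.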
