Journal: Advances in Computational Sciences and Technology

Degree Number based Search Algorithm for Local Outliers and Hubs using Minimum Spanning Tree


Abstract

Clustering is the process of discovering groups of objects such that objects in the same group are similar and objects in different groups are dissimilar. Minimum spanning tree based clustering algorithms can detect clusters with irregular boundaries. Many algorithms find clusters by maximizing the number of intra-cluster edges. While such algorithms find useful and interesting structures, they tend to fail to identify and isolate two kinds of vertices that play special roles: vertices that bridge clusters (hubs) and vertices that are only marginally connected to clusters (outliers). Identifying hubs is useful for applications such as viral marketing and epidemiology, since hubs are responsible for spreading ideas and disease. In this paper, we model hubs as high-degree nodes with a low Node Weight Factor (NWF) value. We propose a novel algorithm, Degree Number based Search Algorithm for Local Outliers and Hubs using Minimum Spanning Tree (DNSALOHMST), which finds clusters and detects local outliers and hubs in a data set. The algorithm partitions the data set into an optimal number of clusters, using a new cluster validation criterion based on the geometric properties of the data partition to find the proper number of clusters. The algorithm works in two phases: the first phase creates the optimal number of clusters, whereas the second phase detects local outliers and hubs using the degree number of the vertices in each cluster together with a measure based on the node weight factor.
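To make the building blocks named in the abstract concrete, the following is a minimal Python sketch of generic minimum spanning tree clustering with per-vertex degree counts. It is not the authors' DNSALOHMST procedure. The function names (`euclidean_mst`, `vertex_degrees`, `cut_longest_edges`) are hypothetical, the number of clusters `k` is supplied by the caller instead of being chosen by the paper's validation criterion, and the Node Weight Factor is left out because the abstract does not define it.

```python
import math
from collections import defaultdict


def euclidean_mst(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (weight, u, v) edges forming a minimum spanning tree.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge from the tree to each vertex
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def vertex_degrees(n, edges):
    """Degree of every vertex in the tree (number of incident MST edges)."""
    deg = [0] * n
    for _, u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def cut_longest_edges(n, edges, k):
    """Assumed stand-in for phase one: drop the k-1 heaviest MST edges
    and label the resulting connected components as clusters."""
    kept = sorted(edges)[: max(len(edges) - (k - 1), 0)]
    adj = defaultdict(list)
    for _, u, v in kept:
        adj[u].append(v)
        adj[v].append(u)
    labels = [-1] * n
    cluster = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = cluster
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if labels[v] == -1:
                    labels[v] = cluster
                    stack.append(v)
        cluster += 1
    return labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    mst = euclidean_mst(pts)
    print(vertex_degrees(len(pts), mst))
    print(cut_longest_edges(len(pts), mst, k=2))
```

In this sketch, vertices with a high degree in the tree would be hub candidates and degree-one vertices would be outlier candidates. Any actual decision would additionally need the NWF-based measure from the paper.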