Published in: Swarm and Evolutionary Computation

Analysis of particle swarm optimization based hierarchical data clustering approaches

Abstract

Data clustering is one of the most widely used data mining techniques; it groups data items on the basis of their similarity. Several issues arise when trying to assign data to the most suitable groups. The efficiency of the clustering techniques and the accuracy of the resulting groups are two of the main issues. To tackle these issues, optimization based techniques have recently been used, resulting in enhanced quality of the output and improved efficiency of the clustering process. Swarm Intelligence (SI) is one such family of techniques, and several of its algorithms have been found effective for this purpose. Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO) are the two most prominent SI based techniques. In this paper we analyze the use of PSO for data clustering, in particular for clustering in a hierarchical manner. We chose two PSO based hierarchical techniques: Evolutionary PSO for clustering (EPSO-clustering) and Hierarchical PSO for clustering (HPSO-clustering). Both techniques work in a hierarchical agglomerative manner, and HPSO-clustering is an extension of EPSO-clustering. HPSO-clustering combines the properties of hierarchical and partitional clustering and adds SI based optimization to the process. We evaluate our proposed clustering techniques on different benchmark datasets from the UCI Machine Learning Repository as well as on real data that we collected locally from a web server. We used inter-cluster distance, intra-cluster distance, and execution time to measure the performance of our proposed techniques. For comparison we selected clustering techniques that have previously been used as benchmarks: k-means, PSO-clustering, Hierarchical Agglomerative Clustering (HAC) and DBSCAN. The results verify that the proposed techniques outperform these benchmarks on the stated measures.
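
The abstract names inter-cluster and intra-cluster distance as quality measures but does not define them. The sketch below illustrates one common reading, and every definition in it is an assumption rather than the paper's own formula: intra-cluster distance is taken as the mean Euclidean distance from each point to its cluster centroid (lower is better), and inter-cluster distance as the mean Euclidean distance between all pairs of cluster centroids (higher is better). The function names and the toy data are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def intra_cluster_distance(clusters):
    """Assumed definition: mean distance from each point to its own centroid."""
    total, count = 0.0, 0
    for points in clusters:
        c = centroid(points)
        total += sum(math.dist(p, c) for p in points)
        count += len(points)
    return total / count


def inter_cluster_distance(clusters):
    """Assumed definition: mean distance over all pairs of centroids.

    Requires at least two clusters.
    """
    cents = [centroid(points) for points in clusters]
    pairs = [(a, b) for i, a in enumerate(cents) for b in cents[i + 1:]]
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


# Toy data (illustrative only): two tight, well-separated clusters.
clusters = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(10.0, 0.0), (10.0, 2.0)],
]

print(intra_cluster_distance(clusters))
print(inter_cluster_distance(clusters))
```

Under these assumed definitions, a good clustering shows a small intra-cluster value relative to the inter-cluster value, which is the direction of improvement the abstract reports for the proposed techniques.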