Journal of Inequalities and Applications

Performance of Rand’s C statistics in clustering analysis: an application to clustering the regions of Turkey


Abstract

When facing a clustering problem, the researcher must be aware that an inappropriate choice of clustering method or distance measure may significantly affect the results of the analysis. The purpose of this study is to determine the best clustering method and distance measure in cluster analysis and to cluster the regions of Turkey on the basis of this result. Hierarchical clustering offers several clustering methods and distance measures, and Rand’s C statistic is one of the best tools for comparing them. Rand’s comparative statistic C takes values from 0.0 to 1.0 inclusive and may be used either to compare two clusterings produced by applying clustering methods to a data set with unknown structure or to assess the performance of a clustering method on a data set with known structure. In this study, the seven regions of Turkey are clustered using all the clustering methods and distance measures. Based on social and economic indicators, the final number of clusters is set to three. Then, using Rand’s C statistic, all possible pairs of distance measures are compared for every hierarchical clustering method, and the results are given in the related tables. According to the results of all possible comparisons, Ward’s method is found to be the best, and the final clustering of the regions is carried out with Ward’s method.
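To make the comparison step concrete, here is a minimal sketch of how a statistic like Rand's C can be computed for two clusterings of the same objects. It uses the standard pair-counting definition (the share of object pairs on which the two clusterings agree), which is assumed here to match the statistic used in the study. The function name `rand_c` and the two label vectors are hypothetical illustrations, not data or code from the paper.

```python
from itertools import combinations


def rand_c(labels_a, labels_b):
    """Pair-counting Rand statistic for two clusterings of the same objects.

    A pair of objects counts as an agreement when both clusterings put the
    two objects together, or both put them apart. The result lies in
    [0.0, 1.0]; 1.0 means the two partitions are identical up to relabeling.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same objects")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("at least two objects are required")
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical cluster labels for seven objects split into three clusters,
# e.g. the output of one method under two different distance measures.
clustering_one = [0, 0, 1, 1, 2, 2, 2]
clustering_two = [0, 0, 1, 2, 2, 2, 2]

c_value = rand_c(clustering_one, clustering_two)
print(f"Rand's C = {c_value:.4f}")
```

Because the statistic depends only on which objects share a cluster, it does not change when cluster labels are renamed, so label vectors from different methods can be compared directly.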
