Journal of the Operational Research Society

Picturing agreement between clustering solutions using multidimensional unfolding: An application to greenhouse gas emissions data

Abstract

When evaluating a clustering solution, we often have to compare alternative solutions, e.g., to address clustering stability or external validity. Each comparison essentially relies on a contingency table referring to a pair of (crisp) clustering solutions. These data are commonly used to: (1) solve an assignment problem that matches the clusters of the two partitions; (2) compute several indices of agreement; (3) represent the two partitions in a two-dimensional map using Correspondence Analysis. We propose using the Multidimensional Unfolding (MDU) technique to picture the cross-classification data between two partitions, complementing a clustering evaluation analysis and overcoming some limitations of the traditional approaches (1) to (3). This approach relies on a new similarity measure that excludes agreement between clusters due to chance alone. The resulting MDU map is easy to interpret, picturing agreement between clustering solutions: the lower the agreement between two clusters from the two partitions (each represented by a point), the larger the (Euclidean) distance between the corresponding points. Two applications illustrate the relevance of this approach: an application to a data set from the UCI Machine Learning Repository to assess clustering external validity; and an application to greenhouse gas emissions data to address the temporal stability of clustering solutions, in which clusters of European countries with homogeneous sources of pollutant emissions are compared over three years.
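
To make the traditional inputs (1) and (2) concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It builds the contingency table for two crisp partitions, matches clusters by brute force (only sensible for small tables), and computes the Hubert-Arabie adjusted Rand index as one example of an agreement index. The function names are hypothetical. The abstract does not define the paper's new similarity measure, so `excess_over_chance` is only an illustrative stand-in (observed minus expected counts under independence), and the MDU map itself is not shown.

```python
from collections import Counter
from itertools import permutations
from math import comb


def contingency_table(labels_a, labels_b):
    """Cross-classify two crisp partitions of the same objects."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    rows = sorted(set(labels_a))
    cols = sorted(set(labels_b))
    counts = Counter(zip(labels_a, labels_b))
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


def best_matching(table):
    """Brute-force assignment: pair clusters to maximise matched objects.

    Assumes the table has no more rows than columns and is small.
    """
    n_rows, n_cols = len(table), len(table[0])
    best = max(
        permutations(range(n_cols), n_rows),
        key=lambda perm: sum(table[i][j] for i, j in enumerate(perm)),
    )
    return list(enumerate(best))


def adjusted_rand_index(table):
    """Hubert-Arabie adjusted Rand index computed from the table."""
    n = sum(map(sum, table))
    if n < 2:
        raise ValueError("need at least two objects")
    sum_cells = sum(comb(x, 2) for row in table for x in row)
    sum_rows = sum(comb(sum(row), 2) for row in table)
    sum_cols = sum(comb(sum(col), 2) for col in zip(*table))
    expected = sum_rows * sum_cols / comb(n, 2)
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:  # degenerate case, e.g. one cluster in each partition
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


def excess_over_chance(table):
    """Observed minus expected counts under independence.

    Illustrative assumption only; NOT the similarity measure of the paper.
    """
    n = sum(map(sum, table))
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    return [
        [table[i][j] - row_tot[i] * col_tot[j] / n for j in range(len(col_tot))]
        for i in range(len(row_tot))
    ]
```

Calling `contingency_table(labels_a, labels_b)` on two label sequences of equal length returns the sorted cluster labels of each partition and the table of counts; the other three functions take that table as input. For larger tables, a dedicated assignment solver would replace the brute-force search.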
