Conference: Uncertainty in Artificial Intelligence

An Information-Theoretic External Cluster-Validity Measure

Abstract

In this paper we propose a measure of similarity/association between two partitions of a set of objects. Our motivation is to use the measure to characterize the quality or accuracy of clustering algorithms by comparing the clusters they produce with "ground truth" consisting of classes assigned manually or by some other means in whose veracity there is confidence. Such measures are referred to as "external". Our measure also allows clusterings with different numbers of clusters to be compared in a quantitative and principled way. Our evaluation scheme quantitatively measures how useful the cluster labels are as predictors of their class labels. It computes the reduction in the number of bits that would be required to encode (compress) the class labels if both the encoder and decoder have free access to the cluster labels. To achieve this encoding, the estimated conditional probabilities of the class labels given the cluster labels must also be encoded. In addition to defining the measure, we compare it to other commonly used external measures and demonstrate its superiority as judged by certain criteria.
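The coding idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's exact definition: the function names are hypothetical, and the cost of encoding the conditional probabilities is an assumption made here, namely that each cluster's vector of class counts is sent with a uniform code over all possible count vectors (there are C(h + m - 1, m - 1) ways to split h objects among m classes). The baseline is the same code applied with every object in a single cluster, so the result is the number of bits saved by knowing the cluster labels.

```python
from collections import Counter
from math import comb, log2


def entropy_bits(labels):
    """Total bits to encode `labels` with an ideal code built from their own frequencies."""
    n = len(labels)
    return -sum(c * log2(c / n) for c in Counter(labels).values())


def code_length_bits(class_labels, cluster_labels):
    """Bits to encode class labels when cluster labels are known to both sides."""
    if len(class_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    num_classes = len(set(class_labels))

    # Group the class labels by the cluster each object was assigned to.
    members = {}
    for cls, clu in zip(class_labels, cluster_labels):
        members.setdefault(clu, []).append(cls)

    # Data cost: class labels coded with per-cluster empirical probabilities.
    data_cost = sum(entropy_bits(m) for m in members.values())

    # Model cost (assumption of this sketch): a uniform code over all ways
    # of splitting a cluster's objects among the classes.
    model_cost = sum(
        log2(comb(len(m) + num_classes - 1, num_classes - 1))
        for m in members.values()
    )
    return data_cost + model_cost


def bits_saved(class_labels, cluster_labels):
    """Reduction in code length relative to putting every object in one cluster."""
    single_cluster = [0] * len(class_labels)
    return code_length_bits(class_labels, single_cluster) - code_length_bits(
        class_labels, cluster_labels
    )


classes = ["a"] * 4 + ["b"] * 4
perfect = [0] * 4 + [1] * 4   # clusters match the classes exactly
uninformative = [0, 1] * 4    # clusters carry no information about the classes
```

On this toy data, `bits_saved(classes, perfect)` is positive, while `bits_saved(classes, uninformative)` is negative: splitting into clusters that do not predict the classes only adds model cost. That penalty is what lets clusterings with different numbers of clusters be compared on one scale.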