A theoretical study of clusterability and clustering quality.


Abstract

Clustering is a widely used technique, with applications ranging from data mining, bioinformatics and image analysis to marketing, psychology, and city planning. Despite the practical importance of clustering, there is very limited theoretical analysis of the topic. We make a step towards building theoretical foundations for clustering by carrying out an abstract analysis of two central concepts in clustering: clusterability and clustering quality.

We compare a number of notions of clusterability found in the literature. While all these notions attempt to measure the same property, and all appear to be reasonable, we show that they are pairwise inconsistent. In addition, we give the first computational complexity analysis of a few notions of clusterability.

In the second part of the thesis, we discuss how the quality of a given clustering can be defined (and measured). Users often need to compare the quality of clusterings obtained by different methods. Perhaps more importantly, users need to determine whether a given clustering is sufficiently good for being used in further data mining analysis. We analyze what a measure of clustering quality should look like. We do that by introducing a set of requirements ('axioms') of clustering quality measures. We propose a number of clustering quality measures that satisfy these requirements.

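Bibliographic record

  • Author

    Ackerman, Margareta

  • Author affiliation

    University of Waterloo (Canada)

  • Degree-granting institution: University of Waterloo (Canada)
  • Subject: Computer Science
  • Degree: M.Math.
  • Year: 2008
  • Pagination: 76 p.
  • Total pages: 76
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Automation technology, computer technology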

