Source: Pattern Recognition: The Journal of the Pattern Recognition Society

Categorical-and-numerical-attribute data clustering based on a unified similarity metric without knowing cluster number



Abstract

Most existing clustering approaches are applicable to purely numerical or purely categorical data, but not to both. In general, clustering mixed data composed of numerical and categorical attributes is a nontrivial task because there is an awkward gap between the similarity metrics for categorical data and those for numerical data. This paper therefore presents a general clustering framework based on the concept of object-cluster similarity and gives a unified similarity metric that can be applied directly to data with categorical, numerical, or mixed attributes. Accordingly, an iterative clustering algorithm is developed, and its performance is demonstrated experimentally on different benchmark data sets. Moreover, to circumvent the difficult problem of selecting the number of clusters, we further develop a penalized competitive learning algorithm within the proposed clustering framework. The embedded competition and penalization mechanisms enable this improved algorithm to determine the number of clusters automatically by gradually eliminating redundant clusters. The experimental results show the efficacy of the proposed approach.
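The abstract does not state the similarity formula, so the following is only a minimal sketch of what an object-cluster similarity over mixed attributes could look like. Every definition here is an assumption made for illustration: categorical similarity is taken as the fraction of cluster members sharing the object's value, numerical similarity as a normalized exponential of the squared Euclidean distance to each centroid, and the two are blended with weights d_c/(d_c+1) and 1/(d_c+1), where d_c is the number of categorical attributes. The function names are hypothetical and not taken from the paper.

```python
import math


def categorical_similarity(obj_cat, members_cat):
    """Average, over categorical attributes, of the fraction of cluster
    members that share the object's value (assumed definition)."""
    if not obj_cat or not members_cat:
        return 0.0
    n = len(members_cat)
    total = 0.0
    for r, value in enumerate(obj_cat):
        matches = sum(1 for member in members_cat if member[r] == value)
        total += matches / n
    return total / len(obj_cat)


def numerical_similarities(obj_num, centroids):
    """Similarity of the object's numerical part to every centroid,
    normalized to sum to 1 across clusters (assumed definition)."""
    dists = [sum((a - b) ** 2 for a, b in zip(obj_num, c)) for c in centroids]
    smallest = min(dists)  # shift by the minimum to avoid underflow in exp
    scores = [math.exp(-0.5 * (d - smallest)) for d in dists]
    total = sum(scores)
    return [s / total for s in scores]


def object_cluster_similarities(obj_cat, obj_num, clusters_cat, centroids):
    """Blend both parts into one score per cluster. The weighting treats the
    numerical vector as a single attribute next to d_c categorical ones
    (an illustrative assumption)."""
    d_c = len(obj_cat)
    w_cat = d_c / (d_c + 1) if obj_num else 1.0
    w_num = 1.0 - w_cat
    num_sims = (numerical_similarities(obj_num, centroids)
                if obj_num else [0.0] * len(clusters_cat))
    return [
        w_cat * categorical_similarity(obj_cat, members) + w_num * num_sims[j]
        for j, members in enumerate(clusters_cat)
    ]


def assign(obj_cat, obj_num, clusters_cat, centroids):
    """Index of the most similar cluster: one assignment step of an
    iterative clustering loop."""
    sims = object_cluster_similarities(obj_cat, obj_num, clusters_cat, centroids)
    return max(range(len(sims)), key=sims.__getitem__)


# Toy data (made up): two clusters, two categorical attributes, one numerical.
clusters_cat = [
    [("red", "small"), ("red", "small")],
    [("blue", "large"), ("blue", "small")],
]
centroids = [[0.0], [5.0]]
print(assign(("red", "small"), [0.5], clusters_cat, centroids))
```

An iterative algorithm in this spirit would alternate between assigning every object with `assign` and recomputing each cluster's value frequencies and centroid until the assignments stop changing. The penalized competitive learning variant mentioned in the abstract is not sketched here.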
