International Conference on Modeling Decisions for Artificial Intelligence

k-CCM: A Center-Based Algorithm for Clustering Categorical Data with Missing Values

Abstract

This paper addresses the problem of clustering categorical data with missing values. Specifically, we design a new framework that imputes missing values and assigns objects to appropriate clusters. For the imputation step, we use a decision tree-based method to fill in missing values. For the clustering step, we use a kernel density estimation approach to define cluster centers and an information-theoretic dissimilarity measure to quantify the differences between objects. Building on these components, we propose k-CCM, a center-based algorithm for clustering categorical data with missing values. We evaluated the proposed algorithm on real-life datasets with missing values, comparing its clustering quality with that of other popular clustering algorithms. Overall, the experimental results show that the proposed algorithm performs comparably to the other algorithms on all datasets.
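
The two-stage pipeline described in the abstract (impute, then cluster around distribution-valued centers) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete below is an assumption and not the authors' method: a one-level decision tree (a stump) stands in for the decision tree-based imputation, a uniformly smoothed frequency estimate stands in for the kernel density estimate, a negative log-probability stands in for the information-theoretic dissimilarity, and all function names, the `lam` parameter, and the toy data are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def entropy(values):
    """Shannon entropy (bits) of a list of category labels."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())


def impute_with_stump(rows):
    """Fill None cells with a one-level decision tree per attribute.

    Assumes every attribute has at least one observed value.
    """
    m = len(rows[0])
    filled = [list(r) for r in rows]
    for j in range(m):
        known = [r for r in rows if r[j] is not None]
        fallback = Counter(r[j] for r in known).most_common(1)[0][0]
        best_score, best_split, best_groups = None, None, {}
        for s in range(m):
            if s == j:
                continue
            groups = defaultdict(list)  # value of attribute s -> labels of j
            for r in known:
                if r[s] is not None:
                    groups[r[s]].append(r[j])
            total = sum(len(g) for g in groups.values())
            if total == 0:
                continue
            # Weighted entropy of the target after splitting on attribute s.
            score = sum(len(g) / total * entropy(g) for g in groups.values())
            if best_score is None or score < best_score:
                best_score, best_split, best_groups = score, s, groups
        for i, r in enumerate(rows):
            if r[j] is None:
                leaf = best_groups.get(r[best_split]) if best_split is not None else None
                filled[i][j] = Counter(leaf).most_common(1)[0][0] if leaf else fallback
    return filled


def centers_from(rows, labels, k, domains, lam=0.1):
    """Each center holds one smoothed category distribution per attribute."""
    centers = []
    for c in range(k):
        members = [r for r, l in zip(rows, labels) if l == c]
        n = len(members)
        center = []
        for j, dom in enumerate(domains):
            counts = Counter(r[j] for r in members)
            # Assumed smoothing: mix observed frequency with a uniform prior.
            center.append({
                v: (1 - lam) * (counts[v] / n if n else 1 / len(dom)) + lam / len(dom)
                for v in dom
            })
        centers.append(center)
    return centers


def dissimilarity(row, center):
    """Assumed measure: total surprisal of the row under the center."""
    return sum(-math.log2(center[j][v]) for j, v in enumerate(row))


def cluster(rows, k, iters=20, seed=0):
    """Alternate center estimation and reassignment until labels stop changing."""
    rng = random.Random(seed)
    domains = [sorted(set(col)) for col in zip(*rows)]
    labels = [rng.randrange(k) for _ in rows]
    for _ in range(iters):
        centers = centers_from(rows, labels, k, domains)
        new = [min(range(k), key=lambda c: dissimilarity(r, centers[c])) for r in rows]
        if new == labels:
            break
        labels = new
    return labels


# Hypothetical toy data; None marks a missing value.
data = [
    ["red", "round", "small"],
    ["red", "round", None],
    ["red", None, "small"],
    ["blue", "square", "large"],
    ["blue", "square", "large"],
    [None, "square", "large"],
]
filled = impute_with_stump(data)
labels = cluster(filled, k=2)
```

In this sketch a cluster center is a list of per-attribute probability tables, not a single representative object, which is what lets the dissimilarity be read as the surprisal of an object under the center. Rows that are identical after imputation always receive the same label; whether the two groups end up in different clusters depends on the random initialisation.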