首页> 外文期刊>Journal of classification >Imputation Strategies for Clustering Mixed-Type Data with Missing Values
【24h】

Imputation Strategies for Clustering Mixed-Type Data with Missing Values

机译:Imputation Strategies for Clustering Mixed-Type Data with Missing Values

获取原文
获取原文并翻译 | 示例
       

摘要

Incomplete data sets with different data types are difficult to handle, but regularly to be found in practical clustering tasks. Therefore in this paper, two procedures for clustering mixed-type data with missing values are derived and analyzed in a simulation study with respect to the factors of partition, prototypes, imputed values, and cluster assignment. Both approaches are based on the k-prototypes algorithm (an extension of k-means), which is one of the most common clustering methods for mixed-type data (i.e., numerical and categorical variables). For k-means clustering of incomplete data, the k-POD algorithm recently has been proposed, which imputes the missings with values of the associated cluster center. We derive an adaptation of the latter and additionally present a cluster aggregation strategy after multiple imputation. It turns out that even a simplified and time-saving variant of the presented method can compete with multiple imputation and subsequent pooling.

著录项

获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号