Imputation Strategies for Clustering Mixed-Type Data with Missing Values

Aschenbruck Rabea; Szepannek Gero; Wilhelm Adalbert F. X.

Abstract

Incomplete data sets that combine different data types are difficult to handle, yet they occur regularly in practical clustering tasks. This paper therefore derives two procedures for clustering mixed-type data with missing values and analyzes them in a simulation study with respect to partition, prototypes, imputed values, and cluster assignment. Both approaches are based on the k-prototypes algorithm (an extension of k-means), one of the most common clustering methods for mixed-type data (i.e., data with numerical and categorical variables). For k-means clustering of incomplete data, the k-POD algorithm has recently been proposed; it imputes missing values with the values of the associated cluster center. We derive an adaptation of k-POD for mixed-type data and additionally present a cluster aggregation strategy after multiple imputation. It turns out that even a simplified, time-saving variant of the presented method can compete with multiple imputation and subsequent pooling.
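The k-POD idea mentioned in the abstract (alternate between clustering and filling missing entries with the values of the assigned cluster center) can be illustrated with a minimal sketch on top of a k-prototypes-style distance. This is not the authors' implementation. The function name `kpod_mixed`, the parameters `lam` (weight of the categorical mismatch count) and `init_idx` (optional fixed starting rows), the integer coding of categories with -1 for missing, and the mean/mode initial fill are all assumptions made for illustration.

```python
import numpy as np

def kpod_mixed(X_num, X_cat, k, lam=1.0, n_iter=20, seed=0, init_idx=None):
    """Illustrative k-POD-style loop for mixed-type data (hypothetical helper).

    X_num: float array (n, p), missing entries are np.nan.
    X_cat: integer-coded array (n, q), missing entries are -1 (assumed coding).
    Assumes every column has at least one observed value.
    """
    rng = np.random.default_rng(seed)
    X_num = np.array(X_num, dtype=float)
    X_cat = np.array(X_cat, dtype=int)
    miss_num = np.isnan(X_num)
    miss_cat = X_cat < 0
    n = X_num.shape[0]

    # Initial fill: column mean (numeric) and column mode (categorical).
    for j in range(X_num.shape[1]):
        X_num[miss_num[:, j], j] = np.nanmean(X_num[:, j])
    for j in range(X_cat.shape[1]):
        observed = X_cat[~miss_cat[:, j], j]
        X_cat[miss_cat[:, j], j] = np.bincount(observed).argmax()

    # Start from k rows of the filled data as prototypes.
    idx = np.asarray(init_idx) if init_idx is not None else rng.choice(n, size=k, replace=False)
    proto_num, proto_cat = X_num[idx].copy(), X_cat[idx].copy()
    labels = np.full(n, -1)

    for _ in range(n_iter):
        # k-prototypes-style distance: squared Euclidean + lam * mismatches.
        d_num = ((X_num[:, None, :] - proto_num[None, :, :]) ** 2).sum(axis=2)
        d_cat = (X_cat[:, None, :] != proto_cat[None, :, :]).sum(axis=2)
        new_labels = (d_num + lam * d_cat).argmin(axis=1)

        # Update prototypes: mean for numeric, mode for categorical columns.
        for c in range(k):
            members = new_labels == c
            if not members.any():
                continue  # keep the previous prototype for an empty cluster
            proto_num[c] = X_num[members].mean(axis=0)
            for j in range(X_cat.shape[1]):
                proto_cat[c, j] = np.bincount(X_cat[members, j]).argmax()

        # k-POD step: overwrite only the originally missing entries
        # with the values of the assigned prototype.
        X_num[miss_num] = proto_num[new_labels][miss_num]
        X_cat[miss_cat] = proto_cat[new_labels][miss_cat]

        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    return labels, X_num, X_cat
```

Observed entries are never modified; only the cells that were missing on input are re-imputed in each iteration, so the imputed values and the partition are refined together.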

Bibliographic details

Source
Journal of Classification, 2023, No. 1, pp. 2-24 (23 pages)
Authors
Aschenbruck Rabea; Szepannek Gero; Wilhelm Adalbert F. X.;
Affiliations

Univ Appl Sci;

Jacobs Univ Bremen;

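Indexing information
Original format: PDF
Text language: English
Chinese Library Classification: Theory and methodology of natural sciences
Keywords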
Clustering; Imputation; Mixed-type data; Missing values; K-MEANS; INITIALIZATION;
