[Objective] This study investigates a method for clustering incomplete data, that is, data with missing values, as encountered in applied research. [Method] The paper introduces a maximum likelihood-based dynamic clustering method. Using auxiliary information from the related observed variables, it estimates plausible substitutes for the missing values and thereby constructs a "complete" data set. The missing-data substitutes and the cluster parameters are estimated by maximum likelihood, implemented via the expectation-maximization (EM) algorithm; as the iterations proceed, both the parameter estimates and the substitutes gradually converge. Each object is then assigned to a cluster according to its Bayesian posterior probability. [Result] Simulation studies show that the proposed missing-value substitution method converges quickly and clusters nearly all data with missing values correctly. [Conclusion] The proposed method was further validated on Fisher's Iris dataset. It showed a significant advantage in clustering accuracy over the approach of deleting records with missing data, and its accuracy was close to that of clustering the complete data.
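The procedure outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Gaussian clusters, marks missing values with NaN, assumes at least two variables and at least one observed value per row, and omits the conditional-covariance correction that a full EM treatment of missing data would add in the M-step. The function name `em_cluster_missing`, its parameters, and the farthest-point initialization are all hypothetical choices made for illustration.

```python
import numpy as np

def em_cluster_missing(X, k, n_iter=50):
    """Simplified EM-style clustering of the rows of X; NaN marks a missing value."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    miss = np.isnan(X)

    # Initial "complete" data set: fill gaps with column means of observed values.
    Xc = np.where(miss, np.nanmean(X, axis=0), X)

    # Deterministic farthest-point initialization of the cluster means.
    centers = [Xc[0]]
    for _ in range(1, k):
        dist = np.min([((Xc - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Xc[np.argmax(dist)])
    means = np.array(centers)
    covs = np.array([np.cov(Xc.T) + 1e-6 * np.eye(d) for _ in range(k)])
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        logp = np.empty((n, k))     # unnormalized log posteriors
        cond = np.empty((n, k, d))  # per-cluster completed copy of each row
        for i in range(n):
            o, m = ~miss[i], miss[i]
            for j in range(k):
                S_oo = covs[j][np.ix_(o, o)]
                diff = X[i, o] - means[j][o]
                sol = np.linalg.solve(S_oo, diff)
                logdet = np.linalg.slogdet(S_oo)[1]
                # Gaussian log density on observed coordinates only; the
                # (2*pi) constant is the same for every cluster and is dropped.
                logp[i, j] = np.log(weights[j]) - 0.5 * (diff @ sol + logdet)
                # Substitute missing entries by their conditional mean
                # given the observed entries, under cluster j.
                cond[i, j] = X[i]
                cond[i, j, m] = means[j][m] + covs[j][np.ix_(m, o)] @ sol

        # E-step: Bayesian posterior probability of each cluster per row.
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)

        # M-step on the completed rows (covariance correction term omitted).
        for j in range(k):
            w = r[:, j]
            means[j] = w @ cond[:, j] / w.sum()
            dev = cond[:, j] - means[j]
            covs[j] = (w[:, None] * dev).T @ dev / w.sum() + 1e-6 * np.eye(d)
        weights = r.mean(axis=0)

    # Final substitutes: posterior-weighted average of the conditional means.
    imputed = np.einsum("ij,ijd->id", r, cond)
    return r.argmax(axis=1), np.where(miss, imputed, X)
```

Calling `labels, X_completed = em_cluster_missing(X, k=3)` on an array with NaN entries returns one cluster label per row and a completed copy of the data in which only the missing entries have been replaced. Because the covariance correction is left out, the estimated cluster covariances will tend to be slightly too small in heavily missing variables.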