
A New Algorithm for Fuzzy Clustering Handling Incomplete Dataset



Abstract

One of the most difficult problems in cluster analysis is identifying the number of groups in a dataset, especially in the presence of missing values. Traditional clustering methods assume that the true number of clusters is known; in real-world applications, however, it is generally not known a priori. Moreover, most clustering methods were developed to analyse complete datasets and cannot be applied to many practical problems involving incomplete data. This paper first presents a fuzzy clustering algorithm called OCS-FSOM. The proposed algorithm is based on a neural network and uses the Optimal Completion Strategy to estimate missing values in incomplete datasets. We then propose an extension of the algorithm that tackles the problem of estimating the number of clusters by using a multi-level OCS-FSOM method. The new algorithm, called Multi-OCSFSOM, finds the optimal number of clusters by using a statistical criterion that measures the quality of the obtained partitions. Experiments carried out on real-life datasets show very encouraging results in terms of exact determination of the optimal number of clusters.
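
To make the Optimal Completion Strategy concrete, the following is a minimal sketch that pairs it with plain fuzzy c-means rather than the neural-network-based FSOM the abstract describes, so it illustrates only the missing-value step and is not the paper's OCS-FSOM. The function name `ocs_fcm`, the use of `None` to mark missing entries, the maximin initialisation, and the assumption that every column has at least one observed value are all choices made for this illustration.

```python
def ocs_fcm(data, n_clusters, m=2.0, n_iter=100, tol=1e-6):
    """Fuzzy c-means with Optimal Completion Strategy (illustrative sketch).

    data: list of rows; None marks a missing value (assumed convention).
    Returns (centers, memberships, completed_data).
    """
    n, d = len(data), len(data[0])

    # Initial imputation: mean of the observed entries in each column
    # (assumes every column has at least one observed value).
    col_means = []
    for j in range(d):
        observed = [row[j] for row in data if row[j] is not None]
        col_means.append(sum(observed) / len(observed))
    x = [[row[j] if row[j] is not None else col_means[j] for j in range(d)]
         for row in data]

    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # Deterministic maximin initialisation: start from the first row, then
    # repeatedly add the row farthest from the centers chosen so far.
    centers = [list(x[0])]
    while len(centers) < n_clusters:
        far = max(x, key=lambda row: min(sqdist(row, c) for c in centers))
        centers.append(list(far))

    u = [[0.0] * n_clusters for _ in range(n)]
    for _ in range(n_iter):
        # Membership update: u_ki is proportional to D_ki^(-1/(m-1)),
        # where D_ki is the squared distance from row k to center i.
        for k in range(n):
            inv = [max(sqdist(x[k], c), 1e-12) ** (-1.0 / (m - 1.0))
                   for c in centers]
            total = sum(inv)
            u[k] = [v / total for v in inv]

        # Center update: mean of the completed data weighted by u^m.
        new_centers = []
        for i in range(n_clusters):
            w = [u[k][i] ** m for k in range(n)]
            sw = sum(w)
            new_centers.append(
                [sum(w[k] * x[k][j] for k in range(n)) / sw for j in range(d)])
        shift = max(abs(new_centers[i][j] - centers[i][j])
                    for i in range(n_clusters) for j in range(d))
        centers = new_centers

        # OCS step: each missing entry becomes the u^m-weighted average of
        # the corresponding center coordinates; observed entries are kept.
        for k in range(n):
            w = [u[k][i] ** m for i in range(n_clusters)]
            sw = sum(w)
            for j in range(d):
                if data[k][j] is None:
                    x[k][j] = sum(w[i] * centers[i][j]
                                  for i in range(n_clusters)) / sw

        if shift < tol:
            break
    return centers, u, x


# Toy data (invented for this example): two well-separated groups,
# each with one missing coordinate.
toy = [[0.0, 0.1], [0.2, None], [0.1, 0.0],
       [10.0, 10.1], [None, 9.9], [10.2, 10.0]]
centers, memberships, completed = ocs_fcm(toy, n_clusters=2)
labels = [max(range(2), key=lambda i: row[i]) for row in memberships]
```

In this sketch the missing entries are treated as extra unknowns that are re-estimated on every iteration alongside the memberships and centers, which is the core idea of the Optimal Completion Strategy. Choosing the number of clusters, as Multi-OCSFSOM does with its statistical criterion, is outside the scope of the example.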
