Journal: International journal of computational i

High dimensional data clustering through fuzzy possibilistic C-means with symmetry-based distance measure


Abstract

Clustering high-dimensional data is one of the difficult tasks in data clustering, and it has been a major concern owing to the intrinsic sparsity of the data points. Several recent research results indicate that, for high-dimensional data, even the notion of proximity or clustering may not be meaningful. Fuzzy C-means (FCM) and possibilistic C-means (PCM) can handle high-dimensional data, but FCM is sensitive to noise and PCM requires appropriate initialisation to converge to a nearly global minimum. To overcome these issues, a fuzzy possibilistic C-means (FPCM) with a symmetry-based distance measure is proposed, which can determine the number of clusters present in a dataset. In addition to a good fuzzy partitioning of the data, a novel fuzzy cluster validity index called FSym-index, which depends on the symmetry-based distance, is used. The symmetry-based distance provides a measure of the integrity of clustering on several fuzzy partitions of a dataset. The larger the value of the FSym-index, the higher the accuracy, with less execution time.
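The abstract gives neither the update equations nor the definition of the symmetry-based distance, so the following is only a minimal sketch of the commonly described FPCM alternating updates (membership, typicality, cluster centre), with plain squared Euclidean distance swapped in for the paper's symmetry-based measure. The function name `fpcm`, the helper `sq_dist`, and the parameters `m`, `eta`, `n_iter`, `seed` and `eps` are illustrative assumptions, not taken from the paper, and the FSym-index is not implemented here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fpcm(points, n_clusters, m=2.0, eta=2.0, n_iter=100, seed=0, eps=1e-12):
    """Sketch of FPCM alternating updates using Euclidean distance.

    ASSUMPTION: Euclidean distance stands in for the symmetry-based
    distance, which the abstract does not define.

    Returns (centers, U, T), where U[i][k] is the fuzzy membership of
    point k in cluster i (sums to 1 over clusters for each point) and
    T[i][k] is its typicality (sums to 1 over points for each cluster).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Initialise centres from randomly chosen data points.
    centers = [list(p) for p in rng.sample(list(points), n_clusters)]
    U, T = [], []
    for _ in range(n_iter):
        # Squared distances, floored at eps to avoid division by zero.
        d2 = [[max(sq_dist(c, p), eps) for p in points] for c in centers]

        # Membership update: normalise across clusters for each point.
        U = [[0.0] * n for _ in range(n_clusters)]
        for k in range(n):
            inv = [d2[i][k] ** (-1.0 / (m - 1.0)) for i in range(n_clusters)]
            total = sum(inv)
            for i in range(n_clusters):
                U[i][k] = inv[i] / total

        # Typicality update: normalise across points for each cluster.
        T = []
        for i in range(n_clusters):
            inv = [d2[i][k] ** (-1.0 / (eta - 1.0)) for k in range(n)]
            total = sum(inv)
            T.append([v / total for v in inv])

        # Centre update: weighted mean with weights u^m + t^eta.
        new_centers = []
        for i in range(n_clusters):
            w = [U[i][k] ** m + T[i][k] ** eta for k in range(n)]
            total = sum(w)
            new_centers.append(
                [sum(w[k] * points[k][j] for k in range(n)) / total
                 for j in range(dim)]
            )

        # Stop once no centre moves appreciably.
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < 1e-10:
            break
    return centers, U, T
```

Calling `fpcm(points, 2)` on a list of coordinate tuples returns the centres together with the membership and typicality matrices. Note that this sketch takes the number of clusters as an input, whereas the abstract states that the proposed method can determine it from the data.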
