Journal: International Journal of Knowledge-Based and Intelligent Engineering Systems

Novel multi-centroid, multi-run sampling schemes for K-medoids-based algorithms


Abstract

Clustering in data mining groups similar objects based on their distance, connectivity, relative density, or other specific characteristics. Data clustering has become an important task for discovering significant patterns and characteristics in large spatial databases. The k-medoids-based algorithms have been shown to be effective for spherical-shaped clusters with outliers. However, they are not efficient for large databases. In this paper, we propose two novel algorithms: the Multi-Centroid with Multi-Run Sampling Scheme, termed MCMRS, and a more advanced sampling scheme, the Incremental Multi-Centroid, Multi-Run Sampling Scheme, hereafter called IMCMRS, to improve the performance of many k-medoids-based algorithms, including PAM, CLARA, and CLARANS. Experimental results demonstrate that the proposed schemes not only reduce computation time by more than 80% but also reduce the average distance per object compared with CLARA and CLARANS. IMCMRS is also superior to MCMRS.
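
The abstract does not spell out the steps of MCMRS or IMCMRS, so the following is only a minimal sketch of the general idea they build on: CLARA-style multi-run sampling, scored by average distance per object. It is not the authors' method. The function names (`avg_distance`, `k_medoids`, `multi_run_sampling`) and the parameters `runs`, `sample_size`, and `seed` are hypothetical, and a simple alternating k-medoids update stands in for PAM's swap search.

```python
import math
import random


def avg_distance(points, medoids):
    """Average distance from each object to its nearest medoid."""
    total = sum(min(math.dist(p, m) for m in medoids) for p in points)
    return total / len(points)


def k_medoids(points, k, rng, max_iter=50):
    """Alternating k-medoids (a simplified stand-in for PAM, assumed here)."""
    medoids = rng.sample(points, k)
    for _ in range(max_iter):
        # Assign every point to its nearest current medoid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, medoids[i]))
            clusters[nearest].append(p)
        # In each cluster, pick the member with the smallest total distance
        # to the other members; keep the old medoid if the cluster is empty.
        new_medoids = []
        for i, members in enumerate(clusters):
            if not members:
                new_medoids.append(medoids[i])
                continue
            best = min(members, key=lambda c: sum(math.dist(c, q) for q in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


def multi_run_sampling(points, k, runs=5, sample_size=40, seed=0):
    """Cluster several random samples; keep the medoids that score best
    on the full data set (CLARA-style, not MCMRS itself)."""
    rng = random.Random(seed)
    best_medoids, best_cost = None, float("inf")
    for _ in range(runs):
        sample = rng.sample(points, min(sample_size, len(points)))
        medoids = k_medoids(sample, k, rng)
        cost = avg_distance(points, medoids)  # evaluated on all objects
        if cost < best_cost:
            best_medoids, best_cost = medoids, cost
    return best_medoids, best_cost


# Example: three synthetic 2-D blobs of 100 points each.
gen = random.Random(1)
data = [(gen.gauss(cx, 0.5), gen.gauss(cy, 0.5))
        for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(100)]
medoids, cost = multi_run_sampling(data, k=3)
print(medoids, cost)
```

Clustering only a sample keeps each run cheap, while scoring the resulting medoids against all objects makes the runs comparable; this is the trade-off between computation time and average distance per object that the abstract refers to.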
