Conference: International Conference on Ubiquitous Information Management and Communication

A Farthest First Traversal based Sampling Algorithm for k-clustering


Abstract

The farthest-first-traversal (fft) algorithm was originally used by Rosenkrantz et al. in an analysis of heuristics for the traveling salesman problem. It has since been studied extensively as the basis of several sampling techniques. In this work, we present a modification of the ProTraS algorithm of Ros and Guillaume, which is itself based on the fft algorithm, for sampling datasets for both the k-means and k-median clustering algorithms. Unlike ProTraS, the proposed algorithm takes the sample size as an input. The algorithm is implemented on the Spark platform and tested on benchmark datasets. We also evaluate the algorithm by comparing it with the adaptive sampling and lightweight coreset algorithms, using the adjusted Rand index.
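To make the underlying idea concrete, the following is a minimal sketch of plain farthest-first traversal that stops once a requested number of points has been selected. It is not the authors' modified ProTraS algorithm and not their Spark implementation; the function name `farthest_first_traversal`, the `sample_size` and `start` parameters, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def farthest_first_traversal(
    points: List[Point], sample_size: int, start: int = 0
) -> List[int]:
    """Return indices of `sample_size` points picked by farthest-first traversal.

    Hypothetical helper for illustration only. Assumes Euclidean distance
    and that `points` contains at least `sample_size` distinct points.
    """
    if not 0 < sample_size <= len(points):
        raise ValueError("sample_size must be between 1 and len(points)")

    chosen = [start]
    # dist_to_sample[i] is the distance from point i to its nearest chosen point.
    dist_to_sample = [math.dist(p, points[start]) for p in points]

    while len(chosen) < sample_size:
        # Select the point that is farthest from the current sample.
        nxt = max(range(len(points)), key=dist_to_sample.__getitem__)
        chosen.append(nxt)
        # Update nearest-sample distances to account for the new point.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < dist_to_sample[i]:
                dist_to_sample[i] = d

    return chosen


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
    print(farthest_first_traversal(data, sample_size=3))
```

Each iteration costs one pass over the data, so selecting m points from n takes O(n·m) distance computations. Taking the sample size as an explicit argument mirrors only what the abstract states about the proposed algorithm's input; how the authors weight or select points beyond that is not described here.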
