Data in Brief

Clustering benchmark datasets exploiting the fundamental clustering problems

Abstract

The Fundamental Clustering Problems Suite (FCPS) offers a variety of clustering challenges that any algorithm should be able to handle given real-world data. The FCPS consists of datasets with known a priori classifications that are to be reproduced by the algorithm. The datasets are intentionally created to be visualized in two or three dimensions under the hypothesis that objects can be grouped unambiguously by the human eye. Each dataset represents a certain problem that can be solved by known clustering algorithms with varying success. In the R package “Fundamental Clustering Problems Suite” on CRAN, user-defined sample sizes can be drawn for the FCPS. Additionally, the distances of two high-dimensional datasets called Leukemia and Tetragonula are provided here. This collection is useful for investigating the shortcomings of clustering algorithms and the limitations of dimensionality reduction methods in the case of three-dimensional or higher datasets. This article is a simultaneous co-submission with Swarm Intelligence for Self-Organized Clustering [1].
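
Because each FCPS dataset ships with a known a priori classification, a clustering result can be scored by how well it reproduces that classification once arbitrary cluster numbering is accounted for. The following is a minimal sketch of one such score, not the method used in the article and not part of the FCPS R package: the function name `best_match_accuracy` is hypothetical, labels are assumed to be plain Python lists of integers, and the brute-force search over relabellings is only practical for a handful of clusters.

```python
from itertools import permutations


def best_match_accuracy(prior, predicted):
    """Fraction of points whose cluster agrees with the prior class
    under the most favourable one-to-one relabelling of clusters.

    Hypothetical helper for illustration; assumes integer labels.
    """
    if len(prior) != len(predicted):
        raise ValueError("label lists must have equal length")
    if not prior:
        raise ValueError("label lists must be non-empty")

    classes = sorted(set(prior))
    clusters = sorted(set(predicted))

    # Pad with None so surplus clusters can be mapped to "no class".
    size = max(len(classes), len(clusters))
    classes = classes + [None] * (size - len(classes))

    best_hits = 0
    # Try every assignment of distinct classes to the clusters.
    for assigned in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, assigned))
        hits = sum(1 for t, p in zip(prior, predicted) if mapping[p] == t)
        best_hits = max(best_hits, hits)

    return best_hits / len(prior)


if __name__ == "__main__":
    # Same partition, different cluster numbering: perfect reproduction.
    print(best_match_accuracy([1, 1, 2, 2], [2, 2, 1, 1]))  # 1.0
    # One point placed in the wrong cluster.
    print(best_match_accuracy([1, 1, 2, 2], [0, 0, 0, 1]))  # 0.75
```

A score of 1.0 means the algorithm reproduced the prior classification exactly. For larger numbers of clusters, a permutation-invariant index or an assignment solver would be a more scalable choice than enumerating relabellings.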