
Data Clustering: Algorithms and Its Applications


Abstract

Data is of little use unless information or knowledge that supports further reasoning can be inferred from it. Cluster analysis divides data into groups (clusters) that are meaningful, useful, or both, according to some criterion based on shared characteristics. In research, clustering and classification have been used to analyze data in fields such as machine learning, bioinformatics, statistics, and pattern recognition, to mention a few. Clustering methods include partitioning (K-means), hierarchical (AGNES), density-based (DBSCAN), grid-based (STING), soft clustering (FANNY), model-based (SOM), and ensemble clustering. Challenges in clustering arise from large datasets, misinterpretation of results, and the efficiency and performance of clustering algorithms, all of which must be weighed when choosing an algorithm. This paper systematically discusses applications of data clustering in view of the characteristics that make the different clustering techniques better suited or biased when applied to several types of data, such as uncertain data, multimedia data, graph data, biological data, stream data, text data, time series data, categorical data, and big data. The suitability of the available clustering algorithms for different application areas is presented. Some existing cluster validity methods, used to evaluate the goodness of the clusters that the algorithms produce, are also investigated.
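As a rough illustration of two ideas named in the abstract, partitioning-based clustering (K-means) and cluster validity, the sketch below implements Lloyd's algorithm and the mean silhouette coefficient in plain Python. The abstract does not say which validity methods the paper examines, so the silhouette is an assumed example of an internal validity index. The function names `kmeans` and `silhouette`, the toy data, and the parameters `iters` and `seed` are hypothetical and do not come from the paper.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm on a list of coordinate tuples.

    Returns (centroids, labels). Initial centroids are k points sampled
    at random, which is an assumption of this sketch.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so centroids are too
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


def silhouette(points, labels):
    """Mean silhouette coefficient in [-1, 1]; needs at least 2 clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for m, other in clusters.items()
            if m != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): two well-separated groups in the plane.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
centroids, labels = kmeans(points, k=2)
score = silhouette(points, labels)
print(labels, round(score, 3))
```

A silhouette value near 1 suggests compact, well-separated clusters, while values near 0 or below suggest overlapping or misassigned points. Comparing the score across several values of `k` is one common, though not the only, way to choose the number of clusters.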
