Conference: Multispectral Image Processing and Pattern Recognition

A GA-Based Clustering Algorithm for Large Data Sets With Mixed Numeric and Categorical Values

Abstract

In data mining, cluster analysis must often be performed on large data sets with mixed numeric and categorical values. However, most existing clustering algorithms are efficient only for numeric data, not for mixed data sets. To address this, this paper presents a novel clustering algorithm for mixed data sets that modifies the common cost function, the trace of the within-cluster dispersion matrix. A genetic algorithm (GA) is used to optimize the new cost function and obtain a valid clustering result. Experimental results show that the new GA-based clustering algorithm is feasible for large data sets with mixed numeric and categorical values.
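The abstract does not spell out the modified cost function or the GA operators, so the following is only a rough sketch of the general idea, not the paper's method. It assumes a k-prototypes-style cost (squared Euclidean distance to the cluster mean for numeric attributes, plus a weight `gamma` times the number of mismatches against the cluster mode for categorical attributes) and a plain GA over label vectors with tournament selection, uniform crossover, mutation, and elitism. The names `mixed_cost`, `ga_cluster`, `gamma`, and all default parameters are hypothetical.

```python
import random
from collections import Counter


def mixed_cost(labels, numeric, categorical, k, gamma=1.0):
    """Assumed within-cluster cost for mixed data (not taken from the paper).

    Numeric part: squared Euclidean distance to the cluster mean, which
    equals the trace of the within-cluster dispersion matrix.
    Categorical part: gamma * number of mismatches against the cluster mode.
    """
    n_dims = len(numeric[0])
    c_dims = len(categorical[0])
    total = 0.0
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        means = [sum(numeric[i][d] for i in members) / len(members)
                 for d in range(n_dims)]
        modes = [Counter(categorical[i][d] for i in members).most_common(1)[0][0]
                 for d in range(c_dims)]
        for i in members:
            total += sum((numeric[i][d] - means[d]) ** 2 for d in range(n_dims))
            total += gamma * sum(categorical[i][d] != modes[d]
                                 for d in range(c_dims))
    return total


def ga_cluster(numeric, categorical, k, pop_size=30, generations=60,
               mutation_rate=0.05, gamma=1.0, seed=0):
    """Minimise mixed_cost with a simple GA; a chromosome is a label vector."""
    rng = random.Random(seed)
    n = len(numeric)

    def cost(chromosome):
        return mixed_cost(chromosome, numeric, categorical, k, gamma)

    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    best = min(pop, key=cost)
    for _ in range(generations):
        new_pop = [best[:]]  # elitism: the best chromosome always survives
        while len(new_pop) < pop_size:
            # tournament selection of two parents (tournament size 3)
            p1 = min(rng.sample(pop, 3), key=cost)
            p2 = min(rng.sample(pop, 3), key=cost)
            # uniform crossover: each gene comes from either parent
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # mutation: occasionally reassign an object to a random cluster
            child = [rng.randrange(k) if rng.random() < mutation_rate else g
                     for g in child]
            new_pop.append(child)
        pop = new_pop
        best = min(pop, key=cost)
    return best, cost(best)
```

A call such as `ga_cluster([[0.0], [0.0], [10.0], [10.0]], [["a"], ["a"], ["b"], ["b"]], k=2)` returns a label vector and its cost. Because of elitism, the best cost never increases from one generation to the next, but the search is stochastic and is not guaranteed to reach the global optimum. Recomputing the cost of every chromosome from scratch, as done here, would be too slow for genuinely large data sets.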