Conference: International Symposium on Multispectral Image Processing and Pattern Recognition

A GA-Based Clustering Algorithm for Large Data Sets With Mixed Numeric and Categorical Values



Abstract

Data mining often requires cluster analysis of large data sets that mix numeric and categorical values. Most existing clustering algorithms, however, are efficient only for numeric data, not for mixed data. To address this, the paper presents a novel clustering algorithm for mixed data sets that modifies the common cost function, the trace of the within-cluster dispersion matrix. A genetic algorithm (GA) optimizes the new cost function to obtain a valid clustering result. Experimental results illustrate that the new GA-based clustering algorithm is feasible for large data sets with mixed numeric and categorical values.
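The abstract does not give the modified cost function, so the following is only a minimal sketch of the general idea under stated assumptions. The numeric term is the trace of the within-cluster dispersion matrix (the sum of squared distances to cluster means). The categorical term, a `gamma`-weighted count of mismatches against each cluster's mode, is an assumption borrowed from k-prototypes-style costs, not the paper's formula. The names `mixed_cost`, `ga_cluster`, and `gamma`, and the GA operators (truncation selection, one-point crossover, random-reset mutation), are hypothetical choices for illustration.

```python
import random
from collections import Counter


def mixed_cost(labels, numeric, categorical, k, gamma=1.0):
    """Assumed mixed-data cost: numeric dispersion plus weighted mode mismatches."""
    total = 0.0
    dims = len(numeric[0])
    attrs = len(categorical[0])
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        # Numeric part: squared distances to the cluster mean
        # (this cluster's share of the within-cluster dispersion trace).
        for d in range(dims):
            mean = sum(numeric[i][d] for i in members) / len(members)
            total += sum((numeric[i][d] - mean) ** 2 for i in members)
        # Categorical part (assumption): mismatches against the cluster mode.
        for a in range(attrs):
            mode = Counter(categorical[i][a] for i in members).most_common(1)[0][0]
            total += gamma * sum(categorical[i][a] != mode for i in members)
    return total


def ga_cluster(numeric, categorical, k, pop_size=30, generations=60,
               mutation_rate=0.05, gamma=1.0, seed=0):
    """Toy GA over label vectors; needs at least 2 points and pop_size >= 4."""
    rng = random.Random(seed)
    n = len(numeric)

    def cost(lab):
        return mixed_cost(lab, numeric, categorical, k, gamma)

    # Each chromosome assigns one cluster label to every data point.
    population = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=cost)
        survivors = population[: pop_size // 2]  # keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Random-reset mutation: occasionally reassign a point.
            child = [rng.randrange(k) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = survivors + children
    best = min(population, key=cost)
    return best, cost(best)


if __name__ == "__main__":
    numeric = [[0.0], [2.0], [10.0], [12.0]]
    categorical = [["a"], ["a"], ["b"], ["b"]]
    print(mixed_cost([0, 0, 1, 1], numeric, categorical, k=2))  # 4.0
    print(ga_cluster(numeric, categorical, k=2))
```

For the four-point toy data, the labeling `[0, 0, 1, 1]` costs 4.0: each cluster contributes a squared deviation of 2 and no categorical mismatches. Evaluating every chromosome from scratch, as this sketch does, would be too slow for the large data sets the paper targets; it is meant only to show how a GA can search label assignments against a mixed-data cost.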
