Conference: International Computer Science and Engineering Conference

Tree-Based Hybrid Genetic Algorithm for Density-Based Data Clustering


Abstract

Data clustering algorithms partition a given set of data points into groups of very similar data points. Representative-based and density-based algorithms are generally used for data clustering. These algorithms are heuristics and may get stuck at a sub-optimal clustering. The crisp clustering problem is a combinatorial optimization problem, and Genetic Algorithms generally perform better than heuristic algorithms for combinatorial optimization. In this work, we propose a hybrid Genetic Algorithm for density-based clustering. For this purpose, we represent a cluster using a forest of trees, where the nodes of the trees are the data points. We use a tree-based fitness function. Besides 1-point crossover, we use a deterministic improvement of offspring. We implement the proposed algorithm in C and run it on a personal computer. We experiment with five datasets from the UCI Machine Learning Repository. The proposed algorithm outperforms existing algorithms on both low- and high-dimensional datasets, except for one high-dimensional dataset.
