Journal: Evolutionary Computation, IEEE Transactions on

Multiobjective Genetic Algorithm-Based Fuzzy Clustering of Categorical Attributes


Abstract

Clustering categorical data, where the elements of a categorical attribute domain have no natural ordering, has recently gained significant attention from researchers. With the growing demand for categorical data clustering, a few clustering algorithms focused on categorical data have been developed. However, most of these methods attempt to optimize a single measure of clustering goodness, and a single measure may not be appropriate for different kinds of datasets. Considering multiple, often conflicting, objectives therefore appears natural for this problem. Although we have previously addressed multiobjective fuzzy clustering for continuous data, those algorithms cannot be applied to categorical data, where cluster means are not defined. Motivated by this, this paper proposes a multiobjective genetic algorithm-based approach for fuzzy clustering of categorical data that encodes the cluster modes and simultaneously optimizes the fuzzy compactness and fuzzy separation of the clusters. Moreover, a novel method for obtaining the final clustering solution from the set of resultant Pareto-optimal solutions is proposed. It is based on majority voting among the Pareto front solutions followed by $k$-NN classification. The performance of the proposed fuzzy categorical data-clustering technique has been compared with that of some other widely used algorithms, both quantitatively and qualitatively, on various synthetic and real-life categorical datasets. A statistical significance test has also been conducted to establish the significant superiority of the proposed multiobjective approach.
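
The abstract does not give formulas for the two objectives, so the following is only a minimal sketch of how a candidate set of cluster modes might be scored. It assumes Hamming distance between categorical records, fuzzy k-modes-style memberships, a membership-weighted compactness to be minimized, and a simplified separation (the sum of pairwise distances between modes) to be maximized. All function names, the fuzzifier `m`, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Record = Sequence[str]


def hamming(a: Record, b: Record) -> int:
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def fuzzy_memberships(data: List[Record], modes: List[Record],
                      m: float = 2.0) -> List[List[float]]:
    """Fuzzy k-modes-style memberships (assumed form); requires m > 1."""
    exponent = 1.0 / (m - 1.0)
    memberships = []
    for x in data:
        dists = [hamming(x, z) for z in modes]
        if 0 in dists:
            # The record coincides with one or more modes: share full
            # membership equally among those modes.
            hits = [d == 0 for d in dists]
            memberships.append([1.0 / sum(hits) if h else 0.0 for h in hits])
        else:
            inv = [(1.0 / d) ** exponent for d in dists]
            total = sum(inv)
            memberships.append([v / total for v in inv])
    return memberships


def objectives(data: List[Record], modes: List[Record],
               m: float = 2.0) -> Tuple[float, float]:
    """Return (compactness, separation) for one candidate set of modes."""
    u = fuzzy_memberships(data, modes, m)
    compactness = 0.0
    for k, mode in enumerate(modes):
        variation = sum(u[i][k] ** m * hamming(x, mode)
                        for i, x in enumerate(data))
        cardinality = sum(row[k] for row in u)
        if cardinality > 0:
            compactness += variation / cardinality
    # Simplified separation (assumption): total pairwise distance of modes.
    separation = float(sum(hamming(modes[a], modes[b])
                           for a in range(len(modes))
                           for b in range(a + 1, len(modes))))
    return compactness, separation


def dominates(p: Tuple[float, float], q: Tuple[float, float]) -> bool:
    """Pareto dominance: minimize compactness, maximize separation."""
    no_worse = p[0] <= q[0] and p[1] >= q[1]
    strictly_better = p[0] < q[0] or p[1] > q[1]
    return no_worse and strictly_better


# Toy data (hypothetical): two attributes, two apparent groups.
data = [("a", "x"), ("a", "x"), ("a", "y"),
        ("b", "z"), ("b", "z"), ("c", "z")]
good = objectives(data, [("a", "x"), ("b", "z")])
bad = objectives(data, [("a", "x"), ("a", "y")])
print(good, bad, dominates(good, bad))
```

In a multiobjective genetic algorithm, each chromosome would encode one such set of modes, and the non-dominated chromosomes of the final population would form the Pareto front from which a single clustering is then selected.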
