Conference: International Multisymposium on Computer and Computational Sciences

A Genetic Weighted K-means Algorithm for Clustering Gene Expression Data


Abstract

The traditional (unweighted) k-means algorithm is one of the most popular clustering methods for analyzing gene expression data. However, it suffers from three major shortcomings: it is sensitive to the initial partition, its result is prone to local minima, and it is applicable only to data with spherical clusters. The last shortcoming means that we must assume that gene expression values under different conditions are independently distributed with equal variances, an assumption that does not hold in practice. In this paper, we propose a genetic weighted k-means algorithm (denoted GWKMA), which solves the first two problems and partially remedies the third. GWKMA is a hybrid of a genetic algorithm (GA) and a weighted k-means algorithm (WKMA). In GWKMA, each individual is encoded by a partitioning table that uniquely determines a clustering, and three genetic operators (selection, crossover, and mutation) are employed together with a WKM operator derived from WKMA. The superiority of GWKMA over k-means is illustrated on one synthetic and two real-life gene expression datasets.

Keywords: weighted k-means, clustering, partitional string, genetic algorithm, gene expression data
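The abstract does not specify the operators in detail, so the following is only a minimal sketch of how a genetic search over partitions might be combined with a weighted k-means step. It is not the authors' GWKMA. The per-condition weight vector `w`, the label-string encoding (one cluster index per gene), truncation selection, one-point crossover, the mutation rate, and all function names are assumptions made for illustration.

```python
import random

def weighted_sq_dist(x, c, w):
    """Weighted squared Euclidean distance between point x and centroid c."""
    return sum(wj * (xj - cj) ** 2 for xj, cj, wj in zip(x, c, w))

def centroids_of(data, labels, k):
    """Mean of each cluster; an empty cluster yields None."""
    dim = len(data[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for x, lab in zip(data, labels):
        counts[lab] += 1
        for j in range(dim):
            sums[lab][j] += x[j]
    return [[s / counts[i] for s in sums[i]] if counts[i] else None
            for i in range(k)]

def wkm_step(data, labels, k, w):
    """One weighted k-means step: recompute centroids, then reassign points."""
    cents = centroids_of(data, labels, k)
    live = [i for i in range(k) if cents[i] is not None]
    return [min(live, key=lambda i: weighted_sq_dist(x, cents[i], w))
            for x in data]

def cost(data, labels, k, w):
    """Total weighted within-cluster squared distance of a partition."""
    cents = centroids_of(data, labels, k)
    return sum(weighted_sq_dist(x, cents[lab], w)
               for x, lab in zip(data, labels))

def crossover(a, b, rng):
    """One-point crossover on label strings (an assumed operator)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(labels, k, rate, rng):
    """Reassign each label to a random cluster with probability `rate`."""
    return [rng.randrange(k) if rng.random() < rate else lab
            for lab in labels]

def evolve(data, k, w, pop_size=10, generations=20, rate=0.05, seed=0):
    """Hypothetical GA loop: selection, crossover, mutation, then a WKM step.

    Requires pop_size >= 4 and at least two data points.
    """
    rng = random.Random(seed)
    n = len(data)
    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: cost(data, ind, k, w))
        parents = pop[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = mutate(crossover(a, b, rng), k, rate, rng)
            children.append(wkm_step(data, child, k, w))
        pop = parents + children
    return min(pop, key=lambda ind: cost(data, ind, k, w))

if __name__ == "__main__":
    # Toy data: four "genes" measured under two conditions.
    data = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
    best = evolve(data, k=2, w=[1.0, 1.0])
    print(best, cost(data, best, 2, [1.0, 1.0]))
```

In this sketch, setting a smaller weight for a noisy condition reduces its influence on the distance, which is one way a weighted variant can relax the equal-variance assumption mentioned above. The genetic operators let the search move between partitions that a plain k-means run could not leave once it has converged.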
