Pattern Recognition: The Journal of the Pattern Recognition Society

Towards improving fuzzy clustering using support vector machine: Application to gene expression data


Abstract

Recent advances in microarray technology permit monitoring the expression levels of a large set of genes across a number of time points simultaneously. Extracting knowledge from such a huge volume of microarray gene expression data requires computational analysis. Clustering is one of the important data mining tools for analyzing such microarray data, grouping similar genes into clusters. Researchers have proposed a number of clustering algorithms for this purpose. In this article, an attempt has been made to improve the performance of fuzzy clustering by combining it with a support vector machine (SVM) classifier. A recently proposed real-coded variable string length genetic algorithm based clustering technique and an iterated version of fuzzy C-means clustering have been utilized for this purpose. The performance of the proposed clustering scheme has been compared with that of some well-known existing clustering algorithms and their SVM-boosted versions on one simulated and six real-life gene expression data sets. A statistical significance test based on analysis of variance (ANOVA), followed by the a posteriori Tukey-Kramer multiple comparison test, has been conducted to establish the statistical significance of the superior performance of the proposed clustering scheme. Moreover, the biological significance of the clustering solutions has been established.
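
The abstract does not spell out how the SVM is coupled to the fuzzy clustering. One plausible reading, offered here only as an assumption, is that points with a high fuzzy membership serve as training data for the SVM, which then relabels the remaining ambiguous points. The minimal Python sketch below illustrates only the first half of that idea: plain fuzzy C-means (not the iterated or genetic-algorithm variants named in the abstract) followed by a membership-threshold split. The function names, the 0.7 threshold, the fixed initial centers, and the toy data are all hypothetical.

```python
import math

def memberships(points, centers, m=2.0):
    """Fuzzy membership of each point in each cluster (each row sums to 1)."""
    u = []
    for p in points:
        # Guard against a zero distance when a point sits exactly on a center.
        d = [max(math.dist(p, c), 1e-12) for c in centers]
        w = [dk ** (-2.0 / (m - 1.0)) for dk in d]
        total = sum(w)
        u.append([wk / total for wk in w])
    return u

def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Plain fuzzy C-means with a fixed iteration count (assumed settings)."""
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for k in range(len(centers)):
            # Each center is the mean of all points weighted by u_ik ** m.
            wts = [row[k] ** m for row in u]
            total = sum(wts)
            new_centers.append(tuple(
                sum(w * p[j] for w, p in zip(wts, points)) / total
                for j in range(dim)))
        centers = new_centers
    return centers, memberships(points, centers, m)

def split_by_confidence(u, threshold=0.7):
    """Separate (index, label) pairs into confident and uncertain groups."""
    confident, uncertain = [], []
    for i, row in enumerate(u):
        label = max(range(len(row)), key=row.__getitem__)
        target = confident if row[label] >= threshold else uncertain
        target.append((i, label))
    return confident, uncertain

# Toy data (hypothetical): two tight groups plus one point between them.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
          (2.5, 2.6)]
centers, u = fuzzy_c_means(points, centers=[(0.0, 0.0), (5.0, 5.0)])
confident, uncertain = split_by_confidence(u)
print("confident:", confident)
print("uncertain:", uncertain)
```

Under the assumed scheme, the `confident` pairs would be passed to an SVM classifier as labeled training data, and the points in `uncertain` would take whatever label the trained classifier predicts for them. That step is left out here because the abstract gives no kernel or parameter details.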
