Source: U.S. National Institutes of Health literature > PLoS Clinical Trials

Multi-Class Clustering of Cancer Subtypes through SVM Based Ensemble of Pareto-Optimal Solutions for Gene Marker Identification



Abstract

Microarray technology makes it possible to study the expression profiles of thousands of genes simultaneously across different experimental conditions or tissue samples. Microarray cancer datasets, organized as samples-versus-genes matrices, are used to classify tissue samples as benign or malignant, or into cancer subtypes. They are also useful for identifying potential gene markers for each cancer subtype, which aids the diagnosis of particular cancer types. In this article, we present an unsupervised cancer classification technique based on multiobjective genetic clustering of the tissue samples. The cluster centers are encoded as real-valued vectors, and cluster compactness and separation are optimized simultaneously. The resulting near-Pareto-optimal set contains a number of non-dominated solutions. We propose a novel approach that combines the clustering information held by these non-dominated solutions through a Support Vector Machine (SVM) classifier. The final clustering is obtained by consensus among the clusterings yielded by different kernel functions. The performance of the proposed multiobjective clustering method is compared with that of several other microarray clustering algorithms on three publicly available benchmark cancer datasets, and statistical significance tests are conducted to establish the statistical superiority of the proposed method. Relevant gene markers are then identified from the clustering result produced by the proposed method and demonstrated visually. Biological relationships among the gene markers are also studied based on gene ontology. The results are promising and may have an important impact on unsupervised cancer classification and on gene marker identification for multiple cancer subtypes.
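To make the idea of non-dominated clustering solutions concrete, here is a minimal Python sketch. It is an illustration only, not the authors' implementation: the objective definitions (total point-to-nearest-center distance for compactness, smallest center-to-center distance for separation), the function names, and the toy data are all assumptions chosen for brevity, and the genetic search and the SVM-based ensemble step are omitted.

```python
import math
from itertools import combinations

def compactness(points, centers):
    """Total distance from each point to its nearest center (lower is better).
    Assumed objective; the paper may use a different compactness measure."""
    return sum(min(math.dist(p, c) for c in centers) for p in points)

def separation(centers):
    """Smallest distance between any two centers (higher is better).
    Assumed objective; the paper may use a different separation measure."""
    return min(math.dist(a, b) for a, b in combinations(centers, 2))

def dominates(a, b):
    """a and b are (compactness, separation) pairs. a dominates b if it is
    no worse on both objectives and strictly better on at least one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

def non_dominated(scores):
    """Indices of the solutions that no other solution dominates."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]

# Hypothetical toy data: four samples and three candidate sets of cluster centers.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
candidates = [
    [(0.0, 0.5), (10.0, 0.5)],  # tight clusters, well separated
    [(0.0, 0.5), (12.0, 0.5)],  # looser clusters, even more separated
    [(4.0, 0.5), (6.0, 0.5)],   # loose clusters, poorly separated
]
scores = [(compactness(points, c), separation(c)) for c in candidates]
front = non_dominated(scores)
print(front)
```

In this toy case the first two candidates trade compactness against separation, so neither dominates the other, while the third is worse on both objectives and is excluded. A set like `front` is the kind of input the abstract's ensemble step would then combine.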
