SPECTRAL CLUSTERING STRATEGIES FOR HETEROGENEOUS DISEASE EXPRESSION DATA

Abstract

Clustering of gene expression data simplifies subsequent data analyses and forms the basis of numerous approaches for biomarker identification, prediction of clinical outcome, and personalized therapeutic strategies. The most popular clustering methods, such as K-means and hierarchical clustering, are intuitive and easy to use, but they require arbitrary choices of parameters (the number of clusters for K-means, and a threshold at which to cut the tree for hierarchical clustering). Human disease gene expression data are in general more difficult to cluster efficiently because of background (genotype) heterogeneity, differences in disease stage and progression, and disease subtyping, all of which make gene expression datasets more heterogeneous. Spectral clustering has recently been introduced in many fields as a promising alternative to standard clustering methods. The idea is that pairwise comparisons can help reveal global features through eigenvector-based techniques. In this paper, we developed a new recursive K-means spectral clustering method (ReKS) for disease gene expression data. We benchmarked ReKS on three large-scale cancer datasets and compared it to different clustering methods with respect to execution time, background models, and external biological knowledge. We found ReKS to be superior to the hierarchical methods and as good as K-means, but much faster than either and without requiring a priori knowledge of K. Overall, ReKS offers an attractive alternative for efficient clustering of human disease data.
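To make the general idea concrete, the following is a minimal sketch of recursive spectral bipartitioning with 2-means, using only NumPy. It is not the authors' ReKS implementation, whose details the abstract does not give. The Gaussian affinity with a median-distance bandwidth, the stopping rule (`min_size`, `max_depth`), and all function names are assumptions made for illustration.

```python
import numpy as np


def gaussian_affinity(X):
    """Pairwise Gaussian similarities; median-distance bandwidth is an assumption."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    dist = np.sqrt(sq)
    sigma = np.median(dist[np.triu_indices_from(dist, k=1)])
    sigma = sigma if sigma > 0 else 1.0
    return np.exp(-sq / (2.0 * sigma ** 2))


def two_means(points, n_iter=50):
    """Lloyd iterations for k=2 with a deterministic farthest-point start."""
    first = points[0]
    second = points[np.argmax(np.sum((points - first) ** 2, axis=1))]
    centers = np.stack([first, second]).astype(float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        d = np.sum((points[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = np.argmin(d, axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                centers[k] = points[labels == k].mean(axis=0)
    return labels


def spectral_split(X):
    """Split the rows of X in two using the top two eigenvectors of the
    symmetrically normalized affinity matrix D^{-1/2} W D^{-1/2}."""
    W = gaussian_affinity(X)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    M = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    emb = vecs[:, -2:]                   # two largest eigenvalues
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    emb = emb / np.maximum(norms, 1e-12)  # row-normalize the embedding
    return two_means(emb)


def recursive_spectral(X, min_size=5, max_depth=3):
    """Recursively bipartition rows of X; the stopping rule is hypothetical."""
    labels = np.zeros(len(X), dtype=int)
    stack = [(np.arange(len(X)), 0)]
    next_label = 1
    while stack:
        idx, depth = stack.pop()
        if depth >= max_depth or len(idx) < 2 * min_size:
            continue
        split = spectral_split(X[idx])
        left, right = idx[split == 0], idx[split == 1]
        if min(len(left), len(right)) < min_size:
            continue  # reject splits that create very small clusters
        labels[right] = next_label
        next_label += 1
        stack.extend([(left, depth + 1), (right, depth + 1)])
    return labels
```

Here each row of `X` is one item to cluster (for example, one sample's expression profile). Because the recursion decides when to stop, the number of clusters is not fixed in advance, which reflects the abstract's point about not needing K a priori. A real method would need a principled stopping criterion rather than the size and depth limits assumed here.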