Journal: Current Bioinformatics

Identification of Robust Clustering Methods in Gene Expression Data Analysis



Abstract

Background: Cluster analysis of gene expression microarray data is of increasing interest in bioinformatics. One reason is the need for molecular-based refinement of broadly defined biological classes, with implications for cancer diagnosis, prognosis, and treatment. Many algorithms have been developed for this problem. Objective: Microarray data frequently include outliers, however, and it is unclear how their effects should be handled in the subsequent cluster analysis. Method: In this paper, we present a large-scale analysis of seven agglomerative hierarchical clustering methods and five proximity measures applied to 33 cancer gene expression datasets. As a case study, we used two experimental datasets, Affymetrix and cDNA, to which different percentages of outliers were artificially added. Results: We found that the Ward method gives the highest corrected Rand index value when combined with the Spearman proximity measure, both for datasets with outliers and for datasets without them. Conclusion: This study shows that the Ward method is more robust than the other clustering methods considered for gene expression data analysis.
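The abstract ranks clustering methods by their corrected Rand index, which scores how well a clustering agrees with known class labels after correcting for chance agreement. The sketch below is illustrative only and is not the authors' code. It assumes that "corrected Rand index" refers to the Hubert and Arabie adjusted Rand index, and the function name `corrected_rand_index` and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def corrected_rand_index(labels_true, labels_pred):
    """Chance-corrected Rand index between two partitions of the same samples.

    Assumption: this is the Hubert-Arabie adjusted Rand index; the abstract
    does not spell out the formula it uses.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: the partitions agree trivially

    # Contingency table counts and its row/column marginals.
    cell_counts = Counter(zip(labels_true, labels_pred))
    row_counts = Counter(labels_true)
    col_counts = Counter(labels_pred)

    # Number of sample pairs placed together in both / in each partition.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_rows = sum(comb(c, 2) for c in row_counts.values())
    sum_cols = sum(comb(c, 2) for c in col_counts.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2         # agreement of a perfect match

    if max_index == expected:
        return 1.0  # both partitions are trivial and identical
    return (sum_cells - expected) / (max_index - expected)


# Toy example: known tumour classes versus cluster assignments (made-up labels).
known_classes = ["A", "A", "A", "B", "B", "B"]
cluster_ids = [1, 1, 2, 2, 2, 2]
print(corrected_rand_index(known_classes, cluster_ids))
```

A value of 1 means the clustering reproduces the known classes exactly (up to relabelling), and values near 0 mean agreement no better than chance. In a comparison like the one described, this score would be computed for each combination of linkage method and proximity measure, on each dataset with and without the added outliers.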
