BMC Bioinformatics

Clustering of the SOM easily reveals distinct gene expression patterns: results of a reanalysis of lymphoma study


Abstract

Background: Methods for evaluating and analyzing the large volumes of data generated by series of microarray experiments are essential for revealing hidden patterns of gene expression. Because microarray gene expression profiles are complex and high-dimensional, reducing the dimensionality of the raw expression data and selecting the features needed for tasks such as classifying disease samples remain a challenge. To address this problem, we propose a two-level analysis. First, a self-organizing map (SOM) is used. The SOM is a vector quantization method that simplifies and reduces the dimensionality of the original measurements and visualizes each individual tumor sample in a SOM component plane. Next, hierarchical clustering and K-means clustering are used to identify patterns of gene expression useful for classifying samples.

Results: We tested the two-level analysis on public data from diffuse large B-cell lymphomas. The analysis easily distinguished four major gene expression patterns without the need for supervision: a germinal center-related pattern, a proliferation pattern, an inflammatory pattern, and a plasma cell differentiation-related pattern. The first three matched the patterns described in the original publication, which used supervised clustering analysis, whereas the fourth was novel.

Conclusions: Our study shows that using the SOM as an intermediate step in the analysis of genome-wide gene expression data makes the gene expression patterns easier to reveal. The "expression display" provided by the SOM component plane summarises the complicated data in a way that allows the clinician to evaluate the classification options rather than giving a fixed diagnosis.
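The two-level analysis described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no map size, training schedule, or software, so the grid dimensions, learning-rate and radius decay, the toy data, and all function names (`train_som`, `best_unit`, `kmeans`) are assumptions made for illustration. The sketch also assumes that genes are the input vectors and samples are the vector components, so that one component plane corresponds to one sample. Only K-means is shown for the second level; hierarchical clustering of the prototypes would slot in at the same place.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_unit(som, x):
    """Grid coordinate of the map unit whose prototype is closest to x."""
    return min(som, key=lambda unit: sq_dist(som[unit], x))


def train_som(data, rows, cols, epochs=30, lr0=0.5, seed=0):
    """Level 1: fit a rows x cols SOM; returns {(row, col): prototype}."""
    rng = random.Random(seed)
    # Initialise every unit with a copy of a randomly chosen data vector.
    som = {(r, c): list(rng.choice(data))
           for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # learning rate decays to 0
            sigma = sigma0 + (0.5 - sigma0) * frac  # radius shrinks to 0.5
            br, bc = best_unit(som, x)
            for (r, c), w in som.items():
                # Gaussian neighbourhood around the best-matching unit.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
                for i, xi in enumerate(x):
                    w[i] += lr * h * (xi - w[i])
            step += 1
    return som


def kmeans(points, k, iters=100, seed=0):
    """Level 2: plain Lloyd's K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
               for p in points]
        if new == labels:
            break
        labels = new
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy data (not from the study): 40 "genes" measured in 4 "samples".
    up = [[rng.gauss(1, 0.1), rng.gauss(1, 0.1),
           rng.gauss(-1, 0.1), rng.gauss(-1, 0.1)] for _ in range(20)]
    down = [[-v for v in g] for g in up]
    genes = up + down

    som = train_som(genes, rows=3, cols=3)
    units = sorted(som)
    # Cluster the SOM prototypes, then give each gene its unit's label.
    unit_label = dict(zip(units, kmeans([som[u] for u in units], k=2)))
    gene_label = [unit_label[best_unit(som, g)] for g in genes]
    print(gene_label)

    # Component plane for sample 0: one prototype value per map unit.
    plane = [[round(som[(r, c)][0], 2) for c in range(3)] for r in range(3)]
    for row in plane:
        print(row)
```

Clustering the handful of SOM prototypes rather than the raw profiles is what makes the second level cheap: the map has already compressed the genes into a small grid of representative vectors, and each component plane can be read as a picture of one sample's expression across that grid.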
