Conference: International Conference on Data Mining VI; 2005; Skiathos (GR)

Pleiotropic microarray gene expression data: advanced tandem multivariate data mining



Abstract

Microarray gene expression analysis produces massive amounts of data, and commonly employed techniques include cluster analysis computed from Pearson correlation coefficients. Several sophisticated multivariate techniques are based upon decomposition of correlation matrices and their resultant numeric products. Principal components analysis (PCA) is based upon correlation matrix decomposition and has been used to analyze microarray data; rotated factor analysis, however, has not been explored extensively. The present analysis uses previously published data on 42,427 genes that had been analyzed with cluster analysis. The data come from experiments on global microarray gene expression in embryos at three stages of development: days one, two, and three post-fertilization. The previously published work used cluster analysis to correctly classify observations by stage/day based on gene expression. Data on 22,561 genes were suitable for further multivariate analysis. In the present investigation, quartimax rotated factor analysis was used to extract five factors that paralleled the cluster analysis, with days of egg and embryo development loading on separate factors. Factor scores were computed for each gene on the five factors and used for modified gene shaving, or SVD (singular value decomposition). This identified supergenes that were responsible for the majority of variance across all five factors. Path analysis of factor scores suggested that five genes might be pleiotropic or regulatory. This proof-of-concept numerical analysis provides a basis for developing multivariate analytical techniques for microarray data that are more sophisticated than cluster analysis and that can evaluate causal paths of pleiotropic control of gene expression.
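The pipeline named in the abstract (correlation matrix decomposition, quartimax rotation, per-gene factor scores, then SVD of those scores) can be illustrated with a minimal NumPy sketch. This is not the authors' code. Everything in it is an assumption made for illustration: the synthetic data (500 genes, 9 samples, 3 factors instead of the paper's 22,561 genes and five factors), the principal-component extraction of loadings, the regression method for factor scores, the hypothetical helper `orthomax`, and the ranking of genes by their weight on the leading singular vector as a stand-in for "modified gene shaving".

```python
import numpy as np


def orthomax(loadings, gamma=0.0, max_iter=100, tol=1e-8):
    """Orthogonal rotation of a loading matrix.

    gamma=0 gives the quartimax criterion, gamma=1 gives varimax.
    Hypothetical helper written for this sketch, not taken from the paper.
    """
    p, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.diag((rotated ** 2).sum(axis=0))
        gradient = loadings.T @ (rotated ** 3 - (gamma / p) * rotated @ col_ss)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # closest orthogonal matrix to the gradient
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):
            break
        objective = new_objective
    return loadings @ rotation, rotation


# Synthetic stand-in data (assumption): rows are genes, columns are samples,
# three replicate samples for each of three developmental days.
rng = np.random.default_rng(0)
n_genes, n_days, n_reps = 500, 3, 3
k = n_days  # number of factors to retain in this toy setup
latent = rng.normal(size=(n_genes, n_days))
X = np.repeat(latent, n_reps, axis=1) + 0.5 * rng.normal(size=(n_genes, n_days * n_reps))

# Standardize samples and decompose their correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1][:k]  # indices of the k largest eigenvalues
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

# Quartimax rotation of the retained loadings.
rotated, rotation = orthomax(loadings, gamma=0.0)

# Regression-method factor scores: one row per gene, one column per factor.
scores = Z @ np.linalg.solve(R, rotated)

# SVD of the factor-score matrix; genes with the largest absolute weight on
# the leading left singular vector are taken as candidate "supergenes".
U, s, Vt = np.linalg.svd(scores, full_matrices=False)
explained = s ** 2 / (s ** 2).sum()
top_genes = np.argsort(np.abs(U[:, 0]))[::-1][:10]

print("share of score variance per singular component:", np.round(explained, 3))
print("indices of top-weighted genes:", top_genes)
```

In this toy setup the samples from each day should load mainly on their own rotated factor, which mirrors the abstract's statement that days of development loaded on separate factors. The path analysis step is not sketched here, since the abstract gives no detail on the model used.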
