Conference: Symposium on the Interface

Assessing Patient Survival Using Microarray Gene Expression Data Via Partial Least Squares Proportional Hazard Regression

Abstract

High-dimensional data sets from microarray experiments, in which the number of variables (genes) p far exceeds the number of samples N, render most traditional statistical tools of little direct use. However, some of these tools can be effective when combined with an appropriate dimension reduction method. In this paper we introduce the use of proportional hazard (PH) regression (Cox 1972) in conjunction with dimension reduction by partial least squares (PLS) for the case where the number of covariates p exceeds the number of samples N, a setting typical of gene expression data from DNA microarrays. Specifically, given a vector of response values that are times to event (death or censoring times) and p gene expressions (covariates), we address how to assess (estimate) the survival experience (curve) when N ≪ p. To cope with the high dimensionality, we take a two-stage approach: the first stage reduces the dimension with a component extraction method, and the second stage estimates the survival distribution with a PH regression model on the extracted components. The primary method of component extraction considered is PLS. PLS achieves dimension reduction by sequentially constructing components that maximize the covariance between the response (survival times) and a linear combination of the covariates (gene expressions). This is analogous to principal components analysis (PCA), except that the optimization criterion in PCA is the variance of the linear combination, whereas in PLS it is its covariance with the response. We apply the methodology to a diffuse large B-cell lymphoma (DLBCL) complementary DNA (cDNA) data set.
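The two-stage idea described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The function names (`pls_scores`, `cox_score_and_info`, `fit_cox_1d`) are hypothetical, the PLS stage uses the standard NIPALS algorithm for a single response, and the Cox stage fits only one component by Newton-Raphson on the partial likelihood. How censored times enter the PLS stage is not stated in the abstract; the sketch simply takes whatever response vector `y` it is given, which is an assumption.

```python
import math


def center_columns(X):
    """Subtract each column mean from an N x p matrix stored as a list of rows."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def pls_scores(X, y, k):
    """Stage 1 (assumed NIPALS PLS1): return k score vectors, each of length N.

    Each weight vector w maximizes the covariance between the current
    (deflated) response and the linear combination X w, subject to |w| = 1.
    """
    Xc = center_columns(X)
    ybar = sum(y) / len(y)
    yc = [v - ybar for v in y]
    n, p = len(Xc), len(Xc[0])
    scores = []
    for _ in range(k):
        # Weight vector: proportional to X^T y, normalized to unit length.
        w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        # Component scores t = X w.
        t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Deflate X and y so the next component is orthogonal to this one.
        load = [sum(Xc[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(t[i] * yc[i] for i in range(n)) / tt
        Xc = [[Xc[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        yc = [yc[i] - q * t[i] for i in range(n)]
        scores.append(t)
    return scores


def cox_score_and_info(beta, z, time, event):
    """First derivative and observed information of the Cox partial
    log-likelihood for one covariate z (tied times handled Breslow-style)."""
    score, info = 0.0, 0.0
    for i in range(len(z)):
        if not event[i]:
            continue  # censored subjects contribute only through risk sets
        risk = [j for j in range(len(z)) if time[j] >= time[i]]
        shift = max(beta * z[j] for j in risk)  # guards against exp overflow
        wts = [math.exp(beta * z[j] - shift) for j in risk]
        s0 = sum(wts)
        s1 = sum(wt * z[j] for wt, j in zip(wts, risk))
        s2 = sum(wt * z[j] ** 2 for wt, j in zip(wts, risk))
        score += z[i] - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return score, info


def fit_cox_1d(z, time, event, iters=50, tol=1e-10):
    """Stage 2 (single covariate): Newton-Raphson for the PH coefficient."""
    beta = 0.0
    for _ in range(iters):
        score, info = cox_score_and_info(beta, z, time, event)
        if abs(score) < tol or info <= 0:
            break
        beta += score / info
    return beta
```

A typical call would pass the first score vector from `pls_scores(X, y, k)` as `z` to `fit_cox_1d`, together with the observed times and event indicators. When N ≪ p the leading PLS score can track the response almost perfectly, in which case the partial likelihood may have no finite maximizer, so the number of components is usually chosen with held-out data. Fitting several components at once needs a multivariate Newton step or an established survival analysis library.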
