BMC Bioinformatics

Comparison of pathway and gene-level models for cancer prognosis prediction

Abstract

Cancer prognosis prediction is valuable for patients and clinicians because it allows them to appropriately manage care. A promising direction for improving the performance and interpretation of expression-based predictive models involves the aggregation of gene-level data into biological pathways. While many studies have used pathway-level predictors for cancer survival analysis, a comprehensive comparison of pathway-level and gene-level prognostic models has not been performed. To address this gap, we characterized the performance of penalized Cox proportional hazards models built using either pathway- or gene-level predictors for the cancers profiled in The Cancer Genome Atlas (TCGA) and pathways from the Molecular Signatures Database (MSigDB). When analyzing TCGA data, we found that pathway-level models are more parsimonious, more robust, more computationally efficient and easier to interpret than gene-level models with similar predictive performance. For example, both pathway-level and gene-level models have an average Cox concordance index of ~0.85 for the TCGA glioma cohort; however, the gene-level model has twice as many predictors on average, its predictor composition is less stable across cross-validation folds, and its estimation takes 40 times as long as that of the pathway-level model. When the complex correlation structure of the data is broken by permutation, the pathway-level model has greater predictive performance while still retaining superior interpretative power, robustness, parsimony and computational efficiency relative to the gene-level models. For example, using survival times simulated from uncorrelated gene expression data for the TCGA glioma cohort, the average concordance index of the pathway-level model increases to 0.88 while that of the gene-level model falls to 0.56.
The results of this study show that when the correlations among gene expression values are low, pathway-level analyses can yield better predictive performance, greater interpretative power, more robust models and lower computational cost than a gene-level model. When correlations among genes are high, a pathway-level analysis provides predictive power equivalent to that of a gene-level analysis while retaining the advantages of interpretability, robustness and computational efficiency.
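
The concordance index quoted above can be illustrated with a small sketch. This is a minimal pure-Python version of Harrell's C, not the authors' implementation: the function name `concordance_index`, its argument names, and the simplified handling of tied survival times (tied pairs are skipped) are assumptions made for illustration. A value of 1.0 means that higher predicted risk always goes with earlier failure, and 0.5 corresponds to random ordering.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data (illustrative sketch).

    times       -- observed follow-up times
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- model output; higher means higher predicted hazard
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: pairs with tied times are skipped. A pair is also not
        # comparable when the shorter time is censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier failure had the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Here `risk_scores` would typically be the linear predictor of a fitted Cox model evaluated on held-out samples, so the same function applies whether the predictors are genes or pathway-level scores.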
