Journal: BMC Bioinformatics

Priority-Lasso: a simple hierarchical approach to the prediction of clinical outcome using multi-omics data

Abstract

The inclusion of high-dimensional omics data in prediction models has become a well-studied topic in the last decades. Although most of these methods do not account for possibly different types of variables in the set of covariates available in the same dataset, there are many such scenarios where the variables can be structured in blocks of different types, e.g., clinical, transcriptomic, and methylation data. To date, there exist a few computationally intensive approaches that make use of block structures of this kind. In this paper we present priority-Lasso, an intuitive and practical analysis strategy for building prediction models based on Lasso that takes such block structures into account. It requires the definition of a priority order of blocks of data. Lasso models are calculated successively for every block and the fitted values of every step are included as an offset in the fit of the next step. We apply priority-Lasso in different settings on an acute myeloid leukemia (AML) dataset consisting of clinical variables, cytogenetics, gene mutations and expression variables, and compare its performance on an independent validation dataset to the performance of standard Lasso models. The results show that priority-Lasso is able to keep pace with Lasso in terms of prediction accuracy. Variables of blocks with higher priorities are favored over variables of blocks with lower priority, which results in easily usable and transportable models for clinical practice.
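
The successive fitting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a continuous (Gaussian) outcome, for which using the previous fitted values as an offset is equivalent to fitting the next block to the remaining residual. It also assumes a fixed, hand-picked penalty per block and omits the intercept. The names `soft_threshold`, `lasso_cd`, `priority_lasso`, `blocks` and `lams` are hypothetical and chosen only for this illustration.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam (the Lasso proximal step)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, r, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||r - X b||^2 + lam*||b||_1, no intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = [float(v) for v in r]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j added back).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def priority_lasso(X, y, blocks, lams):
    """Fit one Lasso per block, in priority order (assumed: Gaussian outcome).

    blocks: lists of column indices, highest priority first.
    lams:   one fixed penalty per block (an assumption; not tuned here).
    """
    n = len(y)
    offset = [0.0] * n  # fitted values carried over from earlier blocks
    coefs = {}
    for block, lam in zip(blocks, lams):
        Xb = [[row[j] for j in block] for row in X]
        # With a Gaussian outcome, an offset amounts to fitting y - offset.
        target = [y[i] - offset[i] for i in range(n)]
        beta = lasso_cd(Xb, target, lam)
        for j, b in zip(block, beta):
            coefs[j] = b
        offset = [
            offset[i] + sum(Xb[i][k] * beta[k] for k in range(len(block)))
            for i in range(n)
        ]
    return coefs, offset


# Toy data: two orthogonal columns, y = 2*x0 + 1*x1.
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [3, -1, 1, -3]
coefs, fitted = priority_lasso(X, y, blocks=[[0], [1]], lams=[0.5, 0.5])
print(coefs)  # {0: 1.5, 1: 0.5}
```

In this toy run the first block is fitted to `y` alone, and the second block only sees what the first left unexplained, which is how higher-priority variables end up favored. Raising the second penalty (for example `lams=[0.5, 5.0]`) drops the lower-priority variable entirely.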
