Journal: Bioinformatics

Unified methods for feature selection in large-scale genomic studies with censored survival outcomes


Abstract

Motivation: One of the major goals in large-scale genomic studies is to identify genes that have a prognostic impact on time-to-event outcomes and thereby provide insight into the disease process. With rapid developments in high-throughput genomic technologies over the past two decades, the scientific community can now monitor the expression levels of tens of thousands of genes and proteins, resulting in enormous datasets in which the number of genomic features far exceeds the number of subjects. Methods based on univariate Cox regression are often used to select genomic features related to survival outcome; however, the Cox model assumes proportional hazards (PH), which is unlikely to hold for every feature. When applied to genomic features exhibiting some form of non-proportional hazards (NPH), these methods could lead to under- or over-estimation of the effects. We propose a broad array of marginal screening techniques that aid in feature ranking and selection by accommodating various forms of NPH. First, we develop an approach based on Kullback-Leibler information divergence and the Yang-Prentice model that includes methods for the PH and proportional odds (PO) models as special cases. Next, we propose R² measures for the PH and PO models that can be interpreted in terms of explained randomness. Lastly, we propose a generalized pseudo-R² index that includes PH, PO, crossing hazards and crossing odds models as special cases and can be interpreted as the percentage of separability between subjects who experience the event and those who do not, according to feature measurements.
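For orientation, the conventional baseline that the abstract contrasts against, marginal screening with a univariate Cox model, can be sketched roughly as follows. This is not the authors' proposed method. It is a minimal pure-Python illustration that ranks features by the Cox score test statistic at beta = 0, with Breslow handling of ties. The function names `cox_score_statistic` and `rank_features` and the toy data are assumptions made up for this example.

```python
from typing import List, Sequence


def cox_score_statistic(times: Sequence[float],
                        events: Sequence[int],
                        x: Sequence[float]) -> float:
    """Score test statistic for beta = 0 in a univariate Cox model.

    Ties are handled with the Breslow convention. Returns U^2 / V, which is
    approximately chi-squared with 1 degree of freedom under the null.
    """
    score = 0.0      # U: sum over events of (x_i - risk-set mean of x)
    variance = 0.0   # V: sum over events of the risk-set variance of x
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_set = [x[j] for j, t_j in enumerate(times) if t_j >= t_i]
        mean = sum(risk_set) / len(risk_set)
        score += x[i] - mean
        variance += sum((v - mean) ** 2 for v in risk_set) / len(risk_set)
    return score * score / variance if variance > 0 else 0.0


def rank_features(times: Sequence[float],
                  events: Sequence[int],
                  features: List[Sequence[float]]) -> List[int]:
    """Return feature indices sorted from strongest to weakest marginal signal."""
    stats = [cox_score_statistic(times, events, f) for f in features]
    return sorted(range(len(features)), key=lambda k: stats[k], reverse=True)


# Toy data, invented for illustration: 6 subjects, 2 features.
times = [2.0, 3.0, 5.0, 7.0, 8.0, 11.0]
events = [1, 1, 0, 1, 1, 0]           # 1 = event observed, 0 = censored
features = [
    [3.1, 2.8, 2.5, 1.9, 1.2, 0.7],   # decreases steadily with follow-up time
    [0.4, 1.5, 0.9, 1.1, 0.6, 1.0],   # no obvious pattern
]
ranking = rank_features(times, events, features)
print(ranking)
```

On this toy data the first feature ranks ahead of the second. Because the statistic is derived under the PH assumption, it can understate features whose effects change or cross over time, which is the limitation the abstract's NPH-aware screening measures are intended to address.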
