Journal: BMC Bioinformatics

Combining techniques for screening and evaluating interaction terms on high-dimensional time-to-event data

Abstract

Background: Molecular data, e.g. arising from microarray technology, are often used for predicting survival probabilities of patients. For multivariate risk prediction models on such high-dimensional data, there are established techniques that combine parameter estimation and variable selection. One big challenge is to incorporate interactions into such prediction models. In this feasibility study, we present building blocks for evaluating and incorporating interaction terms in high-dimensional time-to-event settings, especially for settings in which it is computationally too expensive to check all possible interactions.

Results: We use a boosting technique for estimation of effects and the following building blocks for pre-selecting interactions: (1) resampling, (2) random forests and (3) orthogonalization as a data pre-processing step. In a simulation study, the strategy that uses all building blocks is able to detect true main effects and interactions with high sensitivity in different kinds of scenarios. The main challenge is posed by interactions composed of variables that do not represent main effects, but our findings are also promising in this regard. Results on real-world data illustrate that effect sizes of interactions may frequently not be large enough to improve prediction performance, even though the interactions are potentially of biological relevance.

Conclusion: Screening interactions through random forests is feasible and useful when one is interested in finding relevant two-way interactions. The other building blocks also contribute considerably to an enhanced pre-selection of interactions. We determined the limits of interaction detection in terms of necessary effect sizes. Our study emphasizes the importance of making full use of existing methods in addition to establishing new ones.
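To make the computational argument and the resampling building block more concrete, the following is a minimal sketch and not the authors' implementation. It counts how many two-way interaction terms exist for a given number of variables, and shows one plausible way resampling could be used to pre-select candidates: run a variable selector on repeated subsamples, keep variables with a high inclusion frequency, and pair them up. The function names, the `select` callback, the subsample fraction of 0.632 and the threshold of 0.5 are all assumptions made for illustration; the abstract does not specify any of them.

```python
import random
from collections import Counter
from itertools import combinations
from math import comb


def n_two_way_interactions(p):
    """Number of distinct two-way interaction terms among p variables."""
    return comb(p, 2)


def inclusion_frequencies(n_samples, select, n_resamples=50, fraction=0.632, seed=0):
    """Run a variable selector on random subsamples and count selections.

    `select` is a hypothetical callback (an assumption of this sketch): it
    receives a list of row indices and returns the selected variable indices,
    e.g. from a boosting fit on those rows.
    """
    rng = random.Random(seed)
    size = max(1, int(fraction * n_samples))
    counts = Counter()
    for _ in range(n_resamples):
        rows = rng.sample(range(n_samples), size)  # subsample without replacement
        counts.update(set(select(rows)))
    return {var: c / n_resamples for var, c in counts.items()}


def candidate_interactions(frequencies, threshold=0.5):
    """Pair up variables whose inclusion frequency reaches the threshold."""
    stable = sorted(v for v, f in frequencies.items() if f >= threshold)
    return list(combinations(stable, 2))


if __name__ == "__main__":
    # With 20,000 variables, an exhaustive check would cover this many pairs.
    print(n_two_way_interactions(20000))  # 199990000

    # Toy selector (purely illustrative): always picks variables 0, 3 and 7.
    freqs = inclusion_frequencies(100, lambda rows: [0, 3, 7])
    print(candidate_interactions(freqs))
```

Pairing only stable main-effect variables would, by construction, miss interactions between variables that have no main effect, which the abstract names as the main challenge. The sketch does not cover the random-forest screening or the orthogonalization step.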