Finding Factors Influencing Risk: Comparing Variable Selection Methods Applied to Logistic Regression Models of Cases and Controls



Abstract

When modeling the risk of a disease, the choice of which factors to include can heavily affect the results. This study compares the performance of several variable selection techniques applied to logistic regression. We performed realistic simulation studies to compare five methods of variable selection: (1) a confidence interval approach for significant coefficients (CI), (2) backward selection, (3) forward selection, (4) stepwise selection, and (5) Bayesian stochastic search variable selection (SSVS) using both informed and uninformed priors. We defined our simulated diseases to mimic odds ratios for cancer risk found in the literature for environmental factors, such as smoking; dietary risk factors, such as fiber; genetic risk factors, such as XPD; and interactions. We modeled the distribution of our covariates, including correlation, after the reported empirical distributions of these risk factors. We also used a null data set to calibrate the priors of the Bayesian method and evaluate its sensitivity. Of the standard methods (95% CI, backward, forward, and stepwise selection), the CI approach yielded the highest average percentage of correct associations and the lowest average percentage of incorrect associations. SSVS with an informed prior had a higher average percentage of correct associations and a lower average percentage of incorrect associations than the CI approach. This study shows that Bayesian methods offer a way to use prior information to both increase power and decrease false-positive results when selecting factors to model complex disease risk.
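
To make the CI approach concrete, here is a minimal Python sketch of one plausible reading of it: fit a logistic regression, then keep every factor whose 95% Wald confidence interval for the coefficient excludes zero. This is an illustration under stated assumptions, not the study's own code. The function names, the sample size, the simulated odds ratio of 3 for smoking, and the pure-noise covariate are all hypothetical, and the data are drawn from a simple population model rather than a matched case-control design.

```python
import math
import random


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(a)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_logistic(X, y, iters=50, tol=1e-10):
    """Newton-Raphson maximum likelihood fit.

    Returns (coefficients, standard errors). The standard errors come from
    the inverse information matrix evaluated at the last iterate before the
    final (negligible) update.
    """
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    info[j][k] += w * xi[j] * xi[k]
        cov = invert(info)
        step = [sum(cov[j][k] * grad[k] for k in range(p)) for j in range(p)]
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    se = [math.sqrt(cov[j][j]) for j in range(p)]
    return beta, se


def select_by_ci(names, beta, se, z=1.96):
    """Keep factors whose 95% Wald CI for the coefficient excludes zero.

    Returns {name: (lower, upper)} on the odds-ratio scale.
    """
    selected = {}
    for name, b, s in zip(names, beta, se):
        lo, hi = b - z * s, b + z * s
        if lo > 0 or hi < 0:
            selected[name] = (math.exp(lo), math.exp(hi))
    return selected


# Hypothetical simulated data: smoking has a true odds ratio of 3,
# while "noise" has no effect on disease status.
random.seed(0)
X, y = [], []
for _ in range(2000):
    smoking = 1.0 if random.random() < 0.3 else 0.0
    noise = random.gauss(0.0, 1.0)
    eta = -1.0 + math.log(3.0) * smoking
    prob = 1.0 / (1.0 + math.exp(-eta))
    X.append([1.0, smoking, noise])  # leading 1.0 is the intercept column
    y.append(1 if random.random() < prob else 0)

beta, se = fit_logistic(X, y)
# Skip the intercept when deciding which factors to report.
selected = select_by_ci(["smoking", "noise"], beta[1:], se[1:])
for name, (lo, hi) in selected.items():
    print(f"{name}: 95% CI for odds ratio = ({lo:.2f}, {hi:.2f})")
```

At the 95% level a null factor such as `noise` is still expected to be selected in roughly 5% of simulated data sets. That false-positive rate is the kind of quantity the study's "incorrect associations" measure tracks across methods.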
