Propensity score prediction for electronic healthcare databases using super learner and high-dimensional propensity score methods

Ju Cheng; Combs Mary; Lendle Samuel D.; Franklin Jessica M.; Wyss Richard; Schneeweiss Sebastian; van der Laan Mark J.

Abstract

The optimal learner for prediction modeling varies depending on the underlying data-generating distribution. Super Learner (SL) is a generic ensemble learning algorithm that uses cross-validation to select among a 'library' of candidate prediction models. While SL has been widely studied in a number of settings, it has not been thoroughly evaluated in large electronic healthcare databases that are common in pharmacoepidemiology and comparative effectiveness research. In this study, we applied and evaluated the performance of SL in its ability to predict the propensity score (PS), the conditional probability of treatment assignment given baseline covariates, using three electronic healthcare databases. We considered a library of algorithms that consisted of both nonparametric and parametric models. We also proposed a novel strategy for prediction modeling that combines SL with the high-dimensional propensity score (hdPS) variable selection algorithm. Predictive performance was assessed using three metrics: the negative log-likelihood, area under the curve (AUC), and time complexity. Results showed that the best individual algorithm, in terms of predictive performance, varied across datasets. The SL was able to adapt to the given dataset and optimize predictive performance relative to any individual learner. Combining the SL with the hdPS was the most consistent prediction method and may be promising for PS estimation and prediction modeling in electronic healthcare databases.
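
The selection step and two of the metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a discrete variant that simply picks the candidate with the lowest cross-validated negative log-likelihood. The data are simulated, and the two toy candidates (`fit_marginal`, `fit_logistic`) and all function names are assumptions made for illustration. The hdPS variable selection step is not shown.

```python
import math
import random

def neg_log_lik(y, p, eps=1e-12):
    """Mean negative Bernoulli log-likelihood of treatment y under predicted PS p."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return total / len(y)

def auc(y, p):
    """AUC: chance that a treated unit outscores an untreated one (ties count 1/2)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def fit_marginal(X, y):
    """Hypothetical candidate 1: ignore covariates, predict the treatment rate."""
    rate = sum(y) / len(y)
    return lambda x: rate

def fit_logistic(X, y, steps=300, lr=0.5):
    """Hypothetical candidate 2: main-terms logistic regression via gradient descent."""
    d, n = len(X[0]), len(y)
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            gb += err
            for j in range(d):
                gw[j] += err * xi[j]
        b -= lr * gb / n
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
    return lambda x: 1.0 / (1.0 + math.exp(-(b + sum(wj * xj for wj, xj in zip(w, x)))))

def discrete_super_learner(X, y, library, k=5):
    """Return each candidate's k-fold CV risk and the name of the lowest-risk candidate."""
    n = len(y)
    folds = [list(range(i, n, k)) for i in range(k)]
    cv_risk = {}
    for name, fit in library.items():
        preds = [0.0] * n
        for fold in folds:
            held = set(fold)
            train = [i for i in range(n) if i not in held]
            model = fit([X[i] for i in train], [y[i] for i in train])
            for i in fold:  # predict only on the held-out fold
                preds[i] = model(X[i])
        cv_risk[name] = neg_log_lik(y, preds)
    return cv_risk, min(cv_risk, key=cv_risk.get)

# Simulated example (assumed data-generating process, not from the paper).
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(400)]
y = [1 if random.random() < 1 / (1 + math.exp(-(0.5 + 1.5 * x[0] - x[1]))) else 0
     for x in X]
library = {"marginal": fit_marginal, "logistic": fit_logistic}
cv_risk, best = discrete_super_learner(X, y, library)
print(best, cv_risk)
```

The full Super Learner typically fits a weighted combination of the candidates' cross-validated predictions rather than choosing a single winner; the discrete version above is used only because it keeps the cross-validation logic easy to follow.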


Bibliographic details

Source
Journal of Applied Statistics, 2019, issue 12, pp. 2216-2236 (21 pages)
Authors
Ju Cheng; Combs Mary; Lendle Samuel D.; Franklin Jessica M.; Wyss Richard; Schneeweiss Sebastian; van der Laan Mark J.
Author affiliations

Univ Calif Berkeley, Div Biostat, Berkeley, CA 94720 USA;

Univ Calif Berkeley, Div Biostat, Berkeley, CA 94720 USA;

Univ Calif Berkeley, Div Biostat, Berkeley, CA 94720 USA;

Brigham & Womens Hosp, Dept Med, Div Pharmacoepidemiol & Pharmacoecon, 75 Francis St, Boston, MA 02115 USA|Harvard Med Sch, Boston, MA 02115 USA;

Brigham & Womens Hosp, Dept Med, Div Pharmacoepidemiol & Pharmacoecon, 75 Francis St, Boston, MA 02115 USA|Harvard Med Sch, Boston, MA 02115 USA;

Brigham & Womens Hosp, Dept Med, Div Pharmacoepidemiol & Pharmacoecon, 75 Francis St, Boston, MA 02115 USA|Harvard Med Sch, Boston, MA 02115 USA;

Univ Calif Berkeley, Div Biostat, Berkeley, CA 94720 USA;

Original format: PDF
Language: English
Keywords
Machine learning; ensemble learning; propensity score; observational study; electronic healthcare database

Similar literature

1. Propensity score prediction for electronic healthcare databases using super learner and high-dimensional propensity score methods [J]. Ju Cheng, Combs Mary, Lendle Samuel D. Journal of Applied Statistics. 2019, issue 12.
2. Using Super Learner Prediction Modeling to Improve High-dimensional Propensity Score Estimation [J]. Wyss Richard, Schneeweiss Sebastian, van der Laan Mark. Epidemiology. 2018, issue 1.
3. Head to head comparison of the propensity score and the high-dimensional propensity score matching methods [J]. Jason R. Guertin, Elham Rahme, Colin R. Dormuth. BMC Medical Research Methodology. 2016, issue 1.
4. The Effect of Latent Binary Variables on the Uncertainty of the Prediction of a Dichotomous Outcome Using Logistic Regression Based Propensity Score Matching [C]. Szabolcs Szeker, Agnes Vathy-Fogarassy. eHealth Conference. 2018.
5. The Effect of Different Relative Logistic Regression Generated Propensity Score Distributions on the Performance of Propensity Score Methods [D]. An, Ji. 2020.
6. Head to head comparison of the propensity score and the high-dimensional propensity score matching methods [O]. Jason R. Guertin, Elham Rahme, Colin R. Dormuth. 2016.
7. Propensity score prediction for electronic healthcare databases using super learner and high-dimensional propensity score methods [O]. Cheng Ju, Mary Combs, Samuel D. Lendle. 2019.
