Optimizing predictive performance of criminal recidivism models using registration data with binary and survival outcomes

Abstract

In recidivism prediction, there is no consensus on which modeling strategy yields an optimal prediction model. Previous papers benchmarked a range of statistical and machine learning techniques on recidivism data with a binary outcome. However, two important tree ensemble methods, namely gradient boosting and random forests, were not extensively evaluated. In this paper, we further explore the modeling potential of these techniques in the binary outcome criminal prediction context. Additionally, we explore the predictive potential of classical statistical and machine learning methods for censored time-to-event data. A range of manually specified statistical and (semi-)automatic machine learning models is fitted on Dutch recidivism data, both for the binary outcome case and the censored outcome case. To enhance generalizability of results, the same models are applied to two historical American data sets, the North Carolina prison data. For all data sets, (semi-)automatic modeling in the binary case seems to provide no improvement over an appropriately manually specified traditional statistical model. There is, however, evidence of slightly improved performance of gradient boosting in survival data. Results on the reconviction data from two sources suggest that both statistical and machine learning methods should be tried out for obtaining an optimal model. Even if a flexible black-box model does not improve upon the predictions of a manually specified model, it can serve as a test of whether important interactions are missing or other misspecifications of the model are present, and can thus provide more security in the modeling process.
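
The abstract compares models on binary and censored time-to-event outcomes but does not state which performance measures were used. As an illustrative assumption only, the sketch below shows two measures commonly used for such comparisons: the area under the ROC curve (AUC) for a binary outcome and Harrell's concordance index for right-censored data. The function names and the toy data are hypothetical and are not taken from the paper; pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def auc(scores, labels):
    """AUC for a binary outcome: the share of (positive, negative) pairs in
    which the positive case has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def concordance_index(risk, time, event):
    """Harrell's C for right-censored data.

    A pair is comparable only if the subject with the shorter follow-up time
    had an observed event (event == 1). The pair is concordant when that
    subject also has the higher predicted risk; tied risks count one half.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(time)), 2):
        # a is the subject with the shorter follow-up time
        a, b = (i, j) if time[i] <= time[j] else (j, i)
        if time[a] == time[b] or not event[a]:
            continue  # tied times, or the earlier subject was censored
        comparable += 1
        if risk[a] > risk[b]:
            concordant += 1.0
        elif risk[a] == risk[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, for illustration only.
toy_scores = [0.9, 0.4, 0.35, 0.6, 0.2]   # predicted reconviction probability
toy_labels = [1, 1, 0, 0, 0]              # 1 = reconvicted within follow-up

toy_risk = [0.9, 0.3, 0.2, 0.1, 0.4]      # predicted risk score
toy_time = [2, 5, 3, 8, 6]                # follow-up time
toy_event = [1, 0, 1, 1, 0]               # 1 = reconviction observed, 0 = censored

if __name__ == "__main__":
    print("AUC:", auc(toy_scores, toy_labels))
    print("C-index:", concordance_index(toy_risk, toy_time, toy_event))
```

Both measures range from 0 to 1, with 0.5 corresponding to random ranking, so a manually specified model and a flexible machine learning model can be compared on the same scale for either outcome type.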

Bibliographic details

Journal: other
Authors: Nikolaj Tollenaar; Peter G. M. van der Heijden
Volume (issue): 14 (3)
Pages: e0213245
Total pages: 37
Original format: PDF

Similar literature

1. Carbonic anhydrase IX, hypoxia-inducible factor-1alpha, ezrin and glucose transporter-1 as predictors of disease outcome in rectal cancer: multivariate Cox survival models following data reduction by principal component analysis of the clinicopathological predictors. [J]. Korkeila EA, Sundstrom J, Pyrhonen S, Anticancer Research: International Journal of Cancer Research and Treatment. 2011, issue 12
2. A Bayesian model for misclassified binary outcomes and correlated survival data with applications to breast cancer [J]. Luo S., Yi M., Huang X., Hunt K., Statistics in Medicine. 2013, issue 13
3. Assessing the Predictive Performance of Survival Models with Longitudinal Data [C]. Ipek Guler, Christel Faes, Francisco Gude, International Conference on Computational Science and Its Applications. 2017
4. Bayesian joint modeling of longitudinal visual field data with correlated binary and survival outcomes. [D]. Ledahl, Jeffrey S. 2015
5. A Bayesian Model for Misclassified Binary Outcomes and Correlated Survival Data with Applications to Breast Cancer [O]. Sheng Luo, Min Yi, Xuelin Huang
6. Predicting Criminal Recidivism Using "Split Population" Survival Time Models [O]. Peter Schmidt, Ann Dryden Witte. 1987
