Localized sampling for hospital re-admission prediction with imbalanced sample distributions

Abstract

Hospital re-admission refers to a medical event in which a patient previously discharged from the hospital is readmitted within a short period of time (say, 30 days). A re-admission not only degrades the patient's quality of life but also adds a significant financial burden to the health care system. Many systems now use computational approaches to predict the likelihood that a patient will be readmitted, in order to support medical decision making. When building predictive models for hospital re-admission, one essential challenge is that the sample distribution is severely imbalanced: typically, fewer than 10% of patients are readmitted in the near future. A predictive model that ignores this imbalance is unlikely to produce accurate predictions. To date, no existing re-admission model has explicitly addressed this data imbalance issue. In this paper, we consider hospital re-admission prediction with imbalanced sample distributions and propose a localized sampling approach to help build accurate predictive models. Localized sampling emphasizes samples that are difficult to classify and biases the sampling process toward such instances. Finding instances that are difficult to classify requires calculating distances between instances, and the high dimensionality of Electronic Health Records (EHR) data makes such distance calculations ineffective. We therefore propose to use latent topic embedding to map the samples from the high-dimensional space into a low-dimensional space of a handful of topics, where distances between instances can be calculated effectively and accurately. By using localized sampling to build multiple balanced versions of the dataset, we are able to train multiple predictive models and combine their results for prediction. Experiments and comparisons on data collected from several South Florida regional hospitals demonstrate the performance of our method.
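The abstract does not spell out the exact procedure, so the following is only a minimal sketch of the general idea, under several assumptions that are not from the paper: each patient record is assumed to be already mapped to a low-dimensional topic vector (for example by a topic model), "difficulty" is approximated by the fraction of a sample's k nearest neighbours that carry the other label, and a nearest-centroid classifier stands in for whatever base learner the authors actually used. All function and parameter names are hypothetical.

```python
import math
import random


def knn_hardness(X, y, k=5):
    """Assumed hardness score: share of each sample's k nearest
    neighbours (Euclidean distance in topic space) with a different label."""
    k_eff = max(1, min(k, len(X) - 1))
    scores = []
    for i, xi in enumerate(X):
        dists = sorted((math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i)
        neighbours = [j for _, j in dists[:k_eff]]
        scores.append(sum(y[j] != y[i] for j in neighbours) / k_eff)
    return scores


def localized_balanced_sample(X, y, hardness, rng, eps=0.05):
    """Keep every minority sample (label 1) and draw the same number of
    majority samples (label 0), with replacement, biased toward hard ones.
    eps keeps easy samples from getting zero probability."""
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    weights = [hardness[i] + eps for i in majority]
    drawn = rng.choices(majority, weights=weights, k=len(minority))
    idx = minority + drawn
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_centroids(X, y):
    """Stand-in base learner: one mean vector per class.
    Assumes both classes are present in the sample."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_one(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))


def train_ensemble(X, y, n_models=5, k=5, seed=0):
    """Train one model per independently drawn balanced dataset."""
    rng = random.Random(seed)
    hardness = knn_hardness(X, y, k)
    models = []
    for _ in range(n_models):
        Xb, yb = localized_balanced_sample(X, y, hardness, rng)
        models.append(fit_centroids(Xb, yb))
    return models


def ensemble_predict(models, x):
    """Combine the models by majority vote (1 = predicted re-admission)."""
    votes = sum(predict_one(m, x) for m in models)
    return 1 if 2 * votes > len(models) else 0
```

Calling `train_ensemble(X, y)` on a list of topic vectors `X` and 0/1 labels `y`, then `ensemble_predict(models, x)` on a new vector, gives a voted prediction. Using an odd `n_models` avoids tied votes; the brute-force neighbour search is quadratic in the number of samples and is meant only to illustrate the idea.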

Bibliographic details

Source: International Joint Conference on Neural Networks | 2017 | pp. 4571-4578 | 8 pages
Authors: Xingquan Zhu; Jose Hurtado; Haicheng Tao
Original format: PDF
Keywords: Hospitals; Data models; Predictive models; Biological system modeling; Resource management; Computer science; Electronic mail

Similar literature

1. Commercial Vehicle Activity Prediction With Imbalanced Class Distribution Using a Hybrid Sampling and Gradient Boosting Approach [J]. Low Raymond, Cheah Lynette, You Linlin. IEEE Transactions on Intelligent Transportation Systems. 2021, No. 3.
2. Software defect prediction with imbalanced distribution by radius-synthetic minority over-sampling technique [J]. Shikai Guo, Jian Dong, Hui Li. Journal of Software: Evolution and Process. 2021, No. 7.
3. Cluster-based Under-sampling Approaches For Imbalanced Data Distributions [J]. Show-Jane Yen, Yue-Shi Lee. Expert Systems with Applications. 2009, No. 3 (Part 1).
4. Localized sampling for hospital re-admission prediction with imbalanced sample distributions [C]. Xingquan Zhu, Jose Hurtado, Haicheng Tao. International Joint Conference on Neural Networks. 2017.
5. Photon event distribution sampling (PEDS): Image formation & high precision particle localization in scanning microscopy [D]. Larkin, Joshua D. 2009.
6. Quantifying the degree of bias from using county‐scale data in species distribution modeling: Can increasing sample size or using county‐averaged environmental data reduce distributional overprediction? [O]. Steven D. Collins, John C. Abbott, Nancy E. McIntyre. 2017.
7. Two sample Bayesian prediction intervals for order statistics based on the inverse exponential-type distributions using right censored sample [O]. M.M. Mohie El-Din, Y. Abdel-Aty, A.R. Shafay. 2011.
