Conference: International Joint Conference on Neural Networks

Localized sampling for hospital re-admission prediction with imbalanced sample distributions

Abstract

Hospital re-admission refers to the medical event in which a patient previously discharged from the hospital is readmitted within a short period of time (say, 30 days). A re-admission not only degrades the patient's quality of life, it also adds a significant financial burden to health care systems. To date, many systems use computational approaches to predict the likelihood of a patient being readmitted in the future, in support of medical decision making. When building predictive models for hospital re-admission, one essential challenge is that the sample distributions in the data are severely imbalanced: typically, fewer than 10% of patients are likely to be readmitted in the near future. A predictive model that does not account for this imbalance is unlikely to generate accurate predictions. To date, no existing re-admission model has explicitly addressed such data imbalance. In this paper, we consider hospital re-admission prediction with imbalanced sample distributions and propose a localized sampling approach to help build accurate predictive models. In localized sampling, we emphasize samples that are difficult to classify and allow the sampling process to be biased toward such instances. Because finding difficult-to-classify instances requires calculating distances between instances, and the high dimensionality of Electronic Health Records (EHR) data makes such distance calculation highly ineffective, we propose to use latent topic embedding to reduce each sample from the high-dimensional space to a low-dimensional space of a handful of topics, so that distances between instances can be calculated effectively and accurately. By using localized sampling to build multiple versions of balanced datasets, we are able to train multiple predictive models and combine their results for prediction. Experiments and comparisons on data collected from several South Florida regional hospitals demonstrate the performance of our method.
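
The abstract does not give the exact sampling rule, so the following is only a minimal sketch of one way the idea could look, not the authors' implementation. It assumes each patient has already been reduced to a short topic vector (for example by a topic model such as LDA), scores an instance as "difficult" by the fraction of its k nearest neighbours that carry the opposite label, and draws majority-class instances with probability weighted by that score. The function names, the `k` and `smoothing` parameters, and the k-nearest-neighbour difficulty score are all illustrative assumptions.

```python
import math
import random


def knn_difficulty(points, labels, k=3):
    """Assumed difficulty score: the fraction of an instance's k nearest
    neighbours (Euclidean distance in topic space) with a different label."""
    scores = []
    for i, p in enumerate(points):
        # Distances from instance i to every other instance, nearest first.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbours = [j for _, j in dists[:k]]
        scores.append(sum(labels[j] != labels[i] for j in neighbours) / k)
    return scores


def localized_balanced_samples(points, labels, n_sets=5, k=3, smoothing=0.1, seed=0):
    """Build n_sets balanced index lists. Each list holds every minority
    instance (label 1) plus an equal number of majority instances (label 0)
    drawn with replacement, biased toward difficult ones."""
    rng = random.Random(seed)
    difficulty = knn_difficulty(points, labels, k)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    # Smoothing keeps easy majority instances selectable with low probability.
    weights = [difficulty[i] + smoothing for i in majority]
    datasets = []
    for _ in range(n_sets):
        drawn = rng.choices(majority, weights=weights, k=len(minority))
        datasets.append(minority + drawn)
    return datasets


def average_probability(model_outputs):
    """Combine the re-admission probabilities of several models by averaging."""
    return sum(model_outputs) / len(model_outputs)
```

Each index list returned by `localized_balanced_samples` could be used to train one classifier, and `average_probability` is one simple way to combine the resulting per-model predictions for a patient; the abstract says only that the models' results are combined, so averaging is an assumption here.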
