Conference: IEEE International Conference on Data Science and Advanced Analytics

Improved risk predictions via sparse imputation of patient conditions in electronic medical records


Abstract

Electronic medical records (EMR) are increasingly used for risk prediction, but EMR analysis is complicated by missing entries. There are two reasons: the "primary reason for admission" is recorded in the EMR while co-morbidities (other chronic diseases) are left uncoded, and many zero values in the data are accurate, reflecting that a patient has not accessed medical facilities. A key challenge is dealing with the peculiarities of this data: unlike many other datasets, EMR data is sparse, reflecting the fact that patients have some, but not all, diseases. We propose a novel model to fill in these missing values and use the new representation to predict key hospital events. To fill in missing values, we represent the feature-patient matrix as a product of two low-rank factors, preserving the sparsity property in the product. Intuitively, the product regularization allows sparse imputation of patient conditions that reflects common co-morbidities across patients. We develop a scalable optimization algorithm based on block coordinate descent to find an optimal solution. We evaluate the proposed framework on two real-world EMR cohorts: Cancer (7000 admissions) and Acute Myocardial Infarction (AMI; 2652 admissions). Our results show that the AUC for 3-month admission prediction improves significantly, from 0.741 to 0.786 for the Cancer data and from 0.678 to 0.724 for the AMI data. We also extend the proposed method to a supervised model that predicts multiple related risk outcomes (e.g., emergency presentations and hospital admissions over 3-, 6- and 12-month periods) in an integrated framework. For this model, the AUC averaged over outcomes improves significantly, from 0.768 to 0.806 for the Cancer data and from 0.685 to 0.748 for the AMI data.
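The abstract does not give the model's objective or update rules, so the following is only a minimal sketch of the general idea: factorizing a feature-patient matrix into two low-rank factors with block coordinate descent. It makes several assumptions that are not from the paper: it swaps in a generic L1-penalized nonnegative factorization (the paper's own "product regularization" is not reproduced), it uses proximal gradient steps for each block, and every name (`sparse_factorize`, `lam`, `rank`, the toy matrix `X`) is hypothetical.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def sq_norm(A):
    """Squared Frobenius norm."""
    return sum(v * v for row in A for v in row)

def residual(X, W, H):
    """Entrywise difference W H - X."""
    return [[r - x for r, x in zip(rr, xr)] for rr, xr in zip(matmul(W, H), X)]

def objective(X, W, H, lam):
    """0.5 * ||X - W H||_F^2 + lam * (||W||_1 + ||H||_1)."""
    l1 = sum(abs(v) for M in (W, H) for row in M for v in row)
    return 0.5 * sq_norm(residual(X, W, H)) + lam * l1

def prox_step(M, grad, step, lam):
    """Gradient step followed by the prox of the L1 penalty plus nonnegativity."""
    return [[max(0.0, m - step * (g + lam)) for m, g in zip(mr, gr)]
            for mr, gr in zip(M, grad)]

def sparse_factorize(X, rank=2, lam=0.05, n_iters=200, seed=0):
    """Alternate over the two blocks W and H (assumed scheme, not the paper's)."""
    rng = random.Random(seed)
    W = [[rng.random() for _ in range(rank)] for _ in X]      # features x rank
    H = [[rng.random() for _ in X[0]] for _ in range(rank)]   # rank x patients
    history = [objective(X, W, H, lam)]
    for _ in range(n_iters):
        # Block 1: update W with H fixed. ||H||_F^2 upper-bounds the Lipschitz
        # constant of the gradient, so this step size cannot raise the objective.
        grad_W = matmul(residual(X, W, H), transpose(H))
        W = prox_step(W, grad_W, 1.0 / max(sq_norm(H), 1e-12), lam)
        # Block 2: update H with the new W fixed.
        grad_H = matmul(transpose(W), residual(X, W, H))
        H = prox_step(H, grad_H, 1.0 / max(sq_norm(W), 1e-12), lam)
        history.append(objective(X, W, H, lam))
    return W, H, history

# Toy binary matrix (rows: conditions, columns: patients); illustrative only.
X = [[1, 1, 0, 0, 1],
     [1, 1, 0, 0, 0],
     [0, 0, 1, 1, 0],
     [0, 0, 1, 1, 1]]
W, H, history = sparse_factorize(X)
X_imputed = matmul(W, H)  # dense scores; some zero entries of X become nonzero
```

In this sketch the product `X_imputed` plays the role of the new representation: an entry that was zero in `X` can receive a positive score when patients with similar condition patterns have that condition coded. How the paper keeps the product itself sparse, and how the imputed matrix feeds the downstream risk model, would need to be taken from the full text.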
