Conference: Annual International Conference of the IEEE Engineering in Medicine and Biology Society

Robust Ensemble Learning to Identify Rare Disease Patients from Electronic Health Records

Abstract

There is substantial interest in developing prediction models capable of identifying rare disease patients in population-scale databases such as electronic health records (EHRs). Deriving these models is challenging for many reasons, perhaps the most important being the limited number of patients with 'gold standard' confirmed diagnoses from which to learn. This paper presents a new cascade learning methodology that induces accurate prediction models from noisy 'silver standard' labeled data, that is, patients provisionally labeled as positive for the target disease based on unconfirmed evidence. The algorithm combines unsupervised feature selection, supervised ensemble learning, and unsupervised ensemble clustering to enable robust learning from noisy labels. The efficacy of the approach is illustrated through a case study involving the detection of lipodystrophy patients in a country-scale database of EHRs. The case study demonstrates that our algorithm outperforms state-of-the-art prediction techniques and can discover previously undiagnosed patients in large EHR databases.
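
The three stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say which feature selector, base learner, or clustering method is used, so every component here is an assumption chosen for brevity. Variance ranking stands in for unsupervised feature selection, bagged nearest-centroid classifiers stand in for the supervised ensemble, and repeated one-dimensional two-means over the ensemble scores stands in for ensemble clustering. All function names and the toy data are hypothetical.

```python
import random
from statistics import mean, pvariance


def select_features(X, k):
    """Unsupervised step (assumed): keep the k highest-variance columns."""
    variances = [pvariance(col) for col in zip(*X)]
    ranked = sorted(range(len(variances)), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def ensemble_scores(X, silver, n_models, rng):
    """Supervised step (assumed): bagged nearest-centroid models trained on
    the noisy silver labels. Returns the fraction of positive votes per row."""
    pos = [i for i, t in enumerate(silver) if t == 1]
    neg = [i for i, t in enumerate(silver) if t == 0]
    votes = [0] * len(X)
    for _model in range(n_models):
        # Stratified bootstrap: resample each label group with replacement.
        boot_pos = [X[rng.choice(pos)] for _ in pos]
        boot_neg = [X[rng.choice(neg)] for _ in neg]
        c_pos = [mean(col) for col in zip(*boot_pos)]
        c_neg = [mean(col) for col in zip(*boot_neg)]
        for i, x in enumerate(X):
            if sq_dist(x, c_pos) < sq_dist(x, c_neg):
                votes[i] += 1
    return [v / n_models for v in votes]


def consensus_clusters(scores, n_runs, rng):
    """Unsupervised step (assumed): repeated 1-D two-means on bootstrap
    samples of the scores. A row is flagged when most runs assign it to
    the high-score cluster."""
    flags = [0] * len(scores)
    for _run in range(n_runs):
        sample = [rng.choice(scores) for _ in scores]
        lo, hi = min(sample), max(sample)
        for _step in range(10):  # a few Lloyd iterations
            near_hi = [s for s in sample if abs(s - hi) < abs(s - lo)]
            near_lo = [s for s in sample if abs(s - hi) >= abs(s - lo)]
            if near_hi:
                hi = mean(near_hi)
            if near_lo:
                lo = mean(near_lo)
        for i, s in enumerate(scores):
            if abs(s - hi) < abs(s - lo):
                flags[i] += 1
    return [1 if 2 * f > n_runs else 0 for f in flags]


def demo(seed=0):
    """Run the cascade on synthetic toy data (not data from the paper)."""
    rng = random.Random(seed)
    offsets = [-0.4, -0.2, 0.0, 0.2, 0.4]
    # 20 true positives near (5, 5), 40 true negatives near (0, 0),
    # plus a constant third column that carries no information.
    truth = [1] * 20 + [0] * 40
    X = [[5 * t + offsets[i % 5], 5 * t - offsets[i % 5], 1.0]
         for i, t in enumerate(truth)]
    # Silver labels: flip 4 positives and 4 negatives to simulate label noise.
    silver = list(truth)
    for i in (0, 1, 2, 3, 20, 21, 22, 23):
        silver[i] = 1 - silver[i]
    keep = select_features(X, k=2)
    X_sel = [[row[j] for j in keep] for row in X]
    scores = ensemble_scores(X_sel, silver, n_models=25, rng=rng)
    flagged = consensus_clusters(scores, n_runs=15, rng=rng)
    return truth, silver, keep, flagged
```

Calling `demo()` returns the true labels, the noisy silver labels, the selected column indices, and the consensus flags, so the flags can be compared against both label sets. The stratified bootstrap is a design choice of this sketch: it keeps both label groups represented in every resample, which matters when the positive class is rare.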