Centralized and Distributed Learning Methods for Predictive Health Analytics

Abstract

The U.S. health care system is considered costly and highly inefficient, devoting substantial resources to the treatment of acute conditions in a hospital setting rather than focusing on prevention and keeping patients out of the hospital. The potential for cost savings is large: in the U.S., more than $30 billion is spent each year on hospitalizations deemed preventable, 31% of which is attributed to heart diseases and 20% to diabetes. Motivated by this, our work focuses on developing centralized and distributed learning methods to predict future heart- or diabetes-related hospitalizations based on patient Electronic Health Records (EHRs).

We explore a variety of supervised classification methods and present a novel likelihood-ratio-based method (K-LRT) that predicts hospitalizations and offers interpretability by identifying the K most significant features that lead to a positive prediction for each patient. Next, assuming that the positive class consists of multiple clusters (patients hospitalized for different reasons), while the negative class is drawn from a single cluster (non-hospitalized patients, healthy in every aspect), we present an alternating optimization approach that jointly discovers the clusters in the positive class and optimizes the classifiers that separate each positive cluster from the negative samples. We establish the convergence of the method and characterize its VC dimension. Last, we develop a decentralized cluster Primal-Dual Splitting (cPDS) method for large-scale problems that is computationally efficient and privacy-aware. Such a distributed learning scheme is relevant for multi-institutional collaborations or peer-to-peer applications, allowing the agents to collaborate while keeping every participant's data private. cPDS is proved to have an improved convergence rate compared to existing centralized and decentralized methods.

We test all methods on real EHR data from the Boston Medical Center and compare results in terms of prediction accuracy and interpretability.
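
The abstract does not spell out how K-LRT is computed, so the following is only a minimal sketch of the general idea it describes: score each patient by the K features with the largest log-likelihood ratios, and report those features as the explanation. It assumes binary features, independent per-feature likelihoods, and Laplace smoothing; the function names `fit_log_ratios` and `top_k_lrt`, the parameter `alpha`, and the zero default threshold are all hypothetical and are not taken from the dissertation.

```python
import numpy as np

def fit_log_ratios(X, y, alpha=1.0):
    """Estimate per-feature log-likelihood ratios from binary training data.

    Assumption: features are binary and treated independently.
    Returns two arrays: the log ratio when a feature is 1, and when it is 0.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=bool)
    # Laplace-smoothed estimates of P(x_j = 1 | hospitalized) and P(x_j = 1 | not hospitalized)
    p1 = (X[y].sum(axis=0) + alpha) / (y.sum() + 2 * alpha)
    p0 = (X[~y].sum(axis=0) + alpha) / ((~y).sum() + 2 * alpha)
    return np.log(p1 / p0), np.log((1 - p1) / (1 - p0))

def top_k_lrt(x, llr_on, llr_off, k, threshold=0.0):
    """Score one patient using only the k largest per-feature log ratios.

    Returns (prediction, indices of the k selected features, score).
    The threshold is a hypothetical tuning parameter.
    """
    x = np.asarray(x, dtype=bool)
    llr = np.where(x, llr_on, llr_off)       # log ratio of each observed feature value
    top = np.argsort(llr)[::-1][:k]          # k features pointing most strongly to the positive class
    score = float(llr[top].sum())
    return score > threshold, top, score

# Toy data: feature 0 separates the classes, features 1 and 2 carry no signal.
X_train = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
y_train = [1, 1, 0, 0]
llr_on, llr_off = fit_log_ratios(X_train, y_train)
prediction, features, score = top_k_lrt([1, 0, 0], llr_on, llr_off, k=1)
```

In this sketch the returned `features` array plays the role of the per-patient explanation: it names the inputs that contributed most to a positive prediction.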

Bibliographic record

  • Author: Brisimi, Theodora S.
  • Author affiliation: Boston University
  • Degree-granting institution: Boston University
  • Subject: Computer science
  • Degree: Ph.D.
  • Year: 2017
  • Pages: 109
  • Original format: PDF
  • Language: English
