Journal: SIGKDD Explorations

Marble: High-throughput Phenotyping from Electronic Health Records via Sparse Nonnegative Tensor Factorization



Abstract

The rapidly increasing availability of electronic health records (EHRs) from multiple heterogeneous sources has spearheaded the adoption of data-driven approaches for improved clinical research, decision making, prognosis, and patient management. Unfortunately, EHR data do not always map directly and reliably to the phenotypes, or medical concepts, that clinical researchers need or use. Existing phenotyping approaches typically require labor-intensive supervision from medical experts. We propose Marble, a novel sparse non-negative tensor factorization method that derives phenotype candidates with virtually no human supervision. Marble decomposes the observed tensor into two terms: a bias tensor and an interaction tensor. The bias tensor represents the baseline characteristics common to the overall population, and the interaction tensor defines the phenotypes. We demonstrate the capability of our proposed model on both simulated data and patient data from a publicly available clinical database. Our results show that Marble-derived phenotypes provide at least a 42.8% reduction in the number of nonzero elements while retaining predictive power for classification purposes. Furthermore, the phenotypes and baseline characteristics derived from real EHR data are consistent with known characteristics of the patient population. Marble can thus potentially be used to rapidly characterize, predict, and manage a large number of diseases, promising a novel, data-driven solution that can benefit very large segments of the population.
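
The two-term decomposition described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the model's exact form, so several things here are assumptions made only for illustration: a three-mode tensor (patients by diagnoses by medications), a rank-one bias term, a CP-style sum of rank-one components for the interaction term, and simple thresholding to obtain sparse factors. The helper names `cp_to_tensor` and `threshold` are hypothetical, and the factors are random rather than fitted, so this shows only the structure of the model, not the authors' fitting procedure.

```python
import numpy as np


def cp_to_tensor(weights, factors):
    """Build a 3-mode tensor as a weighted sum of rank-one components."""
    a, b, c = factors
    return np.einsum("r,ir,jr,kr->ijk", weights, a, b, c)


def threshold(factor, gamma):
    """Zero out entries below gamma so the factor matrix becomes sparse."""
    return np.where(factor >= gamma, factor, 0.0)


rng = np.random.default_rng(0)
# Hypothetical sizes: patients x diagnoses x medications, two phenotypes.
shape, rank = (6, 5, 4), 2

# Bias term (assumed rank-one and dense): baseline shared by all patients.
bias_factors = [rng.random((n, 1)) for n in shape]
bias = cp_to_tensor(np.array([0.5]), bias_factors)

# Interaction term (assumed CP form): each sparse component is one
# phenotype candidate.
inter_factors = [threshold(rng.random((n, rank)), 0.5) for n in shape]
interaction = cp_to_tensor(np.array([2.0, 3.0]), inter_factors)

# The model tensor is the sum of the two nonnegative terms.
model = bias + interaction

# Sparsity of the phenotype factors, counted as nonzero entries per mode.
nonzeros_per_mode = [int(np.count_nonzero(f)) for f in inter_factors]
print("model shape:", model.shape)
print("nonzeros per mode:", nonzeros_per_mode)
```

Counting nonzero entries in the interaction factors, as in the last lines, is one plausible way to measure the kind of sparsity the abstract reports. In a fitted model, the few nonzero entries in each component would be read as the diagnoses and medications that make up a phenotype.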
