Journal of the American Medical Informatics Association (JAMIA)

Subgraph augmented non-negative tensor factorization (SANTF) for modeling clinical narrative text



Abstract

Objective: Extracting medical knowledge from electronic medical records requires automated approaches to combat scalability limitations and selection biases. However, existing machine learning approaches are often regarded by clinicians as black boxes. Moreover, training data for these automated approaches are often sparsely annotated at best. The authors target unsupervised learning for modeling clinical narrative text, aiming to improve both accuracy and interpretability.

Methods: The authors introduce a novel framework named subgraph augmented non-negative tensor factorization (SANTF). In addition to relying on atomic features (e.g., words in clinical narrative text), SANTF automatically mines higher-order features (e.g., relations of lymphoid cells expressing antigens) from clinical narrative text by converting sentences into a graph representation and identifying important subgraphs. The authors compose a tensor using patients, higher-order features, and atomic features as its respective modes. They then apply non-negative tensor factorization to cluster patients and simultaneously identify latent groups of higher-order features that link to patient clusters, as in clinical guidelines where a panel of immunophenotypic features and laboratory results is used to specify diagnostic criteria.

Results and Conclusion: SANTF demonstrated over 10% improvement in averaged F-measure on patient clustering compared to widely used non-negative matrix factorization (NMF) and k-means clustering methods. Multiple baselines were established by modeling patient data using patient-by-features matrices with different feature configurations and then performing NMF or k-means to cluster patients. Feature analysis identified latent groups of higher-order features that lead to medical insights. The authors also found that the latent groups of atomic features help to better correlate the latent groups of higher-order features.
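The tensor construction and factorization described under Methods can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which factorization model or update rule SANTF uses, so the code below assumes a CP (PARAFAC) model fitted with multiplicative updates under a squared-error objective. The toy tensor, its dimensions, and all function names (`khatri_rao`, `nonneg_cp`, `rel_error`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product; row (i, j) maps to index i * V.shape[0] + j."""
    r = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, r)

def nonneg_cp(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Non-negative CP factorization of a 3-mode tensor via multiplicative updates.

    Assumed model (not taken from the source): X[i, j, k] ~ sum_r A[i, r] B[j, r] C[k, r].
    """
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            # Unfold X along `mode`; remaining modes keep their original order.
            Xn = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
            KR = khatri_rao(others[0], others[1])
            numer = Xn @ KR
            denom = factors[mode] @ (KR.T @ KR) + eps
            factors[mode] *= numer / denom  # keeps entries non-negative
    return factors

def rel_error(X, factors):
    """Relative reconstruction error of the CP model."""
    A, B, C = factors
    approx = np.einsum('ir,jr,kr->ijk', A, B, C)
    return np.linalg.norm(X - approx) / np.linalg.norm(X)

# Hypothetical toy data: 6 patients x 4 higher-order features x 5 atomic features,
# generated from two planted patient groups (rows 0-2 and rows 3-5).
P = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
H = np.array([[2, 0], [1, 0], [0, 1], [0, 3]], dtype=float)
W = np.array([[1, 0], [2, 0], [1, 1], [0, 2], [0, 1]], dtype=float)
X = np.einsum('ir,jr,kr->ijk', P, H, W)

factors = nonneg_cp(X, rank=2)
A, B, C = factors

# CP column scales are arbitrary, so fold the other modes' column norms into
# the patient factor before assigning each patient to its strongest component.
weights = np.linalg.norm(B, axis=0) * np.linalg.norm(C, axis=0)
labels = np.argmax(A * weights, axis=1)
print("relative error:", rel_error(X, factors))
print("patient cluster labels:", labels)
```

In this sketch the rows of `A` play the role of patient cluster memberships, while the columns of `B` and `C` correspond to latent groups of higher-order and atomic features, respectively. Multiplicative updates can stall in local minima, so in practice one would likely try several random initializations.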
