Unsupervised Topic Modeling in a Large Free Text Radiology Report Repository

Hassanpour Saeed; Langlotz Curtis P.

Abstract

Radiology report narratives contain a large amount of information about the patient's health and the radiologist's interpretation of medical findings. Most of this critical information is entered in free text format, even when structured radiology report templates are used. The narrative varies in terminology and language across radiologists and organizations. The free text format, together with the subtlety and variation of natural language, hinders the extraction of reusable information from radiology reports for decision support, quality improvement, and biomedical research. Therefore, as a first step toward organizing and extracting the information content of a large multi-institutional free text radiology report repository, we designed and developed an unsupervised machine learning approach that captures the main concepts in the repository and partitions the reports by their main foci. In this approach, radiology reports are modeled in a vector space and compared to each other through a cosine similarity measure. This similarity is used to cluster the reports and identify the repository's underlying topics. We applied our approach to a repository of 1,899,482 radiology reports from three major healthcare organizations. Our method identified 19 major radiology report topics in the repository and clustered the reports according to these topics. A domain expert radiologist verified the results, which successfully explain the repository's primary topics and extract the corresponding reports. The results of our system provide a target-based corpus and framework for information extraction and retrieval systems for radiology reports.
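
The vector space and cosine similarity steps described in the abstract can be illustrated with a minimal sketch. The abstract does not state the term weighting or the clustering algorithm, so the smoothed TF-IDF weighting, the greedy threshold clustering, the function names, the threshold value, and the four toy reports below are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse TF-IDF vectors (term -> weight)."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (assumed weighting), always positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def greedy_cluster(vectors, threshold=0.2):
    """Assign each vector to the first cluster seed it resembles (assumed scheme)."""
    seeds, labels = [], []
    for vec in vectors:
        for k, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                labels.append(k)
                break
        else:
            # No existing seed is similar enough: start a new cluster.
            seeds.append(vec)
            labels.append(len(seeds) - 1)
    return labels


# Hypothetical toy reports, not taken from the repository in the paper.
reports = [
    "chest radiograph shows no acute cardiopulmonary disease",
    "chest radiograph shows clear lungs no pleural effusion",
    "mri of the knee shows a meniscal tear",
    "mri of the knee shows intact ligaments and no meniscal tear",
]
vectors = tfidf_vectors([r.split() for r in reports])
labels = greedy_cluster(vectors)
print(labels)  # [0, 0, 1, 1]
```

On this toy input the two chest radiograph reports fall into one cluster and the two knee MRI reports into another. A repository of the size reported in the abstract would need a scalable clustering method and a principled choice of the number of topics, neither of which this sketch attempts.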


Bibliographic details

Source
Journal of digital imaging: the official journal of the Society for Computer Applications in Radiology, 2016, Issue 1, 4 pages
Authors
Hassanpour Saeed; Langlotz Curtis P.
Original format: PDF
Language: English
Chinese Library Classification: Radiological medicine
Keywords
Topic modeling; Radiology report narrative; Clustering; Text mining; Natural language processing
