Text data extraction for a prospective, research-focused data mart: implementation and validation

Monique Hinchcliff; Eric Just; Sofia Podlusky; John Varga; Rowland W Chang; Warren A Kibbe

Abstract

Background: Translational research typically requires data abstracted from medical records as well as data collected specifically for research. Unfortunately, many data within electronic health records are represented as text that is not amenable to aggregation for analysis. We present a scalable, open source SQL Server Integration Services package, called Regextractor, for including regular expression parsers in a classic extract, transform, and load workflow. We have used Regextractor to abstract discrete data from textual reports from a number of ‘machine generated’ sources. To validate this package, we created a pulmonary function test data mart and compared the quality of its data with that of manual chart review.

Methods: Eleven variables from the pulmonary function tests performed closest to the initial clinical evaluation date were studied for 100 randomly selected subjects with scleroderma. One research assistant manually reviewed, abstracted, and entered the relevant data into a database. Correlation with data obtained from the automated pulmonary function test data mart within the Northwestern Medical Enterprise Data Warehouse was determined.

Results: There was near-perfect agreement (99.5%) between the results generated by the Regextractor package and those obtained via manual chart abstraction. The pulmonary function test data mart has subsequently been used to monitor disease progression of patients in the Northwestern Scleroderma Registry. In addition to the pulmonary function test example presented in this manuscript, the Regextractor package has been used to create cardiac catheterization and echocardiography data marts. The Regextractor package was released as open source software in October 2009 and had been downloaded 552 times as of June 1, 2012.

Conclusions: Collaboration between clinical researchers and biomedical informatics experts enabled the development and validation of a tool (Regextractor) to parse and assemble structured data from text data contained in the electronic health record. Regextractor has been successfully used to create additional data marts in other medical domains and is available to the public.
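The general idea in the abstract, regular expressions pulling discrete values out of machine-generated report text and a comparison of those values against manual abstraction, can be illustrated with a minimal Python sketch. This is not Regextractor itself, which is a SQL Server Integration Services package. The report layout, the variable names (`FVC`, `FEV1`, `DLCO`), the patterns, and the helper names `extract` and `percent_agreement` are all assumptions made for illustration. The abstract does not state how the 99.5% agreement was computed, so the exact-match measure below is only one plausible choice.

```python
import re

# Hypothetical report layout; real pulmonary function reports will differ.
SAMPLE_REPORT = """\
PULMONARY FUNCTION TEST
FVC        3.21 L    82 % predicted
FEV1       2.54 L    79 % predicted
DLCO      18.40 mL/min/mmHg    71 % predicted
"""

# One illustrative pattern per variable; named groups hold the discrete values.
PATTERNS = {
    name: re.compile(
        rf"^{name}\s+(?P<value>\d+(?:\.\d+)?)\s+\S+\s+(?P<pct>\d+)\s*%\s*predicted",
        re.MULTILINE,
    )
    for name in ("FVC", "FEV1", "DLCO")
}


def extract(report: str) -> dict:
    """Return {variable: (measured value, percent predicted)}, or None if absent."""
    row = {}
    for name, pattern in PATTERNS.items():
        m = pattern.search(report)
        row[name] = (float(m["value"]), int(m["pct"])) if m else None
    return row


def percent_agreement(automated: dict, manual: dict) -> float:
    """Percentage of variables on which both sources hold identical values."""
    keys = automated.keys() | manual.keys()
    matches = sum(automated.get(k) == manual.get(k) for k in keys)
    return 100.0 * matches / len(keys)


if __name__ == "__main__":
    automated = extract(SAMPLE_REPORT)
    # Stand-in for manually abstracted values with one transcription error.
    manual = dict(automated, DLCO=(18.4, 17))
    print(automated)
    print(percent_agreement(automated, manual))
```

In an extract, transform, and load setting, a function like `extract` would sit in the transform step, turning each report into one row of typed columns. Returning `None` for a variable that is not found keeps missing values visible rather than silently dropping them.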


Bibliographic record

Source: BMC Medical Informatics and Decision Making, 2012, Issue 1
Authors: Monique Hinchcliff; Eric Just; Sofia Podlusky; John Varga; Rowland W Chang; Warren A Kibbe
Original format: PDF
Chinese Library Classification: Medicine and health

