Journal: Inflammatory Bowel Diseases

Improving case definition of Crohn's disease and ulcerative colitis in electronic medical records using natural language processing: A novel informatics approach



Abstract

Background: Previous studies identifying patients with inflammatory bowel disease using administrative codes have yielded inconsistent results. Our objective was to develop a robust electronic medical record-based model for classification of inflammatory bowel disease, combining codified data with information extracted from clinical text notes using natural language processing.

Methods: Using the electronic medical records of 2 large academic centers, we created data marts for Crohn's disease (CD) and ulcerative colitis (UC) comprising patients with ≥1 International Classification of Diseases, 9th edition, code for each disease. We used codified data (i.e., International Classification of Diseases, 9th edition codes and electronic prescriptions) and narrative data from clinical notes to develop our classification model. Model development and validation were performed in a training set of 600 randomly selected patients for each disease, with medical record review as the gold standard. Logistic regression with the adaptive LASSO penalty was used to select informative variables.

Results: We confirmed 399 CD cases (67%) in the CD training set and 378 UC cases (63%) in the UC training set. For both diseases, a combined model including narrative and codified data had better accuracy (area under the curve 0.95 for CD; 0.94 for UC) than models using only disease International Classification of Diseases, 9th edition codes (area under the curve 0.89 for CD; 0.86 for UC). Addition of natural language processing narrative terms to our final model resulted in classification of 6% to 12% more subjects with the same accuracy.

Conclusions: Inclusion of narrative concepts identified using natural language processing improves the accuracy of electronic medical record case definition for CD and UC while simultaneously identifying more subjects compared with models using codified data alone.
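The abstract names the modeling technique (logistic regression with the adaptive LASSO penalty, evaluated by area under the curve) but gives no implementation details. The following is a minimal sketch of that general technique, not the authors' code. Everything specific in it is an assumption made for illustration: the proximal-gradient solver, the use of an unpenalized initial fit to derive the adaptive weights, the values of `lam` and `gamma`, the two hypothetical features (an ICD-9 code count and an NLP mention count), and the eight made-up toy patients, which are not study data.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrink toward zero.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_logistic(X, y, lam=0.0, weights=None, step=0.1, iters=2000):
    # Proximal gradient descent for logistic regression with a
    # weighted L1 penalty: lam * sum_j weights[j] * |beta[j]|.
    # The intercept is not penalized.
    n, p = len(X), len(X[0])
    weights = weights or [1.0] * p
    intercept, beta = 0.0, [0.0] * p
    for _ in range(iters):
        residuals = [
            sigmoid(intercept + sum(b * x for b, x in zip(beta, row))) - yi
            for row, yi in zip(X, y)
        ]
        intercept -= step * sum(residuals) / n
        for j in range(p):
            grad = sum(r * row[j] for r, row in zip(residuals, X)) / n
            beta[j] = soft_threshold(beta[j] - step * grad,
                                     step * lam * weights[j])
    return intercept, beta

def fit_adaptive_lasso(X, y, lam=0.05, gamma=1.0, eps=1e-8):
    # Assumed two-stage scheme: an unpenalized initial fit supplies
    # weights 1 / |beta_init|^gamma, so weak predictors are penalized
    # more heavily in the second, L1-penalized fit.
    _, initial = fit_logistic(X, y)
    weights = [1.0 / (abs(b) ** gamma + eps) for b in initial]
    return fit_logistic(X, y, lam=lam, weights=weights)

def auc(scores, labels):
    # Area under the ROC curve as the fraction of (case, non-case)
    # pairs ranked correctly, counting ties as half.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up toy data: [ICD-9 code count, NLP mention count] per patient,
# with a hypothetical chart-review label (1 = confirmed case).
X_toy = [[1, 0], [1, 0], [2, 0], [2, 1], [1, 2], [2, 0], [3, 2], [3, 3]]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]

b0, coefs = fit_adaptive_lasso(X_toy, y_toy)
combined = [sigmoid(b0 + sum(b * x for b, x in zip(coefs, row)))
            for row in X_toy]
codes_only = [row[0] for row in X_toy]
print("coefficients:", coefs)
print("AUC, codes only:", auc(codes_only, y_toy))
print("AUC, combined model:", auc(combined, y_toy))
```

The toy comparison at the end only mirrors the shape of the evaluation described in the Results (a codes-only score versus a combined model); its numbers say nothing about the study's reported values.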
