Seminars in Arthritis and Rheumatism

Validation of psoriatic arthritis diagnoses in electronic medical records using natural language processing.

Abstract

OBJECTIVES: To test whether data extracted from full-text patient visit notes in an electronic medical record would improve the classification of psoriatic arthritis (PsA) compared with an algorithm based on codified data alone.

METHODS: From the >1,350,000 adults in a large academic electronic medical record, all 2318 patients with a billing code for PsA were extracted, and 550 were randomly selected for chart review and algorithm training. From codified data and from phrases extracted from narrative data using natural language processing (NLP), 31 predictors were derived, and 3 random forest algorithms were trained using coded, narrative, and combined predictors. The receiver operating characteristic (ROC) curve was used to identify the optimal algorithm, and a cut-point was chosen to achieve the maximum sensitivity possible at a 90% positive predictive value (PPV). The algorithm was then used to classify the remaining 1768 charts and was finally validated in a random sample of 300 cases predicted to have PsA.

RESULTS: The PPV of a single PsA code was 57% (95% CI 55%-58%). Using a combination of coded data and NLP, the random forest algorithm reached a PPV of 90% (95% CI 86%-93%) at a sensitivity of 87% (95% CI 83%-91%) in the training data. The PPV was 93% (95% CI 89%-96%) in the validation set. Adding NLP predictors to codified data increased the area under the ROC curve (P < 0.001).

CONCLUSIONS: Using NLP with text notes from electronic medical records significantly improved the performance of the prediction algorithm. Random forests were a useful tool for accurately classifying psoriatic arthritis cases to enable epidemiological research.
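The cut-point rule in METHODS (maximum sensitivity subject to a 90% PPV) can be illustrated with a minimal sketch. It assumes each chart has a predicted score (for example, a random forest probability) and a chart-review label; the function names and the toy data are hypothetical and are not taken from the study.

```python
def ppv_and_sensitivity(scores, labels, cut):
    """Treat score >= cut as 'predicted PsA'; return (PPV, sensitivity)."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < cut and y == 1)
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    return ppv, sens


def choose_cut_point(scores, labels, target_ppv=0.90):
    """Return (cut, ppv, sensitivity) with the highest sensitivity among
    cut-points whose PPV meets the target, or None if none qualifies."""
    best = None
    for cut in sorted(set(scores)):  # every observed score is a candidate
        ppv, sens = ppv_and_sensitivity(scores, labels, cut)
        if ppv >= target_ppv and (best is None or sens > best[2]):
            best = (cut, ppv, sens)
    return best


# Toy data (hypothetical): 1 = PsA confirmed on chart review, 0 = not PsA.
scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 1, 0, 1, 0, 1, 0, 0]
result = choose_cut_point(scores, labels)
print(result)
```

On this toy data, lowering the cut-point below 0.80 admits a false positive and pushes the PPV under the target, so the rule trades some sensitivity for the required PPV.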
