Journal: JCO Clinical Cancer Informatics

Natural Language Processing to Identify Cancer Treatments With Electronic Medical Records



Abstract

PURPOSE Knowing the treatments administered to patients with cancer is important for treatment planning and for correlating treatment patterns with outcomes in personalized medicine studies. However, existing methods to identify treatments are often lacking. We develop a natural language processing approach that uses structured electronic medical records and unstructured clinical notes to identify the initial treatment administered to patients with cancer.

METHODS We used data from 4,412 patients with 483,782 clinical notes from the Stanford Cancer Institute Research Database, covering patients with nonmetastatic prostate, oropharynx, and esophagus cancer. We trained treatment identification models for each cancer type separately and compared the performance of using only structured data, only unstructured data (bag-of-words, doc2vec, fasttext), and combinations of both (structured + bow, structured + doc2vec, structured + fasttext). We optimized the identification model among five machine learning methods (logistic regression, multilayer perceptrons, random forest, support vector machines, and stochastic gradient boosting). We used the treatment information recorded in the cancer registry as the gold standard and compared our methods with an identification baseline based on billing codes.

RESULTS For prostate cancer, we achieved an F1-score of 0.99 (95% CI, 0.97 to 1.00) for radiation and 1.00 (95% CI, 0.99 to 1.00) for surgery using structured + doc2vec. For oropharynx cancer, we achieved an F1-score of 0.78 (95% CI, 0.58 to 0.93) for chemoradiation and 0.83 (95% CI, 0.69 to 0.95) for surgery using doc2vec. For esophagus cancer, we achieved an F1-score of 1.0 (95% CI, 1.0 to 1.0) for both chemoradiation and surgery using all combinations of structured and unstructured data. We found that using the free-text clinical notes outperformed using billing codes or only structured data for all three cancer types.

CONCLUSION Our results show that treatment identification using free-text clinical notes greatly improves on the performance obtained with billing codes and simple structured data. The approach can be used for treatment cohort identification and adapted for longitudinal cancer treatment identification.
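
The "structured + bow" feature combination and the F1-score named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the tokenizer (whitespace splitting), the function names, and the toy notes and structured indicators are all assumptions made for illustration, and no classifier is trained here.

```python
from collections import Counter
from typing import Dict, List, Sequence


def build_vocabulary(notes: Sequence[str]) -> Dict[str, int]:
    """Map each distinct lowercase token to a column index (sorted for determinism)."""
    tokens = sorted({tok for note in notes for tok in note.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(note: str, vocab: Dict[str, int]) -> List[float]:
    """Count how often each vocabulary token occurs in one note."""
    counts = Counter(note.lower().split())
    vec = [0.0] * len(vocab)
    for tok, n in counts.items():
        if tok in vocab:
            vec[vocab[tok]] = float(n)
    return vec


def combine_features(structured: Sequence[float], text_vec: Sequence[float]) -> List[float]:
    """Concatenate structured fields with the text vector ("structured + bow")."""
    return list(structured) + list(text_vec)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when that denominator is zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data, not taken from the study.
notes = ["patient underwent radical prostatectomy", "started external beam radiation"]
structured = [[1.0, 0.0], [0.0, 1.0]]  # made-up indicators, e.g. derived from billing codes
vocab = build_vocabulary(notes)
rows = [combine_features(s, bag_of_words(n, vocab)) for s, n in zip(structured, notes)]
```

Each entry of `rows` could then be passed to any of the five classifier families listed in the abstract, and predictions scored against registry labels with `f1_score`. Replacing `bag_of_words` with a learned document embedding would give the analogous "structured + doc2vec" or "structured + fasttext" inputs.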
