Journal: JCO Clinical Cancer Informatics

Automated NLP Extraction of Clinical Rationale for Treatment Discontinuation in Breast Cancer


Abstract

PURPOSE Key oncology end points are not routinely encoded into electronic medical records (EMRs). We assessed whether natural language processing (NLP) can abstract treatment discontinuation rationale from unstructured EMR notes to estimate toxicity incidence and progression-free survival (PFS).

METHODS We constructed a retrospective cohort of 6,115 patients with early-stage and 701 patients with metastatic breast cancer initiating care at Memorial Sloan Kettering Cancer Center from 2008 to 2019. Each cohort was divided into training (70%), validation (15%), and test (15%) subsets. Human abstractors identified the clinical rationale associated with treatment discontinuation events. Concatenated EMR notes were used to train high-dimensional logistic regression and convolutional neural network models. Kaplan-Meier analyses were used to compare toxicity incidence and PFS estimated by our NLP models to estimates generated by manual labeling and time-to-treatment discontinuation (TTD).

RESULTS Our best high-dimensional logistic regression models identified toxicity events in early-stage patients with an area under the curve of the receiver-operator characteristic of 0.857 ± 0.014 (standard deviation) and progression events in metastatic patients with an area under the curve of 0.752 ± 0.027 (standard deviation). NLP-extracted toxicity incidence and PFS curves were not significantly different from manually extracted curves (P = .95 and P = .67, respectively). By contrast, TTD overestimated toxicity in early-stage patients (P < .001) and underestimated PFS in metastatic patients (P < .001). Additionally, we tested an extrapolation approach in which 20% of the metastatic cohort were labeled manually, and NLP algorithms were used to abstract the remaining 80%. This extrapolated outcomes approach resolved PFS differences between receptor subtypes (P < .001 for hormone receptor+/human epidermal growth factor receptor 2- v human epidermal growth factor receptor 2+ v triple-negative) that could not be resolved with TTD.

CONCLUSION NLP models are capable of abstracting treatment discontinuation rationale with minimal manual labeling.
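The contrast between TTD and label-based PFS in the abstract can be illustrated with a small Kaplan-Meier sketch. This is not the authors' code: the function `kaplan_meier` and the five-patient toy data are hypothetical, made up only to show why counting every discontinuation as an event (the TTD proxy) yields a lower curve than counting only discontinuations labeled as progression.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    durations: time to treatment discontinuation for each patient
    events:    1 if the discontinuation counts as an event, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    exit_counts = Counter(durations)      # everyone leaving the risk set at t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk  # product-limit step
            curve.append((t, survival))
        at_risk -= exit_counts[t]
    return curve

# Toy data (assumed values, not from the study): months on treatment.
durations = [2, 3, 3, 5, 8]

# Label-based PFS: only discontinuations labeled "progression" are events;
# discontinuations for other reasons (e.g. toxicity) are censored.
progression_labels = [1, 1, 0, 1, 0]
pfs_curve = kaplan_meier(durations, progression_labels)

# TTD proxy: every discontinuation is treated as an event.
ttd_curve = kaplan_meier(durations, [1] * len(durations))
```

In this toy case the TTD curve falls to zero at the last discontinuation while the label-based curve does not, which mirrors the direction of the bias reported for metastatic patients. In practice a survival library would normally be used in place of a hand-written estimator.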
