Journal of the American Medical Informatics Association (JAMIA)

Efficient identification of nationally mandated reportable cancer cases using natural language processing and machine learning


Abstract

Objective: To help cancer registrars efficiently and accurately identify reportable cancer cases.

Material and Methods: The Cancer Registry Control Panel (CRCP) was developed to detect mentions of reportable cancer cases using a pipeline built on the Unstructured Information Management Architecture – Asynchronous Scaleout (UIMA-AS) architecture containing the National Library of Medicine’s UIMA MetaMap annotator as well as a variety of rule-based UIMA annotators that primarily act to filter out concepts referring to nonreportable cancers. CRCP inspects pathology reports nightly to identify pathology records containing relevant cancer concepts and combines this with diagnosis codes from the Clinical Electronic Data Warehouse to identify candidate cancer patients using supervised machine learning. Cancer mentions are highlighted in all candidate clinical notes and then sorted in CRCP’s web interface for faster validation by cancer registrars.

Results: CRCP achieved an accuracy of 0.872 and detected reportable cancer cases with a precision of 0.843 and a recall of 0.848. CRCP increases throughput by 22.6% over a baseline (manual review) pathology report inspection system while achieving a higher precision and recall. Depending on registrar time constraints, CRCP can increase recall to 0.939 at the expense of precision by incorporating a data source information feature.

Conclusion: CRCP demonstrates accurate results when applying natural language processing features to the problem of detecting patients with cases of reportable cancer from clinical notes. We show that implementing only a portion of cancer reporting rules in the form of regular expressions is sufficient to increase the precision, recall, and speed of the detection of reportable cancer cases when combined with off-the-shelf information extraction software and machine learning.
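
As a rough illustration of the idea in the abstract (regular-expression rules that filter out mentions of nonreportable cancers, evaluated by precision and recall), here is a minimal Python sketch. It is not CRCP's implementation: the patterns in EXCLUSION_PATTERNS, the function names, and the sample sentences are all hypothetical, and the UIMA-AS pipeline, the MetaMap annotator, and the supervised classifier described in the abstract are not modeled here.

```python
import re

# Hypothetical exclusion rules, for illustration only; these are NOT the
# rules used by CRCP. Each pattern marks a sentence as a non-candidate.
EXCLUSION_PATTERNS = [
    # Assumed example of a concept a registry might treat as nonreportable.
    re.compile(r"\bbasal cell carcinoma\b", re.IGNORECASE),
    # Assumed example of a negated mention ("negative for invasive carcinoma").
    re.compile(
        r"\b(?:no evidence of|negative for)\s+(?:\w+\s+){0,3}?"
        r"(?:carcinoma|malignancy)\b",
        re.IGNORECASE,
    ),
]


def is_candidate_mention(sentence: str) -> bool:
    """Return True when no exclusion rule matches the sentence."""
    return not any(p.search(sentence) for p in EXCLUSION_PATTERNS)


def precision_recall(predicted: set, actual: set) -> tuple:
    """Compute (precision, recall) for predicted vs. true positive case IDs."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    sentences = [
        "Invasive ductal carcinoma, grade 2.",
        "Skin, left cheek: basal cell carcinoma.",
        "Negative for invasive carcinoma.",
    ]
    for s in sentences:
        print(is_candidate_mention(s), "|", s)
```

In a real system the sentences kept by such a filter would become features for the classifier rather than final decisions. The trade-off noted in the Results (higher recall at the expense of precision) corresponds to admitting more candidates into the predicted set passed to precision_recall.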
