Automatic Population of Structured Reports from Narrative Pathology Reports

Abstract

Structured pathology reports have a number of advantages: they can ensure the accuracy and completeness of pathology reporting, and they make it easier for referring doctors to glean pertinent information.

The goal of this thesis is to extract pertinent information from free-text pathology reports, automatically populate structured reports for cancer diseases, and identify the commonalities and differences in processing principles to obtain maximum accuracy.

Three pathology corpora were annotated with entities and the relationships between them: the melanoma corpus, the colorectal cancer corpus and the lymphoma corpus.

A supervised machine-learning-based approach, utilising conditional random fields learners, was developed to recognise medical entities from the corpora. By feature engineering, the best feature configurations were attained, which boosted the F-scores significantly from 4.2% to 6.8% on the training sets.

Without proper negation and uncertainty detection, the quality of the structured reports will be diminished. Negation and uncertainty detection modules were built to handle this problem. The modules obtained overall F-scores ranging from 76.6% to 91.0% on the test sets.

A relation extraction system was presented to extract four relations from the lymphoma corpus. The system achieved very good performance on the training set, with a 100% F-score obtained by the rule-based module and a 97.2% F-score attained by the support vector machines classifier.

Rule-based approaches were used to generate the structured outputs and populate them into predefined templates. The rule-based system attained F-scores of over 97% on the training sets.

A pipeline system was implemented as an assembly of all the components described above. It achieved promising results in the end-to-end evaluations, with F-scores of 86.5%, 84.2% and 78.9% on the melanoma, colorectal cancer and lymphoma test sets respectively.
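
Every result in the abstract is reported as an F-score. The abstract does not say which variant was used (for example micro- or macro-averaged, exact or partial entity match), so the following is only a minimal sketch, assuming the balanced F1 measure computed from true-positive, false-positive and false-negative counts. The function name `f_score` and the counts in the example are hypothetical and are not taken from the thesis.

```python
def f_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Return the F-beta score from raw counts (0.0 when it is undefined).

    tp: items the system extracted correctly
    fp: items the system extracted that are not in the gold annotation
    fn: gold-annotated items the system missed
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    # Weighted harmonic mean of precision and recall; beta=1 gives F1.
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(f"{f_score(tp=80, fp=10, fn=20):.3f}")
```

With `beta=1.0` the score reduces to `2 * tp / (2 * tp + fp + fn)`, which penalises missed and spurious extractions equally.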

Bibliographic details

  • Author: Ou Ying
  • Year: 2015
  • Original format: PDF
