Source: Springer Open Choice (U.S. National Institutes of Health literature)

Annotating patient clinical records with syntactic chunks and named entities: the Harvey Corpus

Abstract

The free-text notes typed by physicians during patient consultations contain valuable information for the study of disease and treatment. These notes are difficult for existing natural language analysis tools to process because they are highly telegraphic (omitting many words) and contain many spelling mistakes, inconsistencies in punctuation, and non-standard word order. To support information extraction and classification tasks over such text, we describe a de-identified corpus of free-text notes, a shallow syntactic and named entity annotation scheme for this kind of text, and an approach to training domain specialists with no linguistic background to annotate the text. Finally, we present a statistical chunking system for such clinical text with a stable learning rate and good accuracy, indicating that the manual annotation is consistent and that the annotation scheme is tractable for machine learning.
