HIGH-PRECISION BIOLOGICAL EVENT EXTRACTION: EFFECTS OF SYSTEM AND OF DATA

Abstract

We approached the problems of event detection, argument identification, and negation and speculation detection in the BioNLP'09 information extraction challenge through concept recognition and analysis. Our methodology involved using the OpenDMAP semantic parser with manually written rules. The original OpenDMAP system was updated for this challenge with a broad ontology defined for the events of interest, new linguistic patterns for those events, and specialized coordination handling. We achieved state-of-the-art precision for two of the three tasks, scoring the highest of 24 teams at precision of 71.81 on Task 1 and the highest of 6 teams at precision of 70.97 on Task 2. We provide a detailed analysis of the training data and show that a number of trigger words were ambiguous as to event type, even when their arguments are constrained by semantic class. The data is also shown to have a number of missing annotations. Analysis of a sampling of the comparatively small number of false positives returned by our system shows that major causes of this type of error were failing to recognize second themes in two-theme events, failing to recognize events when they were the arguments to other events, failure to recognize nontheme arguments, and sentence segmentation errors. We show that specifically handling coordination had a small but important impact on the overall performance of the system. The OpenDMAP system and the rule set are available at http://bionlp.sourceforge.net.

Bibliographic details

  • Source
    Computational Intelligence, 2011, Issue 4, pp. 681-701 (21 pages)
  • Author affiliations

    Center for Computational Pharmacology, University of Colorado Denver School of Medicine, Aurora, CO, USA;

  • Original format: PDF
  • Language of text: English
  • Keywords

    event recognition; conceptual analysis; natural language processing; text mining; BioNLP
