Conference: International Conference on Data Mining

A Cascaded Approach to Biomedical Named Entity Recognition Using a Unified Model

Abstract

We propose a cascaded approach for extracting biomedical named entities from text documents using a unified model. Previous works often ignore the high computational cost incurred by a single-phase approach. We alleviate this problem by dividing the named entity extraction task into a segmentation task and a classification task, reducing the computational cost by an order of magnitude. A unified model, which we term "maximum-entropy margin-based" (MEMB), is used in both tasks. The MEMB model considers the error between a correct and an incorrect output during training and helps improve the performance of extracting sparse entity types that occur in biomedical literature. We report experimental evaluations on the GENIA corpus available from the BioNLP/NLPBA (2004) shared task, which demonstrate the state-of-the-art performance achieved by the proposed approach.
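
The two-phase split described in the abstract can be illustrated with a minimal sketch. This is not the authors' MEMB model: `toy_segmenter` and `toy_classifier` are hypothetical rule-based stand-ins, and the type-free B/I/O tag scheme is an assumption, since the abstract does not say how segments are encoded. The sketch only shows the control flow: first find entity boundaries without types, then assign a type to each segment.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # [start, end) token offsets


def spans_from_bio(tags: List[str]) -> List[Span]:
    """Collect entity spans from type-free B/I/O segmentation tags (assumed scheme)."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            # A new segment begins; close any segment that is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def cascade(
    tokens: List[str],
    segmenter: Callable[[List[str]], List[str]],
    classifier: Callable[[List[str]], str],
) -> List[Tuple[int, int, str]]:
    """Phase 1: segment the sentence. Phase 2: classify each segment."""
    tags = segmenter(tokens)
    return [(s, e, classifier(tokens[s:e])) for s, e in spans_from_bio(tags)]


def toy_segmenter(tokens: List[str]) -> List[str]:
    # Hypothetical rule: a token with an uppercase letter or digit is part of an entity.
    tags, inside = [], False
    for tok in tokens:
        is_ent = any(c.isupper() or c.isdigit() for c in tok)
        tags.append(("I" if inside else "B") if is_ent else "O")
        inside = is_ent
    return tags


def toy_classifier(span_tokens: List[str]) -> str:
    # Hypothetical rule: segments containing a digit are labelled "protein".
    has_digit = any(c.isdigit() for tok in span_tokens for c in tok)
    return "protein" if has_digit else "cell_type"


def label_counts(num_types: int) -> Tuple[int, int, int]:
    """Label-set sizes under the assumed B/I/O encoding:
    (single-phase B-type/I-type/O, segmentation only, classification only)."""
    return 2 * num_types + 1, 3, num_types


if __name__ == "__main__":
    sentence = ["the", "IL-2", "gene", "activates", "T", "cells"]
    print(cascade(sentence, toy_segmenter, toy_classifier))
    print(label_counts(5))
```

With the five entity types of the shared task (protein, DNA, RNA, cell line, cell type), `label_counts(5)` gives 11 labels for a single-phase tagger versus 3 for segmentation and 5 for classification. Smaller label sets per phase are one plausible source of the cost reduction the abstract reports, though the abstract itself does not give the breakdown.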