16th Workshop on Biomedical Natural Language Processing

Proactive Learning for Named Entity Recognition

Abstract

The goal of active learning is to minimise the cost of producing an annotated dataset. Standard active learning assumes that annotators are perfect, i.e., that they always choose the correct labels. In practice, however, annotators are not infallible and are likely to assign incorrect labels to some instances. Proactive learning is a generalisation of active learning that can model different kinds of annotators. Although proactive learning has been applied to certain labelling tasks, such as text classification, there is little work on its application to named entity (NE) tagging. In this paper, we propose a proactive learning method for producing NE-annotated corpora, using two annotators who have different levels of expertise and who charge different amounts based on their levels of experience. To optimise both cost and annotation quality, we also propose a mechanism to present multiple sentences to annotators at each iteration. Experimental results for several corpora show that our method facilitates the construction of high-quality NE-labelled datasets at minimal cost.
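The abstract does not give the selection rule, so the following is only a minimal sketch of how one proactive-learning iteration with two annotators might look. Everything in it is an assumption made for illustration: the `Annotator` fields, the linear reliability model, the utility `uncertainty * p_correct / cost`, and the example numbers. It is not the method proposed in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotator:
    name: str
    cost: float                # hypothetical price charged per sentence
    base_reliability: float    # assumed accuracy on easy sentences
    difficulty_penalty: float  # assumed accuracy loss per unit of uncertainty

    def p_correct(self, uncertainty: float) -> float:
        """Assumed probability that this annotator labels the sentence correctly."""
        p = self.base_reliability - self.difficulty_penalty * uncertainty
        return min(1.0, max(0.0, p))


def utility(annotator: Annotator, uncertainty: float) -> float:
    """Illustrative utility: expected useful information per unit of cost."""
    return uncertainty * annotator.p_correct(uncertainty) / annotator.cost


def select_batch(uncertainties, annotators, batch_size):
    """Return up to batch_size (sentence_id, annotator) pairs for one iteration.

    uncertainties maps sentence ids to the current model's uncertainty in [0, 1].
    Each sentence is paired with its highest-utility annotator, then the
    sentences with the highest utilities form the batch.
    """
    scored = []
    for sent_id, u in uncertainties.items():
        best = max(annotators, key=lambda a: utility(a, u))
        scored.append((utility(best, u), sent_id, best))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(sent_id, ann) for _, sent_id, ann in scored[:batch_size]]


# Hypothetical annotators: a reliable, expensive expert and a cheaper novice
# whose accuracy is assumed to drop on sentences the model finds hard.
EXPERT = Annotator("expert", cost=3.0, base_reliability=0.95, difficulty_penalty=0.0)
NOVICE = Annotator("novice", cost=1.0, base_reliability=0.90, difficulty_penalty=0.8)

if __name__ == "__main__":
    pool = {"s1": 0.9, "s2": 0.2, "s3": 0.5, "s4": 0.05}
    for sent_id, ann in select_batch(pool, [EXPERT, NOVICE], batch_size=2):
        print(sent_id, "->", ann.name)
```

Under these assumed numbers, the hardest sentence is routed to the expert and an easier one to the novice, which is the kind of cost and quality trade-off the abstract describes. Selecting several sentences per call mirrors the idea of presenting multiple sentences to annotators at each iteration.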

Bibliographic details

  • Conference location: Vancouver (CA)
  • Author affiliation

    National Centre for Text Mining, School of Computer Science, The University of Manchester, United Kingdom

  • Original format: PDF
  • Language of full text: English
