International Workshop on Computational Terminology

TermEval 2020: Shared Task on Automatic Term Extraction Using the Annotated Corpora for Term Extraction Research (ACTER) Dataset

Abstract

The TermEval 2020 shared task provided a platform for researchers to work on automatic term extraction (ATE) with the same dataset: the Annotated Corpora for Term Extraction Research (ACTER). The dataset covers three languages (English, French, and Dutch) and four domains; the heart failure domain was held out as a test set on which the final F1-scores were calculated. The aim was to provide a large, transparent, qualitatively annotated, and diverse dataset to the ATE research community, with the goal of promoting comparative research and thus identifying the strengths and weaknesses of various state-of-the-art methodologies. The results show considerable variation between systems and illustrate how some methodologies reach higher precision or recall, how different systems extract different types of terms, and how some are exceptionally good at finding rare terms or are less affected by term length. The current contribution offers an overview of the shared task with a comparative evaluation, which complements the individual papers by all participants.
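
The abstract reports precision, recall, and F1-scores on a held-out test set. As a minimal sketch of how such scores can be computed for a list of extracted terms against a gold-standard list, consider the following. The function name `term_extraction_scores`, the example terms, and the choice to compare lowercased unique terms are illustrative assumptions, not the shared task's official evaluation script.

```python
def term_extraction_scores(extracted, gold):
    """Return (precision, recall, f1) over sets of unique terms.

    Hypothetical helper: terms are stripped and lowercased before
    comparison, which is an assumption made for this sketch only.
    """
    extracted_set = {term.strip().lower() for term in extracted}
    gold_set = {term.strip().lower() for term in gold}

    # A true positive is an extracted term that also appears in the gold list.
    true_positives = len(extracted_set & gold_set)

    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0

    # F1 is the harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example data, not taken from ACTER.
system_terms = ["heart failure", "patient", "ejection fraction", "study"]
gold_terms = ["Heart failure", "ejection fraction", "diuretic"]

precision, recall, f1 = term_extraction_scores(system_terms, gold_terms)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Scoring over unique terms rather than over every occurrence means that a frequent term and a rare term count equally, which is one reason systems that find rare terms can score well under this kind of measure.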