Published in: Machine Translation Summit

MAGMATic: A Multi-domain Academic Gold Standard with Manual Annotation of Terminology for Machine Translation Evaluation

Abstract

This paper presents MAGMATic (Multi-domain Academic Gold Standard with Manual Annotation of Terminology), a novel Italian-English benchmark that enables MT evaluation focused on terminology translation. The data set comprises 2,056 parallel sentences extracted from institutional academic texts, namely course unit and degree program descriptions. This text type is particularly interesting because it contains terminology from multiple domains, e.g. education and the different academic disciplines described in the texts. All terms on the English target side of the data set were manually identified and annotated with a domain label, for a total of 7,517 annotated terms. Owing to their distinctive features, institutional academic texts represent an interesting test bed for MT. As a further contribution, we investigate the feasibility of using MT to translate this type of document. To this end, we evaluate two state-of-the-art neural MT systems on MAGMATic, focusing on their ability to translate domain-specific terminology.