THU_NGN at SemEval-2019 Task 12: Toponym Detection and Disambiguation on Scientific Papers

Abstract

Toponym resolution is an important and challenging task in the natural language processing field, with wide applications such as emergency response and geographical event analysis on social media. It can be roughly divided into two independent steps: toponym detection and toponym disambiguation. To facilitate the study of toponym resolution, SemEval-2019 Task 12 was proposed, which contains three subtasks: toponym detection, toponym disambiguation and toponym resolution. In this paper, we introduce the system with which we participated in SemEval-2019 Task 12. For toponym detection, we use TagLM as the basic model and explore the use of various features, such as word embeddings extracted from pre-trained language models, POS tags and lexical features extracted from dictionaries. For toponym disambiguation, we propose a heuristic rule-based method using toponym frequency and population. Our system achieved 83.03% strict macro F1, 74.50% strict micro F1, 85.92% overlap macro F1 and 78.47% overlap micro F1 in the toponym detection subtask.
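The abstract names toponym frequency and population as the signals behind the rule-based disambiguation step but does not spell out the rules. The sketch below is one plausible reading, not the authors' implementation: it assumes a gazetteer mapping a lowercased surface form to candidate entries, and it assumes that frequency is compared first with population as the tie-breaker. The `Candidate` class, the `disambiguate` function, the identifiers and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Candidate:
    """One gazetteer entry for a place name (all fields are illustrative)."""
    entry_id: int      # hypothetical gazetteer identifier
    name: str
    population: int    # toy value, not real census data
    frequency: int     # assumed: how often this entry was the gold label in training data


def disambiguate(mention: str,
                 gazetteer: Dict[str, List[Candidate]]) -> Optional[Candidate]:
    """Pick one candidate for a detected toponym.

    Assumed rule: prefer the candidate seen most often in training data,
    and break ties by the larger population. Returns None when the
    gazetteer has no entry for the mention.
    """
    candidates = gazetteer.get(mention.lower(), [])
    if not candidates:
        return None
    return max(candidates, key=lambda c: (c.frequency, c.population))


# Toy gazetteer with invented numbers, keyed by lowercased surface form.
TOY_GAZETTEER = {
    "paris": [
        Candidate(entry_id=1, name="Paris, France", population=2_000_000, frequency=5),
        Candidate(entry_id=2, name="Paris, Texas", population=25_000, frequency=0),
    ],
    "springfield": [
        Candidate(entry_id=3, name="Springfield, Illinois", population=110_000, frequency=0),
        Candidate(entry_id=4, name="Springfield, Missouri", population=170_000, frequency=0),
    ],
}

best = disambiguate("Paris", TOY_GAZETTEER)
tie_break = disambiguate("Springfield", TOY_GAZETTEER)
missing = disambiguate("Atlantis", TOY_GAZETTEER)
```

Sorting on the tuple `(frequency, population)` keeps the two signals separate, so a candidate never observed in training data can still win on population alone, as in the `"Springfield"` case.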