Conference: International Conference on Natural Language Processing and Chinese Computing

Cross-Lingual Entity Matching for Heterogeneous Online Wikis

Abstract

Knowledge bases play an increasingly important role in many applications. However, many knowledge bases focus mainly on English knowledge and contain little knowledge for low-resource languages (LLs). If we can map the entities in LLs to those in high-resource languages (HLs), much knowledge, such as the relations between entities, can be transferred from HLs to LLs. In this paper, we propose an efficient and effective Cross-Lingual Entity Matching approach (CL-EM) that enriches the existing cross-lingual links using a learning-to-rank framework with learned language-independent features, including cross-lingual topic features and document embedding features. In the experiments, we verified our approach on the existing cross-lingual links between Chinese Wikipedia and English Wikipedia by comparing it with other state-of-the-art approaches. In addition, we discovered 141,754 new cross-lingual links between Baidu Baike and English Wikipedia, which almost doubles the number of existing cross-lingual links.
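
The abstract does not give the details of CL-EM, so the following is only a minimal illustrative sketch of the general idea of ranking candidate entities by language-independent features. It assumes that each entity already has a topic distribution and a document embedding in a shared cross-lingual space. The function names, the toy vectors, and the fixed weights are all hypothetical; in a real learning-to-rank setup the weights would be learned from existing cross-lingual links rather than set by hand.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def score(source, candidate, weights):
    """Weighted sum of two similarity features (topic and embedding)."""
    features = (
        cosine(source["topics"], candidate["topics"]),
        cosine(source["embedding"], candidate["embedding"]),
    )
    return sum(w * f for w, f in zip(weights, features))


def rank_candidates(source, candidates, weights=(0.5, 0.5)):
    """Return candidate names ordered from best to worst match.

    The default weights are an assumption for illustration, not learned values.
    """
    ranked = sorted(candidates, key=lambda c: score(source, c, weights), reverse=True)
    return [c["name"] for c in ranked]


# Toy, made-up vectors: one source entity and two hypothetical candidates.
source = {"name": "zh:source_entity", "topics": [0.7, 0.2, 0.1], "embedding": [0.9, 0.1]}
candidates = [
    {"name": "en:Candidate_A", "topics": [0.1, 0.1, 0.8], "embedding": [0.1, 0.9]},
    {"name": "en:Candidate_B", "topics": [0.6, 0.3, 0.1], "embedding": [0.8, 0.2]},
]

print(rank_candidates(source, candidates))
```

Here the top-ranked candidate would be proposed as the cross-lingual link for the source entity. A learned ranker would replace the hand-set weighted sum with a model trained on known links.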
