Venue: IEEE International Conference on Research, Innovation and Vision for the Future

Named Entity Disambiguation on an Ontology Enriched by Wikipedia

Abstract

A shortage of training data is a current problem for named entity disambiguation. This paper presents a novel method that overcomes this problem by automatically generating an annotated corpus based on a specific ontology. The corpus is then enriched with new and informative features extracted from Wikipedia data. Moreover, rather than pursuing rule-based methods as in the literature, we employ a machine learning model not only to disambiguate but also to identify named entities. In addition, our method explores in detail the use of a range of features extracted from texts, a given ontology, and Wikipedia data for disambiguation. This paper also systematically analyzes the impact of these features on disambiguation accuracy by varying the combinations used to represent named entities. Empirical evaluation shows that, while the ontology provides basic features of named entities, Wikipedia is a fertile source of additional features for constructing accurate and robust named entity disambiguation systems.
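
The feature-combination analysis described in the abstract can be illustrated with a small sketch. This is not the authors' system: the candidate entities, feature values, and mentions below are invented toy data, the helper names (`disambiguate`, `accuracy`, `ablation`) are hypothetical, and a simple word-overlap score stands in for the machine learning model the paper uses. It only shows the general idea of representing entities with different subsets of feature sources and comparing the resulting accuracy.

```python
from itertools import combinations

# Hypothetical toy knowledge base: each candidate entity carries features
# from two sources (all values are invented for illustration).
CANDIDATES = {
    "Paris_(city)": {
        "ontology": {"city", "france"},
        "wikipedia": {"capital", "seine", "eiffel"},
    },
    "Paris_(mythology)": {
        "ontology": {"person", "troy"},
        "wikipedia": {"helen", "trojan", "prince"},
    },
}

# Toy labelled mentions of "Paris": (context words from the text, gold entity).
MENTIONS = [
    ({"eiffel", "tower", "visit"}, "Paris_(city)"),
    ({"helen", "loved", "prince"}, "Paris_(mythology)"),
    ({"capital", "of", "france"}, "Paris_(city)"),
    ({"war", "at", "troy"}, "Paris_(mythology)"),
]


def disambiguate(context, groups):
    """Pick the candidate whose selected features overlap the context most.

    Ties are broken by alphabetical order of the entity name.
    """
    def score(entity):
        feats = set().union(*(CANDIDATES[entity][g] for g in groups))
        return len(context & feats)

    return max(sorted(CANDIDATES), key=score)


def accuracy(groups):
    """Fraction of toy mentions resolved correctly with the given groups."""
    hits = sum(disambiguate(ctx, groups) == gold for ctx, gold in MENTIONS)
    return hits / len(MENTIONS)


def ablation():
    """Accuracy for every non-empty combination of feature sources."""
    names = ["ontology", "wikipedia"]
    return {
        combo: accuracy(combo)
        for r in range(1, len(names) + 1)
        for combo in combinations(names, r)
    }


if __name__ == "__main__":
    for combo, acc in ablation().items():
        print("+".join(combo), acc)
```

On this toy data each single source misresolves one of the four mentions, while the combined feature set resolves all four. That outcome is a property of the invented examples, not a result from the paper.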
