Conference: Asian Semantic Web Conference

Named Entity Disambiguation: A Hybrid Statistical and Rule-Based Incremental Approach

Abstract

The rapidly increasing use of large-scale data on the Web has made named entity disambiguation one of the main challenges for research in Information Extraction and for the development of the Semantic Web. This paper presents a novel method for detecting proper names in a text and linking them to the right entities in Wikipedia. The method is hybrid and consists of two phases: the first uses heuristics and patterns to narrow down the candidates, and the second employs the vector space model to rank the candidates in the remaining ambiguous cases and choose the right one. The novelty is that the disambiguation process is incremental: it runs in several rounds that filter the candidates by exploiting previously identified entities, extending the text with the attributes of those entities each time they are successfully resolved in a round. We test the performance of the proposed method on the disambiguation of names of people, locations and organizations in news texts. The experimental results show that our approach achieves high accuracy and can be used to construct a robust named entity disambiguation system.
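
The incremental idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy knowledge base `KB`, the exact-name candidate filter (a stand-in for the paper's heuristics and patterns), the bag-of-words cosine ranking, and the names `bag`, `cosine` and `disambiguate` are all assumptions made for illustration. Each round either resolves mentions that have a single candidate or commits the highest-scoring ambiguous one, and then extends the context with the attributes of whatever was resolved.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge base (not from the paper): id -> name and attribute text.
KB = {
    "Paris_(France)": {"name": "Paris", "text": "capital city Seine Europe"},
    "Paris_(Texas)": {"name": "Paris", "text": "city Texas United States Lamar County"},
    "France": {"name": "France", "text": "country Europe capital Seine"},
}


def bag(text):
    """Return lower-cased bag-of-words term counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def disambiguate(text, mentions, kb):
    """Resolve unique mention strings to KB ids over several rounds."""
    context = bag(text)
    # Phase 1 stand-in: keep only entities whose name matches the mention exactly.
    candidates = {
        m: [e for e, rec in kb.items() if rec["name"] == m] for m in mentions
    }
    pending = [m for m in mentions if candidates[m]]
    resolved = {}
    while pending:
        easy = [m for m in pending if len(candidates[m]) == 1]
        if easy:
            # Unambiguous mentions are resolved first.
            chosen = {m: candidates[m][0] for m in easy}
        else:
            # Phase 2: rank all remaining candidates against the current
            # context and commit only the single best-scoring pair.
            scored = [
                (cosine(context, bag(kb[e]["text"])), m, e)
                for m in pending
                for e in candidates[m]
            ]
            _, m, e = max(scored)
            chosen = {m: e}
        for m, e in chosen.items():
            resolved[m] = e
            # Incremental step: extend the context with the entity's attributes.
            context.update(bag(kb[e]["text"]))
            pending.remove(m)
    return resolved


result = disambiguate(
    "Paris is lovely in spring, said a visitor to France.",
    ["Paris", "France"],
    KB,
)
print(result)
```

In this toy input the raw sentence shares no terms with either "Paris" candidate, so the ranking only becomes informative after "France" is resolved and its attributes are added to the context.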